LINKED OPEN DATA FOR CULTURAL HERITAGE
VLADIMIR ALEXIEV
vladimir.alexiev@ontotext.com
2016-09-29
TABLE OF CONTENTS
1 Intro
  1.1 GLAM vs Internet
  1.2 Google NGrams: Phrases in Books
  1.3 Google NGrams: Two Specific Orgs
  1.4 Google Trends: Search Popularity
  1.5 How To Survive in the Internet Age
  1.6 Why Linked Open Data (LOD) is Important
2 GLAM Content Standards
  2.1 Museum Content Standards
  2.2 Archival Content Standards
  2.3 Library Content Standards
3 GLAM Metadata Schemas
  3.1 Seeing Standards (2)
  3.2 XML Schemas
  3.3 Museum Metadata: CDWA
  3.4 Archive Metadata
  3.5 Library Metadata: MARC
4 GLAM Ontologies
  4.1 Europeana Data Model
  4.2 CIDOC CRM
  4.3 Web Annotation (Open Annotation, OA)
  4.4 International Image Interop Framework (IIIF)
  4.5 Library Ontologies
  4.6 Archival Ontologies
5 GLAM LOD Datasets (LODLAM)
  5.1 Wikidata
  5.2 VIAF
  5.3 Global Authority Control
6 LODLAM Projects
1 INTRO
A bit about me: co-founder of Sirma Group Holding, Bulgaria's largest software group and parent company of Ontotext.
30 years in IT: 8 at university, 22 in industry
Did plenty of project management, business analysis and data modeling, some big projects too
Last 8 years: focused on data modeling and integration
Last 6 years: focused in particular on semantic data and semantic integration
I love to poke in other people's data and get in-depth, so there's a lot about data in these slides.
See My publications: you can sort by type and keyword, and full abstracts are available. I've provided a few references below, but if a topic interests you, please search in the publications.
The shorter version has about 110 slides, so sit back, relax and enjoy the ride. It should take us 1:20h. Ask questions at any time in the chat; I'll answer them all at the end.
This longer version has 130 slides, including info about Library metadata and ontologies.
1.1 GLAM VS INTERNET
GLAM, CH, DH
Cultural Heritage (CH): the sum of our non-economic heritage
  Obvious implications for economically significant sectors, e.g. tourism
  Some say it's the source of all creativity: would you agree?
  Includes old and new (e.g. digitally-born), material and immaterial, tangible and intangible, permanent and temporal (e.g. interactive installations)
Galleries, Libraries, Archives, Museums (GLAM): a sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities
  E.g. some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum
1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible.
1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs over time: in recent books, Facebook is more popular than the British Museum.
1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook and Google are much more popular than library and museum.
1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom.
Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victim to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes.
How to preserve the role of GLAMs into the new millennium?
To survive, GLAMs must adopt the internet as their default modus operandi:
  Web 1.0: presentation
  Web 2.0: interaction
  Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches
1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked.
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net.
LOD enables large-scale Digital Humanities research, collaboration and aggregation, and technological renewal of CH institutions.
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:
  Exception is the rule
  Many metadata format variations
  Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards:
  Describe what data to capture (and sometimes how to go about it)
  Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data.
2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums
2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD
2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS
2.1.3 CCO EXAMPLE: CREATOR EXTENT
How to describe one aspect of the data
2.1.4 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D.Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:
  Data sharing
  Global availability of resources
  Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
  Work: original or derived intellectual work (e.g. Don Quixote)
  Expression: translation or edition (e.g. Don Quixote translation to English)
  Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
  Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
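The four WEMI levels can be sketched as nested data classes. This is only an illustration of the containment hierarchy, not a FRBR implementation: the class fields, the placeholder ISBNs and the copy identifiers are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A physical copy that a library tracks for loans."""
    barcode: str  # hypothetical local identifier


@dataclass
class Manifestation:
    """A publisher's product; ISBNs live at this level."""
    isbn: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def all_items(work: Work) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item."""
    return [item
            for expression in work.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items]


# Placeholder data: the ISBN strings are not real ISBNs.
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("isbn-placeholder-1", [Item("copy-1")])]),
    Expression("en", [Manifestation("isbn-placeholder-2",
                                    [Item("copy-2"), Item("copy-3")])]),
])
print([item.barcode for item in all_items(quixote)])
```

A user asking for "Don Quixote" starts at the Work and only reaches loanable copies at the Item level, which is why the levels are kept separate.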
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in "Seeing Standards: A Visualization of the Metadata Universe" apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
  XML Schema (XSD): most widely used but most unwieldy
  RelaxNG (RNG): new generation schema language
  RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
  Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
  https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
  https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context, Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
  EAG: So-called "controlled access points" are text and typically not controlled at all
  EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
  EAC: Related persons are names (strings), not links (things)
  EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
  EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

  <bioghist>
    <head>Chronological Events</head>
    <chronlist>
      <chronitem>
        <date normal="19781028">October 28, 1978</date>
        <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
      </chronitem>
      <chronitem>
        <date normal="19790315">March 15, 1979</date>
        <event>Departmental reorganization</event>
      </chronitem>
    </chronlist>
  </bioghist>
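To make the "strings, not things" problem concrete, here is a minimal sketch that parses a bioghist chronology with Python's standard xml.etree.ElementTree. The sample XML restates the EADiva-style example in well-formed form; the function name and the output layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Collapse whitespace in the mixed-content event text.
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that were explicitly tagged can be recovered;
            # even then they are normalized strings, not links to an authority.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = extract_events(SAMPLE)
for e in events:
    print(e["date"], "|", e["text"], "|", e["persons"])
```

The date is the only reliably structured field. The person names come out as plain strings, so connecting them to authority records still requires a separate coreferencing step.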
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
  marc-must-die.info wiki: in-depth discussion
  FutureLib Facebook group
  Presentation "MARC is dead (is it really?)" by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?
  RDF is a simple abstracted data model
  Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
  Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
  The model is self-describing in a distributed way: if a class/property URL is looked up, it should return a description and info
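The point about nesting biases can be sketched without any RDF library: whether a creator is nested inside an artwork record or referenced by ID, "lifting" both forms yields the same flat set of triples. The record layout (dicts with an "id" key) and the ex: identifiers are assumptions invented for this example.

```python
def flatten(node, triples=None):
    """Lift a nested dict (with an 'id' key) into a flat set of (s, p, o) triples."""
    if triples is None:
        triples = set()
    subject = node["id"]
    for prop, value in node.items():
        if prop == "id":
            continue
        if isinstance(value, dict):
            # A nested or referenced resource: link to it, then descend.
            triples.add((subject, prop, value["id"]))
            flatten(value, triples)
        else:
            triples.add((subject, prop, value))
    return triples


# Form 1: the creator is nested inside the artwork record.
nested = {"id": "ex:painting1", "dc:title": "Portrait",
          "dc:creator": {"id": "ex:artist1", "foaf:name": "Frans Hals"}}

# Form 2: the creator is a separate record, referenced by ID.
referenced = [
    {"id": "ex:painting1", "dc:title": "Portrait",
     "dc:creator": {"id": "ex:artist1"}},
    {"id": "ex:artist1", "foaf:name": "Frans Hals"},
]

triples_nested = flatten(nested)
triples_referenced = set()
for record in referenced:
    flatten(record, triples_referenced)

print(sorted(triples_nested))
print(triples_nested == triples_referenced)
```

Both source layouts collapse to the same three statements, which is the sense in which the graph model is more abstract than either serialization.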
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
  OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
  Dublin Core: descriptive metadata
  SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
  CIDOC-CRM: inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized as not expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
  Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
  Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Graphical Representation (or continuous HTML version, including Kindle) and Video Tutorial (or HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
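The short-cut versus long-path construct can be sketched as a simple inference over triples: whenever a thing was measured by a measurement that observed a dimension, the short-cut crm:P43_has_dimension can be derived. This is a toy sketch over Python tuples with made-up ex: identifiers, not how any particular CRM system implements it.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"


def infer_shortcuts(triples):
    """Derive P43 short-cut triples from P39i / P40 long paths.

    Returns only triples that are not already present in the input.
    """
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    inferred = {
        (thing, P43, dimension)
        for thing, p, measurement in triples if p == P39I
        for dimension in observed.get(measurement, [])
    }
    return inferred - set(triples)


# The long path keeps the measurement event, so extra detail
# (here a time-span) has somewhere to attach.
long_path = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan1"),
]

shortcuts = infer_shortcuts(long_path)
print(shortcuts)
```

Going the other way (from short-cut to long path) is not possible without inventing the intermediate measurement, which is why the long path "allows more details".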
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.
Diagram of the Complete Example from the spec (using my rdfpuml)
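As a rough illustration of the "image and region over it" case, the sketch below builds an annotation as a JSON-LD dictionary following the W3C Web Annotation Data Model (context URL, TextualBody, FragmentSelector with a media-fragment xywh value). The helper name, the example.org URLs and the comment text are assumptions; this is not the Complete Example from the spec.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to an image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers, for illustration only.
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting1.jpg",
    (10, 20, 300, 200),
    "Signature in the lower left corner",
)
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other pairs on the slide (webpage and bookmark, paragraph and commentary); only the selector type changes.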
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
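The IIIF Image API addresses every image request with a fixed URL pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below is a small sketch of that pattern; the server address and identifier are made up, and the default size keyword "full" assumes Image API 2.x (version 3 uses "max" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL.

    The identifier is percent-encoded because it may contain slashes or spaces.
    """
    return (f"{base.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical server and identifier.
whole = iiif_image_url("https://example.org/iiif", "page 1")
detail = iiif_image_url("https://example.org/iiif", "page 1",
                        region="0,0,512,512", size="256,")
print(whole)
print(detail)
```

Because region and size are part of the URL, a deep-zoom viewer can fetch only the tiles it needs at the current zoom level.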
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
  BIBO: used for a long time, pragmatic
  FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
  FRBRoo: based on CIDOC CRM, perhaps too complex
  Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
  BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
  RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
  SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA): the Registry info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
"FRBR, Before and After" by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
"Mistakes have been made", K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA

  d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation;
    dc:title "About time";
    d:titleURLized "about_time";
    fabio:hasSubtitle "Einstein's unfinished revolution";
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
      <http://covers.openlibrary.org/b/id/96715-M.jpg>,
      <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
      <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
    dc:language lexvo:eng;
    d:bibliofilID "931138";
    dc:format <http://data.deichman.no/format/Book>;
    d:location_signature "Dav";
    dc:publisher d_org:penguin;
    bibo:numPages "316";
    d:physicalDescription "fig";
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
    fabio:isManifestationOf d_work:x24918900_about_time;
    d:signatureNote "07x0619gq";
    d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
    d:bsID "0181541";
    dc:description "Bibliografi s 293-294"@no;
    d:priceInfo "Nkr 170.00";
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
      <http://www.librarything.com/work/23493>;
    dc:identifier "749919";
    d:dewey "115", "530.11";
    d:location_dewey "530.11";
    bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
There are 3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA
  Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
  Conceptual Model: Progress report (2015), 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them
  Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
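A hunt for missing inventory numbers can be phrased as a SPARQL query. The sketch below only builds the query text; it uses the Wikidata properties P31 (instance of), P195 (collection), P217 (inventory number) and the class Q3305213 (painting). The function name is an assumption, and the collection QID passed in the example is a placeholder to be replaced with the item of a real museum. The wd: and wdt: prefixes are predefined on the Wikidata Query Service.

```python
def missing_inventory_query(collection_qid, limit=100):
    """Return SPARQL for paintings in a collection that lack an inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} .
  FILTER NOT EXISTS {{ ?painting wdt:P217 [] }}
}} LIMIT {int(limit)}"""


# "Q123" is a placeholder, not the QID of an actual museum.
query = missing_inventory_query("Q123", limit=10)
print(query)
```

Each result is a candidate for a manual fix: look the painting up on the museum's site and add the inventory number to the Wikidata item.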
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
  The best datasets to use for name enrichment are VIAF and Wikidata
  There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
  VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
  VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
  Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
  A lot can be gained by leveraging coreferencing across VIAF and Wikidata
  Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.
  Publishing Vocabularies/Thesauri as LOD
  Publishing Museum collections and National Bibliographies as LOD
  Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
  Study of artistic influence over time and space
  Literary traditions, parallel editions
  Poetic repertories
  Studying manuscripts, stemmatology (manuscript derivation)
  Historiography
  Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
  CollectionSpace: museum collection management
  ArchivesSpace: archive management
  ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
  ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed with BM a mapping of BM data (2M objects) using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V, DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
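Jittering can be sketched as adding a small random offset to each coordinate, so that objects sharing one place no longer stack on a single point. The offset size, the approximate centre coordinates and the function name below are assumptions for illustration; the source does not say how the jitter was actually computed.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset in each direction."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# A seeded generator keeps the map stable between page loads.
rng = random.Random(42)
center = (42.7, 25.5)  # rough centre of Bulgaria, for illustration only
points = [jitter(*center, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

The maximum offset should stay well inside the area the place covers, otherwise flags for a country would drift across its borders.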
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
  Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
  Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
  Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
  Semantic/ontology development: http://vocab.getty.edu/ontology
  Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
  Complete mapping specification
  Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
  Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
  GraphDB semantic repo, clustered for high-availability
  Semantic application development (customized Forest user interface) and tech consulting
  SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
  Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
  Sample queries (100), including charts, geographic queries, etc
  Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
  Help desk / support on twitter and google group (see home page)
  Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V, Lindenthal J and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies
  Prefix   Ontology                               Used for
  bibo     Bibliography Ontology                  Sources
  dc       Dublin Core Elements                   common
  dct      Dublin Core Terms                      common
  foaf     Friend of a Friend ontology            Contributors
  iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
  owl      Web Ontology Language                  Basic RDF representation
  prov     Provenance Ontology                    Revision history
  rdf      Resource Description Framework         Basic RDF representation
  rdfs     RDF Schema                             Basic RDF representation
  schema   Schema.org                             common, geo (TGN), bio (ULAN)
  skos     Simple Knowledge Organization System   Basic vocabulary representation
  skosxl   SKOS Extension for Labels              Rich labels
  wgs      W3C World Geodetic Survey geo          Geo (TGN)
  xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to a Custom sub-class, a Custom (sub-)prop, or an Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
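What inverse and symmetric declarations buy you can be sketched in a few lines: given a table of inverse pairs (a symmetric property is simply its own inverse), every asserted relation yields the reverse statement as well. The property and concept names below are hypothetical ex: identifiers, not actual GVP relations, and real triple stores do this with OWL reasoning rather than Python.

```python
def with_inverses(triples, inverse_of):
    """Add the reverse statement for every property that has a declared inverse."""
    table = dict(inverse_of)
    # owl:inverseOf works in both directions.
    table.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in table:
            result.add((o, table[p], s))
    return result


# Hypothetical relation names, for illustration only.
INVERSES = {
    "ex:produced_by": "ex:produces",                  # inverse pair
    "ex:distinguished_from": "ex:distinguished_from", # symmetric: self-inverse
}

asserted = [
    ("ex:fresco", "ex:produced_by", "ex:fresco_painters"),
    ("ex:tempera", "ex:distinguished_from", "ex:gouache"),
]
closed = with_inverses(asserted, INVERSES)
for triple in sorted(closed):
    print(triple)
```

Cataloguers only need to enter a relation in one direction, while queries can traverse it from either end.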
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V, Cobb J, Garcia G, Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in "Getty Vocabs: Why LOD? Why Now?" (J.Cobb, 2014). E.g.
AAT is used in the Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like this one
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
  In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
  In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
  Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
  Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
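The idea can be sketched as follows: one source record is expanded into several Person stubs, each carrying its relation to the main person plus a few likely (not certain) inferences. The field names follow the sample record; the output layout and the inference rules (the spouse shares the marriage date and has the listed maiden name) are assumptions made for this illustration and would need review by historians before real use.

```python
def related_persons(rec):
    """Expand one record into Person stubs for everyone it mentions."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    people = [{"name": main_name, "relation": "self",
               "gender": rec.get("gender"),
               "dateMarriage": rec.get("dateMarriage")}]
    if rec.get("nameSpouse"):
        people.append({
            "name": rec["nameSpouse"], "relation": "spouse",
            # Likely inferences: same marriage date, maiden name from the record.
            "dateMarriage": rec.get("dateMarriage"),
            "maidenName": rec.get("nameSpouseMaiden"),
        })
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if rec.get(key):
            people.append({"name": rec[key], "relation": relation})
    # Every stub remembers which record it came from, for later matching.
    for person in people:
        person["sourceRecord"] = rec["id"]
    return people


record = {"id": "123456", "firstName": "John", "lastName": "Smith",
          "gender": "Male", "dateMarriage": "1921-01-05",
          "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
          "nameChild": "Mike Smith", "nameSibling": "Jack Jones"}

persons = related_persons(record)
for p in persons:
    print(p["relation"], "->", p["name"])
```

The stubs can then be compared against existing Person records (by name, dates and relations) to build the networks the research question asks about.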
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
  ONTO: Place enrichment, Person name recognition
  INRIA: word2vec experiments

  guard      Cos dist    punishment    Cos dist
  guarding   0.593507    punishments   0.668144
  sentry     0.512083    punish        0.601212
  hlinka     0.496201    punishing     0.543213
  gate       0.490032    beatings      0.527033
  watching   0.484647    penalty       0.497262
  rifle      0.484379    deserved      0.490157
  lookout    0.482025    beaten        0.473870
  patrol     0.477233    straf         0.473338
  soldier    0.475982    offense       0.461230
  guarded    0.474689    executing     0.459965
  police     0.474291    merciless     0.455123

  semantic differencing (interesting): KGB - Stalin + Hitler = SS
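Both the neighbour scores and the "semantic differencing" trick rest on vector arithmetic. The sketch below computes cosine similarity and solves an analogy a - b + c over a tiny hand-made vocabulary. The three-dimensional vectors are invented so that the arithmetic works out; real word2vec vectors have hundreds of dimensions learned from the interview text.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors; the dimensions are made up for the example.
toy = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "gate":   [0.2, 0.1, 0.1],
}

result = analogy(toy, "KGB", "Stalin", "Hitler")
print(result)
```

Subtracting one vector and adding another moves the query point along the dimension on which the two differ, and the nearest remaining word is returned as the answer.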
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6.13.4 NOMISMA
Shared authorities for numismatics, e.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation (medallic art of the Great War)
6.13.4 NOMISMA
Shared authorities for numismatics. Eg a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
Galleries, Libraries, Archives, Museums (GLAM): a sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities
Eg some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum
1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible
1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs over time: in recent books, Facebook is more popular than British Museum
1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum
1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom
Aren't Google, Wikipedia, Facebook, Twitter and smartphone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victim to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes
How to preserve the role of GLAMs into the new millennium?
To survive, GLAMs must adopt the internet as their default modus operandi:
Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches
1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual, and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:
Exception is the rule
Many metadata format variations
Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards:
Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums
2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD
2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS
2.1.3 CCO EXAMPLE: CREATOR EXTENT
How to describe one aspect of the data
2.1.4 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions
Image by D.Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
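The four WEMI levels can be sketched as a chain of linked records. This is only an illustration of the levels listed above, not any official FRBR serialization; the class and field names, the publisher and the holding library are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    """The intellectual work, independent of language or edition."""
    title: str


@dataclass
class Expression:
    """A translation or edition of a Work."""
    work: Work
    language: str


@dataclass
class Manifestation:
    """A publisher's product embodying an Expression; ISBNs live at this level."""
    expression: Expression
    publisher: str
    isbn: Optional[str] = None


@dataclass
class Item:
    """A physical copy; loan status is tracked at this level."""
    manifestation: Manifestation
    holder: str
    on_loan: bool = False


def work_of(item: Item) -> Work:
    """Walk up from a physical copy to the abstract Work."""
    return item.manifestation.expression.work


don_quixote = Work("Don Quixote")
english = Expression(don_quixote, language="en")
# Publisher and holder are hypothetical; the ISBN is left unset rather than invented
edition = Manifestation(english, publisher="Example Press")
copy1 = Item(edition, holder="Example City Library")
copy2 = Item(edition, holder="Example City Library", on_loan=True)

available = [c for c in (copy1, copy2) if not c.on_loan]
print(work_of(copy2).title, len(available))
```

Grouping copies by `work_of` is what lets a catalogue show "all editions and translations of Don Quixote" instead of a flat list of records.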
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer. Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
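What "cross-field validation" means can be shown with a small stand-in for a Schematron rule. The sketch below does the check in Python with the standard library instead of Schematron itself; the `record`, `dateStart` and `dateEnd` elements are hypothetical and not taken from EAD or any other schema mentioned here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second record has its dates the wrong way round
SAMPLE = """<records>
  <record id="r1"><dateStart>1939</dateStart><dateEnd>1945</dateEnd></record>
  <record id="r2"><dateStart>1950</dateStart><dateEnd>1942</dateEnd></record>
</records>"""


def check_date_order(xml_text: str) -> list:
    """Cross-field rule: dateStart must not be later than dateEnd.

    A grammar-based schema can require both elements to be present and to be
    numbers, but relating the value of one field to another needs a rule language.
    """
    root = ET.fromstring(xml_text)
    errors = []
    for rec in root.findall("record"):
        start = int(rec.findtext("dateStart"))
        end = int(rec.findtext("dateEnd"))
        if start > end:
            # Report the location the way a rule-based validator would
            errors.append(
                f"record[@id='{rec.get('id')}']: dateStart {start} is after dateEnd {end}"
            )
    return errors


errors = check_date_order(SAMPLE)
for e in errors:
    print(e)
```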
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
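To show what can and cannot be recovered from such markup, here is a minimal sketch that pulls dates, event text and tagged person names out of a bioghist chronology using the standard library. The output dictionary layout is an assumption made for the example. Note that the second event yields no persons at all, and the first yields only name strings, not links to person records.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text: str) -> list:
    """Return one dict per chronitem: normalized date, plain text, tagged names."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        # Join mixed content into one string and collapse whitespace
        text = " ".join("".join(event.itertext()).split())
        events.append({
            "date": date.get("normal"),
            "text": text,
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = extract_events(BIOGHIST)
for e in events:
    print(e)
```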
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
MARC is dead (is it really?): presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info
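The point about nesting bias can be made concrete with a toy "lifting" function that flattens records into subject-predicate-object triples. This is a sketch in plain Python rather than a real RDF library; the `@id` convention is borrowed from JSON-LD, and the `ex:` identifiers and property names are invented. A nested record and a by-reference record lift to the same set of triples.

```python
import itertools

_blank_ids = itertools.count(1)


def lift(record: dict, triples: list) -> str:
    """Append the record's triples to `triples` and return the record's subject id."""
    subject = record.get("@id") or f"_:b{next(_blank_ids)}"
    for key, value in record.items():
        if key == "@id":
            continue
        for v in value if isinstance(value, list) else [value]:
            # A nested record becomes a node of its own, linked by its id
            obj = lift(v, triples) if isinstance(v, dict) else v
            triples.append((subject, key, obj))
    return subject


# Variant 1: the creator is nested inside the object record
nested = {
    "@id": "ex:painting1",
    "title": "Example portrait",
    "creator": {"@id": "ex:artist1", "name": "Frans Hals"},
}

# Variant 2: the creator is a separate record, referenced by id
referenced = [
    {"@id": "ex:painting1", "title": "Example portrait", "creator": {"@id": "ex:artist1"}},
    {"@id": "ex:artist1", "name": "Frans Hals"},
]

nested_triples = []
lift(nested, nested_triples)

referenced_triples = []
for rec in referenced:
    lift(rec, referenced_triples)

print(sorted(nested_triples))
print(set(nested_triples) == set(referenced_triples))
```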
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
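The short-cut vs long-path construct can be sketched as a pair of triple rewrites. The CRM property and class names are the ones named above plus crm:E16_Measurement; the `ex:` node names and the two helper functions are assumptions made for the example. Expanding a short-cut creates an intermediate Measurement node, which is the place to attach extra details (who measured, when).

```python
import itertools

SHORTCUT = "crm:P43_has_dimension"
LONG_1 = "crm:P39i_was_measured_by"
LONG_2 = "crm:P40_observed_dimension"

_ids = itertools.count(1)


def expand_shortcut(triples: list) -> list:
    """Replace each short-cut triple by the long path through a new Measurement node."""
    out = []
    for s, p, o in triples:
        if p != SHORTCUT:
            out.append((s, p, o))
            continue
        m = f"ex:measurement{next(_ids)}"  # hypothetical node name
        out.append((s, LONG_1, m))
        out.append((m, "rdf:type", "crm:E16_Measurement"))
        out.append((m, LONG_2, o))
    return out


def contract_long_path(triples: list) -> list:
    """Derive the short-cut triples implied by long paths (the inference direction)."""
    measured = {o: s for s, p, o in triples if p == LONG_1}
    return [(measured[s], SHORTCUT, o)
            for s, p, o in triples
            if p == LONG_2 and s in measured]


short = [("ex:painting1", SHORTCUT, "ex:height1")]
long_path = expand_shortcut(short)
for t in long_path:
    print(t)
print(contract_long_path(long_path))
```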
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of the Complete Example from the spec (using my rdfpuml)
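For the "Image and region over it" case, a minimal annotation in the JSON-LD shape of the W3C Web Annotation Data Model might look like the sketch below. The example.org URLs, the comment text and the helper function are hypothetical; the keys (`body`, `target`, `selector`, `FragmentSelector` with a media-fragment `xywh` value) follow the W3C model.

```python
import json


def annotate_region(anno_id, image_url, x, y, w, h, comment):
    """Build an annotation whose target is a rectangular region of an image."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = annotate_region(
    "http://example.org/anno/1",            # hypothetical annotation URL
    "http://example.org/images/page1.jpg",  # hypothetical image URL
    100, 50, 300, 200,
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```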
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
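The IIIF Image API addresses an image, or any region of it, through a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of building such URLs follows; the server base URL and the identifier are hypothetical. The default size "max" is valid in Image API 2.1 and 3.0, while 2.0 servers use "full" instead.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    The identifier is percent-encoded, so slashes or spaces inside it do not
    break the path structure.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


base = "https://example.org/iiif"  # hypothetical image server

# Whole image at the largest size the server offers
whole = iiif_image_url(base, "ms 12/f1r")

# A 300x200 pixel region at offset (100, 50), scaled to 600 pixels wide
detail = iiif_image_url(base, "ms 12/f1r", region="100,50,300,200", size="600,")

print(whole)
print(detail)
```

Because the region is part of the URL, a deep-zoom viewer only fetches the tiles currently on screen, and an annotation can point at exactly the detail it discusses.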
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentation IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry: info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk. See also marc2rdf
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation ;
    dc:title "About time" ;
    d:titleURLized "about_time" ;
    fabio:hasSubtitle "Einstein's unfinished revolution" ;
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
    dc:language lexvo:eng ;
    d:bibliofilID "931138" ;
    dc:format <http://data.deichman.no/format/Book> ;
    d:location_signature "Dav" ;
    dc:publisher d_org:penguin ;
    bibo:numPages "316" ;
    d:physicalDescription "fig" ;
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
    fabio:isManifestationOf d_work:x24918900_about_time ;
    d:signatureNote "07x0619gq" ;
    d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
    d:bsID "0181541" ;
    dc:description "Bibliografi: s. 293-294"@no ;
    d:priceInfo "Nkr 17000" ;
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
    dc:identifier "749919" ;
    d:dewey "115", "530.11" ;
    d:location_dewey "530.11" ;
    bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). Eg Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
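Why coreferencing across datasets pays off can be sketched in a few lines: pairwise "same as" links are merged into clusters, and each cluster then pools the name forms of all its members. The identifiers below are placeholders (not real VIAF, Wikidata, GND or ULAN IDs), the name forms are illustrative, and the union-find clustering is a generic technique rather than the method used in the study.

```python
def cluster(links):
    """Group identifiers connected by coreference links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


# Placeholder identifiers and coreference links
links = [
    ("wd:A", "viaf:1"),   # a Wikidata item linked to a VIAF cluster
    ("viaf:1", "gnd:X"),  # VIAF constituent
    ("wd:B", "ulan:9"),   # an item linked only to a non-VIAF dataset
]
names = {
    "wd:A": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    "viaf:1": {"Cranach, Lucas"},
    "gnd:X": {"Cranach, Lucas"},
    "wd:B": {"Example Artist"},
    "ulan:9": {"Artist, Example"},
}

clusters = cluster(links)
merged = [set().union(*(names.get(i, set()) for i in c)) for c in clusters]
for ids, forms in zip(clusters, merged):
    print(sorted(ids), sorted(forms))
```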
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed the mapping of BM data (2M objects) together with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
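One simple way to prune a category tree is a breadth-first walk from a root category with a depth limit and a list of excluded branches. The sketch below illustrates that general technique only; it is not the procedure described in the cited paper, and the toy category graph, depth limit and exclusion list are invented.

```python
from collections import deque

# Toy subcategory graph (invented); note the cycle Pancakes -> Foods
SUBCATS = {
    "Food and drink": ["Beverages", "Foods", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Foods": ["Breads", "Pancakes"],
    "Food and drink companies": ["Breweries"],
    "Beer": ["Beer festivals"],
    "Pancakes": ["Foods"],
}


def collect_categories(root, subcats, max_depth, excluded=()):
    """Return {category: depth} for categories reachable from root.

    Branches below an excluded category are dropped, and the walk stops at
    max_depth, since deep Wikipedia categories tend to drift off topic.
    """
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        depth = depth_of[cat]
        if depth == max_depth:
            continue
        for child in subcats.get(cat, []):
            if child in excluded or child in depth_of:
                continue  # pruned branch, or already visited (handles cycles)
            depth_of[child] = depth + 1
            queue.append(child)
    return depth_of


kept = collect_categories(
    "Food and drink", SUBCATS, max_depth=2,
    excluded={"Food and drink companies"},
)
print(kept)
```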
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
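Jittering can be sketched as adding a small random offset to each marker that shares the same coordinates. The centre coordinates for Bulgaria below are approximate, and the jitter radius and the uniform distribution are arbitrary choices for illustration, not the settings used in the EFD application.

```python
import random

BULGARIA_CENTER = (42.7, 25.5)  # approximate latitude, longitude


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Move a point by a random offset of at most radius_deg in each direction."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


rng = random.Random(42)  # seeded, so the map looks the same on every run
object_ids = [f"object{i}" for i in range(5)]  # stand-ins for the 9k objects
points = {oid: jitter(*BULGARIA_CENTER, rng=rng) for oid in object_ids}

for oid, (lat, lon) in points.items():
    print(oid, round(lat, 3), round(lon, 3))
```

Jittered positions are for display only; the stored place reference should stay "Bulgaria", so that faceted search by place is not affected.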
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014)
Eg AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6.7.12 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
1.1 GLAM VS INTERNET
GLAM, CH, DH
Cultural Heritage (CH): the sum of our non-economic heritage
Obvious implications to economically significant sectors, e.g. tourism
Some say it's the source of all creativity: would you agree?
Includes old and new (e.g. digitally-born), material and immaterial, tangible and intangible, permanent and temporal (e.g. interactive installations)
Galleries, Libraries, Archives, Museums (GLAM): sisterhood of institutions that care for our CH, each with its own perspective and priorities
Digital Humanities (DH): the use of computers in the humanities
E.g. some UK universities with DH programs: KingsDH, UCLDH, DH_OU, CamDigHum
1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible
1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Compare two specific orgs over time: Facebook is more popular than British Museum in recent books
1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum
1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom
Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victim to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes
How to preserve the role of GLAMs into the new millennium?
To survive, GLAMs must adopt the internet as their default modus operandi
Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches
1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied
Exception is the rule
Many metadata format variations
Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards
Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums
2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD
2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS
2.1.3 CCO EXAMPLE: CREATOR EXTENT
How to describe one aspect of the data
2.1.4 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D.Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability, famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
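The four WEMI levels nest naturally, which a few Python dataclasses can illustrate. This is only a sketch of the containment hierarchy, not an implementation of FRBR: the field names, the available_items helper, and the publisher, ISBN and barcode values are placeholders invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    barcode: str              # placeholder local identifier of a physical copy
    on_loan: bool = False     # loan/availability is tracked at Item level


@dataclass
class Manifestation:
    publisher: str
    isbn: str                 # ISBNs identify Manifestations
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str             # a translation or edition of the Work
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def available_items(work: Work, language: str) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item, keeping copies on the shelf."""
    return [
        item
        for expression in work.expressions if expression.language == language
        for manifestation in expression.manifestations
        for item in manifestation.items if not item.on_loan
    ]


don_quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Editorial", "ISBN-PLACEHOLDER-1",
                                    [Item("copy-1")])]),
    Expression("en", [Manifestation("Example Press", "ISBN-PLACEHOLDER-2",
                                    [Item("copy-2", on_loan=True), Item("copy-3")])]),
])

print([item.barcode for item in available_items(don_quixote, "en")])
```

A user task such as "obtain an English copy" then becomes a walk down the four levels, which is why the level at which each attribute lives matters.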
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC; Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements)
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
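The bioghist example shows the problem: only the date (and, when someone bothers, the person names) is machine-readable. A short Python sketch with the standard library's xml.etree.ElementTree extracts what can be extracted. The snippet repeats the example without a namespace; real EAD files usually declare one, and the chron_events helper is my own name, so treat this as an illustration under those assumptions.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content (text plus persname elements) to one string.
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that were explicitly tagged can be recovered as data.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


for entry in chron_events(EAD_SNIPPET):
    print(entry)
```

The second event yields an empty persons list: whatever people were involved in the reorganization stay locked in prose, and even the tagged names are strings rather than links.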
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant (2002):
marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
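A tiny Python sketch of why the triple model is "simple and abstracted": a graph is just a set of (subject, predicate, object) statements, so there is no question of what nests inside what, and merging data from two sources is a set union. The ex: identifiers and the two sample graphs are invented for illustration; a real application would use an RDF library and full IRIs.

```python
def merge(*graphs):
    """Merging RDF graphs is a set union: no nesting decisions to reconcile."""
    return set().union(*graphs)


def objects(graph, subject, predicate):
    """All values of a property for a subject, from whichever source they came."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Two hypothetical sources describing the same painting and its creator.
museum_graph = {
    ("ex:painting1", "dc:title", "Portrait of a Man"),
    ("ex:painting1", "dc:creator", "ex:artist1"),
}
authority_graph = {
    ("ex:artist1", "skos:prefLabel", "Hals, Frans"),
    ("ex:painting1", "dc:title", "Portret van een man"),
}

graph = merge(museum_graph, authority_graph)
print(objects(graph, "ex:painting1", "dc:title"))
print(objects(graph, "ex:artist1", "skos:prefLabel"))
```

In XML the same merge would first require agreeing on whether the artist is a nested element or a reference by ID; with triples the creator link is a statement like any other.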
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
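The short-cut vs long-path construct can be sketched in Python over plain triples: whenever a thing was measured by a Measurement that observed a Dimension, the direct has-dimension statement follows. The property names are the CRM ones mentioned above; the ex: instances and the infer_shortcuts helper are hypothetical, and a production system would do this with repository inference rules rather than Python loops.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"


def infer_shortcuts(triples):
    """Derive thing -P43-> dimension from thing -P39i-> measurement -P40-> dimension."""
    triples = set(triples)
    inferred = set()
    for thing, p1, measurement in triples:
        if p1 != LONG_PATH[0]:
            continue
        for subject, p2, dimension in triples:
            if subject == measurement and p2 == LONG_PATH[1]:
                inferred.add((thing, SHORTCUT, dimension))
    return inferred


# Hypothetical instance data: a painting measured once, observing its height.
data = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    # Extra detail that only the long path can carry: when it was measured.
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan1"),
}

print(infer_shortcuts(data))
```

The long path keeps the Measurement event, so details like who measured and when have a place to live; the short-cut is what a simple search interface wants to query.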
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
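For a feel of what an annotation looks like in practice, here is a hedged Python sketch that builds one as JSON-LD, following the body/target structure of the W3C Web Annotation Data Model (the successor of the OA drafts). The make_annotation helper and all example.org IRIs are invented for illustration; the "image and region over it" case is expressed with a media-fragment selector.

```python
import json


def make_annotation(anno_id, body_text, target_iri, xywh=None):
    """Build a Web Annotation as a JSON-LD dict; xywh optionally selects an image region."""
    target = target_iri
    if xywh is not None:
        # Region of an image, expressed as a W3C Media Fragment (x, y, width, height).
        target = {
            "source": target_iri,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": "xywh=%d,%d,%d,%d" % tuple(xywh),
            },
        }
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": body_text, "format": "text/plain"},
        "target": target,
    }


annotation = make_annotation(
    "http://example.org/anno/1",            # hypothetical annotation IRI
    "Signature of the artist",
    "http://example.org/images/drawing1.jpg",  # hypothetical image IRI
    xywh=(10, 20, 100, 50),
)
print(json.dumps(annotation, indent=2))
```

The other pairs listed above (document and translation, paragraph and commentary) use the same shape with a different body or selector.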
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
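What makes IIIF a "standard API" is that every compliant server answers the same URL pattern: base, identifier, region, size, rotation, then quality and format. A small Python sketch of building such URLs follows; the server base and identifier are hypothetical, and the default size "full" follows version 2 of the Image API (version 3 uses "max" instead), so adjust to the server at hand.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # Slashes and other reserved characters in the identifier must be percent-encoded.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


BASE = "https://example.org/iiif"   # hypothetical image server

# Whole image at full size
print(iiif_image_url(BASE, "manuscripts/folio1"))
# A 512x512 pixel detail starting at (1024, 2048), scaled to 256 pixels wide
print(iiif_image_url(BASE, "manuscripts/folio1", region="1024,2048,512,512", size="256,"))
```

Deep-zoom viewers work by requesting many such region/size combinations as tiles.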
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
The Resource Description and Access (RDA) Registry (RDAregistry.info) is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Contexts (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 10 (Sep 2016), Mlist for comments: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers are available as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). Prosopography is Greek for Facebook (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed with BM a mapping of BM data (2M objects) using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
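Jittering itself is simple, as this Python sketch suggests: each object that only knows its country gets the country's reference point plus a small random offset. The uniform offset, its half-degree default and the approximate coordinates used for Bulgaria are assumptions of this sketch, not the values used in the EFD app.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=None):
    """Return the point moved by a random offset of at most max_offset_deg per axis."""
    if rng is None:
        rng = random.Random()
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate reference point for "Bulgaria" (an assumption for this example).
BULGARIA = (42.7, 25.5)

rng = random.Random(2016)  # fixed seed, so the map looks the same on every run
flags = [jitter(*BULGARIA, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed (or an offset derived from the object id) keeps each flag in the same spot between page loads, so users don't see markers jumping around.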
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka, in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A., International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse)
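What the owl:inverseOf pairs buy you can be shown in a few lines of Python: state a relation in one direction and the other direction follows. The ex: relation names and concepts below are invented placeholders, not actual GVP relation codes, and in the real setup this inference is done by the semantic repository from the generated ontology.

```python
def add_inverses(triples, inverse_of, symmetric=()):
    """Return the triples plus the statements implied by inverse and symmetric properties."""
    # owl:inverseOf works in both directions, so index the pairs both ways.
    inverse = dict(inverse_of)
    inverse.update({v: k for k, v in inverse_of.items()})
    # An owl:SymmetricProperty is its own inverse.
    for prop in symmetric:
        inverse[prop] = prop
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical associative relations between thesaurus concepts.
INVERSE_PAIRS = {"ex:produces": "ex:produced_by"}
SYMMETRIC = ["ex:distinguished_from"]

stated = {
    ("ex:potters", "ex:produces", "ex:pottery"),
    ("ex:ceramics", "ex:distinguished_from", "ex:pottery"),
}

for triple in sorted(add_inverses(stated, INVERSE_PAIRS, SYMMETRIC)):
    print(triple)
```

Because each relation row in the spreadsheet names its inverse, the generated ontology lets users query from either end without the data having to state both.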
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6.7.12 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
1.2 GOOGLE NGRAMS: PHRASES IN BOOKS
Search for library, museum vs Google, Facebook, Twitter in books: the web sites are negligible.
1.3 GOOGLE NGRAMS: TWO SPECIFIC ORGS
Comparing two specific orgs over time: in recent books, Facebook is more popular than British Museum.
1.4 GOOGLE TRENDS: SEARCH POPULARITY
Web searches over the last 12 years: Facebook, Google are much more popular than library, museum.
1.5 HOW TO SURVIVE IN THE INTERNET AGE
Since ancient times, GLAMs have been the centers of knowledge and wisdom.
Aren't Google, Wikipedia, Facebook, Twitter and smart-phone apps becoming the new centers of research and culture (or at least popular culture)?
Will GLAMs fall victims to teenagers with smartphones browsing Facebook? If the library's attitude is "Come search in our OPAC", then certainly yes.
How to preserve the role of GLAMs into the new millennium?
To survive, GLAMs must adopt the internet as their default modus operandi:
Web 1.0: presentation
Web 2.0: interaction
Web 3.0 (semantic web): data linking, enriching, disambiguating text using NLP/IE approaches
1.6 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual, and interlinked.
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net.
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions.
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:
Exception is the rule
Many metadata format variations
Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards:
Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data.
2.1 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums.
2.1.1 CCO EXAMPLE: ARTWORK AND CREATOR RECORD
2.1.2 CCO EXAMPLE: HIERARCHICAL LINK BETWEEN 2 ARTWORKS
2.1.3 CCO EXAMPLE: CREATOR EXTENT
How to describe one aspect of the data.
2.1.4 SPECTRUM
UK Museum Collections Management Standard:
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions
Image by D. Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD). (J. Mitchell, M. Zeng, M. Zumer, 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
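The four WEMI levels can be sketched as nested record types. This is only an illustration of the hierarchy described above: the class and field names are my own, and the identifiers are placeholders rather than real ISBNs or shelf marks.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    """One physical copy; the level a library tracks for loans."""
    copy_id: str  # hypothetical local identifier

@dataclass
class Manifestation:
    """A publisher's product; the level that carries ISBNs."""
    isbn: str
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:
    """A translation or edition of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

def items_of(work: Work) -> List[Item]:
    """All physical copies of a work, across expressions and manifestations."""
    return [item
            for expr in work.expressions
            for manif in expr.manifestations
            for item in manif.items]

don_quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("ISBN-PLACEHOLDER-1", [Item("copy-1")])]),
    Expression("en", [Manifestation("ISBN-PLACEHOLDER-2",
                                    [Item("copy-2"), Item("copy-3")])]),
])
print(len(items_of(don_quixote)))
```

Asking for all copies of the Work, whatever the translation, is the kind of question a flat record per ISBN makes hard to answer.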
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements).
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension.
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings) not links (things)
EAC: Events include lots of info but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
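To see what a consumer can actually get out of such a bioghist, here is a small sketch that walks the chronology with Python's standard ElementTree. It assumes the EAD fragment is un-namespaced, as in the example; the normalized dates come out cleanly, but the persons are still only name strings with no identifier to link on.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname>
        succeeds <persname normal="Othername, John">John Othername</persname>
        as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

root = ET.fromstring(BIOGHIST)
events = []
for item in root.iter("chronitem"):
    date = item.find("date").get("normal")
    # Tagged persons, if the cataloguer bothered to tag them at all
    persons = [p.get("normal") for p in item.iter("persname")]
    events.append((date, persons))

for date, persons in events:
    print(date, persons)
```

The second event has no tagged agents at all, which is the "could be tagged but often are not" problem in miniature.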
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
MARC is dead (is it really?): in-depth discussion
marc-must-die.info wiki
FutureLib Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
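The point about nesting biases can be illustrated with a toy "lifting" function: a nested record and a by-reference record flatten to the same set of triples. This is a sketch only. The `ex:` identifiers and the `lift` helper are invented for the example, and it does not distinguish literals from IRIs as real RDF would.

```python
from itertools import count

def lift(record, triples, ids):
    """Flatten a (possibly nested) dict into (subject, property, value) triples."""
    # Records without an "id" get a blank-node style label
    subject = record.get("id") or f"_:b{next(ids)}"
    for prop, value in record.items():
        if prop == "id":
            continue
        for v in (value if isinstance(value, list) else [value]):
            # A nested dict becomes its own subject; we only keep a link to it
            obj = lift(v, triples, ids) if isinstance(v, dict) else v
            triples.append((subject, prop, obj))
    return subject

# Same information, two XML-like shapes: nested vs referenced by ID
nested = {"id": "ex:painting1", "dc:title": "Portrait",
          "dc:creator": {"id": "ex:hals", "foaf:name": "Frans Hals"}}
referenced = [
    {"id": "ex:painting1", "dc:title": "Portrait", "dc:creator": "ex:hals"},
    {"id": "ex:hals", "foaf:name": "Frans Hals"},
]

nested_triples = []
lift(nested, nested_triples, count(1))

referenced_triples = []
ids = count(1)
for rec in referenced:
    lift(rec, referenced_triples, ids)

print(set(nested_triples) == set(referenced_triples))
```

Once lifted, the choice between nesting and referencing is gone; "lowering" to XML or JSON has to reintroduce it.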
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM: inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
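The short-cut vs long-path construct can be sketched as a rewrite over triples: each P43 short-cut is expanded into an explicit measurement event, which then has room for extra detail (who measured, when). The tuple representation, the `ex:` identifiers and the `expand_dimension_shortcut` helper are assumptions made for this illustration; only the CRM property and class names come from the standard.

```python
def expand_dimension_shortcut(triples):
    """Replace crm:P43_has_dimension short-cuts with the long path
    via an intermediate crm:E16_Measurement node."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            counter += 1
            measurement = f"_:measurement{counter}"  # freshly minted node
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded

short_cut = [
    ("ex:painting1", "crm:P2_has_type", "ex:type/painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:painting1/height"),
]
long_path = expand_dimension_shortcut(short_cut)
for triple in long_path:
    print(triple)
```

Going the other way (long path to short-cut) is a simple inference, which is why search implementations often materialize the short-cuts.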
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
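As a rough sketch of the "image and region over it" case, the following builds a JSON-LD annotation in the shape of the W3C Web Annotation Data Model: a textual body, and a target made of a source image plus a media-fragment selector. The URLs and the `make_region_annotation` helper are hypothetical; the keys and type names are those of the W3C model as I understand it.

```python
import json

def make_region_annotation(anno_id, comment, image_url, x, y, w, h):
    """Annotate a rectangular region of an image with a plain-text comment."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

anno = make_region_annotation(
    "http://example.org/anno/1",             # hypothetical annotation IRI
    "Signature of the artist",
    "http://example.org/images/painting1.jpg",  # hypothetical image
    10, 20, 300, 200,
)
print(json.dumps(anno, indent=2))
```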
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
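What makes the image API "standard" is mostly its URL pattern: identifier, region, size, rotation, quality and format are path segments in a fixed order. A minimal sketch of building such URLs follows; the server base and identifiers are made up, and the defaults (`full` region, `max` size) should be checked against the API version your server implements.

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier must be percent-encoded, including any slashes in it
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Whole image at the largest size the (hypothetical) server offers
whole = iiif_image_url("https://example.org/iiif", "book1-page1")

# A 512x512 pixel detail from the top-left corner, scaled to 256 pixels wide
detail = iiif_image_url("https://example.org/iiif/", "ms/folio-1r",
                        region="0,0,512,512", size="256,")
print(whole)
print(detail)
```

Because a viewer only needs to vary the region and size segments, any compliant viewer can deep-zoom images from any compliant server.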
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations on IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, eg on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keywordimaginary, d_keyworddilation, d_keywordtime, d_keywordtidsreiser, d_keywordtidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvoeng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_orgpenguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subjecteinstein_albert, d_subjecttid_metafysikk;
  fabio:isManifestationOf d_workx24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). Eg Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
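One simple way to jitter is to add a small random offset to each marker's coordinates. The sketch below is an assumption about how this could be done, not the EFD implementation: the offset size, the uniform distribution and the `jitter` helper are all illustrative choices, and the centre coordinates are only approximate.

```python
import random

def jitter(lat, lon, max_offset=0.5, rng=random):
    """Move a point by a random offset of at most max_offset degrees per axis."""
    return (lat + rng.uniform(-max_offset, max_offset),
            lon + rng.uniform(-max_offset, max_offset))

rng = random.Random(42)       # seeded so the layout is reproducible
center = (42.7, 25.5)         # roughly the middle of Bulgaria (approximate)
points = [jitter(*center, rng=rng) for _ in range(9)]
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed offset in degrees is crude (a degree of longitude shrinks towards the poles), so a real map would probably scale the offset to the size of the place being jittered.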
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure, V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
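The idea can be sketched as follows: each "additional name" field on a record becomes a new Person stub linked back to the source record, with any inference flagged as merely likely. The field names follow the sample record above; the output structure, the `mentioned_persons` helper and the single inference rule are assumptions for illustration, not the EHRI pipeline.

```python
def mentioned_persons(rec):
    """Create Person stubs for relatives named in a record."""
    relation_fields = {
        "nameSpouse": "spouse",
        "nameChild": "child",
        "nameSibling": "sibling",
    }
    persons = []
    for field_name, relation in relation_fields.items():
        if field_name not in rec:
            continue
        person = {
            "name": rec[field_name],
            "relation": relation,
            "relatedTo": rec["id"],   # link back to the source record
        }
        if relation == "spouse":
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            # Likely, not certain: an inference to be confirmed by matching
            if rec.get("gender") == "Male":
                person["likelyGender"] = "Female"
        persons.append(person)
    return persons

record = {
    "id": "123456", "firstName": "John", "lastName": "Smith",
    "gender": "Male", "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
stubs = mentioned_persons(record)
for stub in stubs:
    print(stub)
```

The stubs can then be compared against existing Person records (by name, dates, places) to find candidate matches and grow the network.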
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews:
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
13 GOOGLE NGRAMS TWO SPECIFIC ORGSCompare two speci c orgs Facebook is more popular in recent books compared toBritish Museum over time
14 GOOGLE TRENDS SEARCH POPULARITYWeb searches over the last 12 years Facebook Google are much more popular thanlibrary museum
15 HOW TO SURVIVE IN THE INTERNET AGESince ancient times GLAMs have been the centers of knowledge and wisdom
Arenrsquot Google Wikipedia Facebook Twitter and smart-phone apps becoming thenew centers of research and culture (or at least popular culture)Will GLAMs fall victims to teenagers with smartphones browsing Facebook If thelibrarys attitude is Come search in our OPAC then certainly yesHow to preserve the role of GLAMs into the new millennium
To survive GLAMs must adopt the internet as their default modus operandi
Web 10 presentationWeb 20 interactionWeb 30 (semantic web) data linking enrichingdisambiguating text using NLPIEapproaches
16 WHY LINKED OPEN DATA (LOD) IS IMPORTANTCulture is naturally cross-institutional cross-border multilingual and interlinkedLOD allows making connections between (and making sense of) the multitude ofdigitized cultural artifacts available on the netLOD enables large-scale Digital Humanities research collaboration and aggregationtechnological renewal of CH institutions
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:
Exception is the rule
Many metadata format variations
Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards:
Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Zumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
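The four WEMI levels nest naturally, so here is a minimal sketch of them as Python dataclasses. The class fields, the placeholder ISBN strings and the copy identifiers are all assumptions for illustration; they are not taken from FRBR itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    copy_id: str  # hypothetical local identifier of one physical copy


@dataclass
class Manifestation:
    isbn: str  # ISBNs live at the Manifestation level
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str  # e.g. a translation
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def copies_of(work: Work) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item and collect all copies."""
    return [item
            for expression in work.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items]


# Illustrative data only: placeholder ISBNs and copy ids.
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("isbn-placeholder-1", [Item("copy-1")])]),
    Expression("en", [Manifestation("isbn-placeholder-2",
                                    [Item("copy-2"), Item("copy-3")])]),
])
```

Asking for all copies of the Work, regardless of translation or edition, is then a single walk down the hierarchy: `copies_of(quixote)`.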
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
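To make "cross-field validation" concrete, here is a small sketch of the kind of rule a grammar-based schema cannot express: one field must not exceed another. It is plain Python with the standard library, not Schematron, and the element names (objects, object, earliest, latest) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: each object has an earliest and a latest year.
SAMPLE = """<objects>
  <object id="a"><earliest>1620</earliest><latest>1625</latest></object>
  <object id="b"><earliest>1700</earliest><latest>1650</latest></object>
</objects>"""


def check_date_order(xml_text):
    """Return one message per object whose earliest year is after its latest year."""
    errors = []
    root = ET.fromstring(xml_text)
    for position, obj in enumerate(root.findall("./object"), start=1):
        earliest = int(obj.findtext("earliest"))
        latest = int(obj.findtext("latest"))
        if earliest > latest:
            # Report an XPath-like location, in the spirit of Schematron's SVRL output.
            errors.append(
                f"/objects/object[{position}]: earliest {earliest} is after latest {latest}")
    return errors
```

In Schematron the same rule would be an assertion with an XPath test on the object element; the point here is only that the check spans two fields at once.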
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context for Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as an Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
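When the dates and person names in a bioghist chronology are tagged, they can be pulled out mechanically. The sketch below parses the EADiva-style example above with the Python standard library; the output dictionary layout is my own assumption, not an EAD or EHRI convention.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""


def extract_events(xml_text):
    """Turn each chronitem into a dict with normalized date, tagged people and event text."""
    events = []
    for item in ET.fromstring(xml_text).iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Only names that were explicitly tagged with persname can be recovered.
            "people": [p.get("normal") for p in event.iter("persname")],
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events
```

The second event shows the limitation noted above: with no persname tags, the people list comes back empty and only the date is structured.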
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
MARC is dead (is it really?): in-depth discussion at marc-must-die.info, FutureLib wiki, Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
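A rough way to see why the RDF model is "simple and abstracted": a graph is just a set of subject-predicate-object statements, with no nesting to decide on. The sketch below uses plain Python tuples rather than an RDF library, and the `ex:` identifiers are made up for illustration.

```python
# A graph is a set of (subject, predicate, object) triples.
# The ex: identifiers are hypothetical; literals are written with quotes.
graph = {
    ("ex:painting1", "dc:title", '"Laughing Cavalier"'),
    ("ex:painting1", "dc:creator", "ex:frans_hals"),
    ("ex:frans_hals", "foaf:name", '"Frans Hals"'),
}


def describe(graph, subject):
    """All (predicate, object) pairs stated about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


def merge(*graphs):
    """Merging data from different sources is plain set union."""
    return set().union(*graphs)
```

Whether the artist would have been nested inside the painting or referenced by ID in XML makes no difference here: both collapse to the same flat triples, and data from another source merges without restructuring.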
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
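The short-cut vs long-path construct can be sketched as a tiny inference rule: wherever the two-step measurement path exists, the direct dimension link follows. This is an illustration over plain tuples, not how any particular triple store implements it; the `ex:` identifiers are hypothetical, while the three `crm:` property names are real CRM properties.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"


def infer_shortcuts(triples):
    """Add a short-cut triple wherever the two-step long path exists."""
    first, second = LONG_PATH
    inferred = set()
    for s, p, middle in triples:
        if p != first:
            continue
        for s2, p2, o in triples:
            if s2 == middle and p2 == second:
                inferred.add((s, SHORTCUT, o))
    return set(triples) | inferred


# Hypothetical data: the long path carries extra detail (who measured).
data = {
    ("ex:coin1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:weight1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
}
```

The long path keeps the measurement event, so details such as who measured survive; the short-cut only gives the convenient direct link.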
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
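For the "Image and region over it" case, an annotation in the W3C Web Annotation JSON-LD serialization might look roughly like the output of this sketch. The function name and the example URLs are assumptions; the keys and the media-fragment selector follow the Web Annotation Data Model.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                # Rectangular region expressed as a W3C media fragment.
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers, for illustration only.
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (10, 20, 300, 400),
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other cases: swap the selector for a text selector and the body for a translation or commentary.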
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
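The core of the IIIF Image API is a URL template: identifier, region, size, rotation, quality and format. A minimal sketch of building such a URL follows; the base URL and identifier are hypothetical, and the default size keyword is `max` (older Image API versions also used `full` for size).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 512x512 detail, scaled to 256 pixels wide.
url = iiif_image_url("https://example.org/iiif/", "ms 1/f1",
                     region="0,0,512,512", size="256,")
```

Because every compliant server answers the same template, a viewer can request tiles and details without knowing anything else about the image server.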
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology (after finalizing the Conceptual Model): Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
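One way to picture what "leveraging coreferencing across VIAF and Wikidata" buys: pairwise same-as links from different sources chain together into larger clusters. Below is a small union-find sketch of that idea; the identifiers are placeholders, and this is not the method used in the Europeana study.

```python
def coreference_clusters(same_as_pairs):
    """Group identifiers into clusters, treating each pair as a same-as link."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Placeholder identifiers: Wikidata links to VIAF, VIAF links to GND.
pairs = [("wikidata:A", "viaf:1"), ("viaf:1", "gnd:X"), ("wikidata:B", "ulan:9")]
```

Here `wikidata:A` ends up in the same cluster as `gnd:X` even though nobody asserted that link directly, which is exactly the gain from combining the two coreferencing traditions.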
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
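Jittering can be as simple as moving each marker a random distance in a random direction around the place's coordinates. The sketch below is my own illustration, not the EFD code: the 50 km radius, the 111 km-per-degree approximation and the rough centre of Bulgaria (42.7, 25.5) are all assumptions.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough length of one degree of latitude


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return a point displaced randomly within radius_km of (lat, lon)."""
    # The square root spreads points evenly over the disc instead of crowding the centre.
    distance = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = distance * math.cos(bearing) / KM_PER_DEGREE
    # A degree of longitude shrinks with latitude, so scale by cos(lat).
    dlon = distance * math.sin(bearing) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Spread 9 markers around an approximate centre of Bulgaria, reproducibly.
rng = random.Random(0)
markers = [jitter(42.7, 25.5, rng=rng) for _ in range(9)]
```

Passing a seeded `random.Random` keeps the marker positions stable between page loads, which matters if users bookmark a map view.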
6610 GLAMS WORKING WITH WIKIDATA
GLAMs Working with Wikidata: why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk (support on twitter and google group, see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
GVP LOD: Ontologies and Semantic Representation. See V. Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
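What the inverse and symmetric declarations buy in practice: for every stated relation, the reverse statement can be filled in automatically. A minimal sketch over plain tuples follows; the property names are hypothetical stand-ins, not actual GVP associative relations.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Add the reverse statement for every inverse-pair or symmetric property."""
    inverse = dict(inverse_of)
    inverse.update({q: p for p, q in inverse_of.items()})  # inverseOf works both ways
    inverse.update({p: p for p in symmetric})              # symmetric props are self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical relation names and identifiers, for illustration only.
stated = {
    ("ex:teacher1", "ex:teacher_of", "ex:student1"),
    ("ex:place1", "ex:related_to", "ex:place2"),
}
completed = materialize_inverses(
    stated,
    inverse_of={"ex:teacher_of": "ex:student_of"},
    symmetric=["ex:related_to"],
)
```

Only one direction needs to be entered by editors; queries can then follow the relation from either end.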
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05, additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
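As a sketch of the first step (creating Person records for the people mentioned), the record above can be expanded into one stub person per related name plus the links between them. The field-to-relation mapping and the output layout are my assumptions; the matching step against other records is not shown.

```python
# Assumed mapping from record fields to relation types.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def expand_persons(record):
    """Create a stub Person per mentioned name, plus (person, relation, person) links."""
    main_name = f'{record["firstName"]} {record["lastName"]}'
    persons = [{"name": main_name, "source": record["id"], "role": "self"}]
    for field_name, relation in RELATION_FIELDS.items():
        if field_name in record:
            persons.append({"name": record[field_name],
                            "source": record["id"],
                            "role": relation})
    links = [(main_name, p["role"], p["name"]) for p in persons[1:]]
    return persons, links


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons, links = expand_persons(record)
```

Each stub keeps a pointer to its source record, so a later match to a fuller Person record can be traced back; likely inferences (e.g. the spouse's maiden name) would be added as further fields on the stub.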
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
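Both the nearest-word lists and the "semantic differencing" trick come down to cosine comparisons between word vectors. The sketch below shows the arithmetic on a made-up 3-dimensional toy vocabulary; real word2vec vectors have hundreds of dimensions, and none of these numbers come from the EHRI experiments.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, minus, plus):
    """Word whose vector is closest to vectors[a] - vectors[minus] + vectors[plus]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[minus], vectors[plus])]
    candidates = (w for w in vectors if w not in (a, minus, plus))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors invented for illustration.
TOY = {
    "king":  [0.9, 0.8, 0.1],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.9, 0.1, 0.8],
    "apple": [0.0, 0.5, 0.5],
}
nearest = analogy(TOY, "king", "man", "woman")
```

The table's values read as similarities (higher means closer), which is what `cosine` returns; the analogy simply ranks the remaining words by that score against the difference vector.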
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
14 GOOGLE TRENDS SEARCH POPULARITYWeb searches over the last 12 years Facebook Google are much more popular thanlibrary museum
15 HOW TO SURVIVE IN THE INTERNET AGESince ancient times GLAMs have been the centers of knowledge and wisdom
Arenrsquot Google Wikipedia Facebook Twitter and smart-phone apps becoming thenew centers of research and culture (or at least popular culture)Will GLAMs fall victims to teenagers with smartphones browsing Facebook If thelibrarys attitude is Come search in our OPAC then certainly yesHow to preserve the role of GLAMs into the new millennium
To survive GLAMs must adopt the internet as their default modus operandi
Web 10 presentationWeb 20 interactionWeb 30 (semantic web) data linking enrichingdisambiguating text using NLPIEapproaches
16 WHY LINKED OPEN DATA (LOD) IS IMPORTANTCulture is naturally cross-institutional cross-border multilingual and interlinkedLOD allows making connections between (and making sense of) the multitude ofdigitized cultural artifacts available on the netLOD enables large-scale Digital Humanities research collaboration and aggregationtechnological renewal of CH institutions
2 GLAM CONTENT STANDARDSGLAM data is complex and varied
Exception is the ruleMany metadata format variationsData comes from a variety of systems
Thus professional organizations have found it useful to de ne content standards
Describe what data to capture (and sometimes how to go about it)Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
21 MUSEUM CONTENT STANDARDS content standard for art architecture museumsCataloging Cultural Objects
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
De nes procedures for museums to follow and the attendant dataCovers 21 procedures Pre-entry Object entry Loans in Acquisition Inventorycontrol Location and movement control Transport Cataloguing Object conditionchecking and technical assessment Conservation and collections care Riskmanagement Insurance and indemnity management Valuation control Audit Rightsmanagement Use of collections Object exit Loans out Loss and damageDeaccession and disposal Retrospective documentationAddresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDSISAD(G) archival materialsISAAR(CPF) agents (corporations people families)ISDF functions (eg Secretary of some society)ISDIAH archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to
Data sharingGlobal availability of resourcesSharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)
Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
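The four WEMI levels nest naturally, so a small data-structure sketch may help when mapping catalogue data to them. This is only an illustration: the class and field names (`barcode`, `publisher`, `all_items`) and the sample values are assumptions, not part of FRBR itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """A single physical copy, e.g. the one a library lends out."""
    barcode: str  # hypothetical local identifier


@dataclass
class Manifestation:
    """A publisher's product; ISBNs are attached at this level."""
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of a work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The original or derived intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Walk Work -> Expression -> Manifestation -> Item."""
        return [item
                for expr in self.expressions
                for man in expr.manifestations
                for item in man.items]


# Invented sample data: one work, two expressions, three physical copies.
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Press", None, [Item("copy-1")])]),
    Expression("en", [Manifestation("Example Press", None,
                                    [Item("copy-2"), Item("copy-3")])]),
])
print([item.barcode for item in quixote.all_items()])
```

A loan system would operate on `Item`, while a search interface would typically group results at the `Work` level.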
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library) apply to your work?
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)
Tools
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
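To see what can and cannot be recovered from such markup, here is a minimal sketch that reads the bioghist example with Python's standard `xml.etree.ElementTree`. The function name `extract_events` and the output dictionary keys are assumptions for illustration; real EAD files usually carry a namespace and far more variation.

```python
import xml.etree.ElementTree as ET

# The EADiva example from above, without namespace for simplicity.
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
            # Persons are only strings: there is no identifier to link to.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = extract_events(BIOGHIST)
for e in events:
    print(e["date"], e["persons"], e["text"])
```

The date comes out structured, but the persons are still just normalized name strings, which is exactly the "strings, not things" problem noted above.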
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
Presentation: MARC is dead (is it really?) by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
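The "no nesting bias" point can be made concrete with a toy lifting function. This is a sketch under stated assumptions: triples are plain Python tuples rather than a real RDF library, the `@id` key convention is borrowed loosely from JSON-LD, and the `ex:` identifiers and sample values are invented.

```python
from itertools import count


def lift(record, subject, ids=None):
    """Flatten a (possibly nested) dict into (subject, predicate, object) triples."""
    ids = ids if ids is not None else count(1)
    triples = []
    for prop, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                # A nested node: use its own id, or generate a blank-node-like one.
                node = v.get("@id") or f"_:b{next(ids)}"
                triples.append((subject, prop, node))
                nested = {k: x for k, x in v.items() if k != "@id"}
                triples.extend(lift(nested, node, ids))
            else:
                triples.append((subject, prop, v))
    return triples


# Variant 1: the creator is nested inside the object record.
nested = {
    "dc:title": "Example painting",
    "dc:creator": {"@id": "ex:artist1", "foaf:name": "Example Artist"},
}

# Variant 2: the creator is a separate record, referenced by id.
painting = {"dc:title": "Example painting", "dc:creator": {"@id": "ex:artist1"}}
artist = {"foaf:name": "Example Artist"}

graph_nested = set(lift(nested, "ex:painting1"))
graph_by_ref = set(lift(painting, "ex:painting1") + lift(artist, "ex:artist1"))
print(graph_nested == graph_by_ref)
```

Both input shapes lift to the same set of triples, which is why the XML question "nested or referenced?" disappears at the RDF level.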
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
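The short-cut vs long-path construct can be sketched as a simple rewrite over triples. The three property names come from the slide; the intermediate class `crm:E16_Measurement` is taken from the CRM specification, while the function name, the generated node ids and the `ex:` sample data are assumptions for illustration only.

```python
from itertools import count

SHORTCUT = "crm:P43_has_dimension"


def expand_shortcuts(triples):
    """Rewrite each P43 short-cut into the long path through a Measurement node."""
    ids = count(1)
    expanded = []
    for s, p, o in triples:
        if p == SHORTCUT:
            # Hypothetical generated node id for the measurement activity.
            measurement = f"_:measurement{next(ids)}"
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


# Invented sample data.
short = [
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_shortcuts(short)
for triple in long_path:
    print(triple)
```

The extra Measurement node is where the additional details go, for example who measured the object and when.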
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See eg Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA): RDAregistry.info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on a Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919 rdf:type bibo:Document, fabio:Manifestation
  dc:title "About time"
  dtitleURLized "about_time"
  fabio:hasSubtitle "Einstein's unfinished revolution"
  ctag:tagged d_keywordimaginary, d_keyworddilation, d_keywordtime, d_keywordtidsreiser, d_keywordtidsdilatasjon
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>
  dc:language lexvoeng
  dbibliofilID 931138
  dc:format <http://data.deichman.no/format/Book>
  dlocation_signature Dav
  dc:publisher d_orgpenguin
  bibo:numPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert, d_subjecttid_metafysikk
  fabio:isManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo <http://data.deichman.no/bindingInfo/h>
  dbsID 0181541
  dc:description "Bibliografi: s. 293-294"@no
  dpriceInfo Nkr 17000
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>
  dc:identifier 749919
  ddewey 115, 530.11
  dlocation_dewey 530.11
  bibo:isbn 9780140174618, 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Mlist for comments
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). Eg Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, eg Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
Watch the video (D.Oldman). (Not Ontotext work)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) together with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L, and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi, x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them
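Jittering can be sketched in a few lines. This is not the EFD implementation, only a minimal illustration: the 50 km radius, the approximate center coordinates and the function name `jitter` are all assumptions, and the kilometre-to-degree conversion uses the rough value of 111 km per degree of latitude.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough length of one degree of latitude


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) moved to a random point within radius_km of the input."""
    # sqrt() spreads points evenly over the disc instead of piling them at the center.
    distance = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = (distance / KM_PER_DEGREE) * math.cos(bearing)
    dlon = (distance / (KM_PER_DEGREE * math.cos(math.radians(lat)))) * math.sin(bearing)
    return lat + dlat, lon + dlon


# Approximate center of Bulgaria (assumed values for the example).
CENTER_LAT, CENTER_LON = 42.7, 25.5

rng = random.Random(0)  # seeded, so the map looks the same on every run
points = [jitter(CENTER_LAT, CENTER_LON, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(round(lat, 4), round(lon, 4))
```

Seeding the generator (or deriving the seed from the object id) keeps each flag in the same place between page loads.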
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V, Lindenthal J, and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basis vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
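What inverse and symmetric declarations buy you can be shown with a tiny materialization step. The property names below (`ex:teacherOf`, `ex:studentOf`, `ex:relatedTo`) and the sample resources are hypothetical, not actual GVP relations; in production this inference would be done by the triple store rather than in application code.

```python
# Hypothetical relation declarations, as they might be read from a spreadsheet.
INVERSE_PAIRS = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:relatedTo"}


def materialize(triples):
    """Add the triple implied by each owl:inverseOf pair or symmetric property."""
    inverse = dict(INVERSE_PAIRS)
    inverse.update({v: k for k, v in INVERSE_PAIRS.items()})  # pairs work both ways
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
        elif p in SYMMETRIC:
            result.add((o, p, s))
    return result


asserted = [
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:artistA", "ex:relatedTo", "ex:artistC"),
]
inferred = materialize(asserted)
for triple in sorted(inferred):
    print(triple)
```

Only one direction needs to be stored in the source database; the other direction is derived.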
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V, Cobb J, Garcia G, Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb 2014). Eg
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
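The idea of deriving Person records from one source record could look roughly like this. It is a sketch under assumptions, not the EHRI code: the output keys (`relation`, `maidenName`, `source`), the function name and the two inference rules are invented for illustration, and real inferences would need to be marked as uncertain.

```python
# The sample record from above, as a plain dict.
record = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Source field -> relation to the main person (assumed mapping).
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec):
    """Create a Person dict for the main person and for each person mentioned."""
    main = {
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
        "source": rec["id"],
    }
    persons = [main]
    for field_name, relation in RELATION_FIELDS.items():
        if field_name not in rec:
            continue
        person = {"name": rec[field_name], "relation": relation, "source": rec["id"]}
        if relation == "spouse":
            # Likely inferences (assumptions, to be flagged as such downstream):
            # the maiden name belongs to the spouse, married on the same date.
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            if "dateMarriage" in rec:
                person["dateMarriage"] = rec["dateMarriage"]
        persons.append(person)
    return persons


persons = derive_persons(record)
for p in persons:
    print(p)
```

Each derived record keeps the `source` id, so a later match against other Person records can always be traced back to the original evidence.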
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
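Both the neighbor scores and the "semantic differencing" rest on cosine similarity between word vectors. The sketch below uses invented 4-dimensional toy vectors with neutral labels (`org_a`, `leader_a`, ...) instead of a trained word2vec model, so only the arithmetic is illustrated, not the actual EHRI results.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (invented): [is_organization, is_leader, country_a, country_b]
toy = {
    "org_a":    [1.0, 0.0, 1.0, 0.0],
    "leader_a": [0.0, 1.0, 1.0, 0.0],
    "leader_b": [0.0, 1.0, 0.0, 1.0],
    "org_b":    [1.0, 0.0, 0.0, 1.0],
    "other":    [0.0, 0.0, 1.0, 1.0],
}

# org_a - leader_a + leader_b: same pattern as the example above.
print(analogy(toy, "org_a", "leader_a", "leader_b"))
```

With real embeddings the vectors have hundreds of dimensions and the nearest neighbor is found over the whole vocabulary, but the computation is the same.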
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
Thus professional organizations have found it useful to de ne content standards
Describe what data to capture (and sometimes how to go about it)Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
21 MUSEUM CONTENT STANDARDS content standard for art architecture museumsCataloging Cultural Objects
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
De nes procedures for museums to follow and the attendant dataCovers 21 procedures Pre-entry Object entry Loans in Acquisition Inventorycontrol Location and movement control Transport Cataloguing Object conditionchecking and technical assessment Conservation and collections care Riskmanagement Insurance and indemnity management Valuation control Audit Rightsmanagement Use of collections Object exit Loans out Loss and damageDeaccession and disposal Retrospective documentationAddresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDSISAD(G) archival materialsISAAR(CPF) agents (corporations people families)ISDF functions (eg Secretary of some society)ISDIAH archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to
Data sharingGlobal availability of resourcesSharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)
232 FRBR
Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)
Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items
233 FRSAD
Anything can be subject (thema) referred to by various namestitles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards
3 GLAM METADATA SCHEMASHow many of the standards listed in
apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)
Seeing Standards A Visualization of the MetadataUniverse
31 SEEING STANDARDS (2)
32 XML SCHEMASDo you deal with XML I bet you do
XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)
Tools
patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)
RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)
httpsgithubcomEHRIjing-trangtreeEHRI-176
httpsgithubcomVladimirAlexievrnc
33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements are inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable, and instead put person info in EAD's bioghist element (example below, from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree is modeled as an Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
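When the date and person names are tagged as in this example, the events can be pulled out mechanically. The sketch below assumes an un-namespaced snippet like the one above (real EAD files are often namespaced, which would require qualified tag names); the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def extract_events(xml_text):
    """Return (normalized date, [normalized person names], event text) per chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        # Only tagged names are recoverable; untagged names stay buried in the text.
        persons = [p.get("normal") for p in event.iter("persname")]
        text = " ".join("".join(event.itertext()).split())
        events.append((date, persons, text))
    return events


for date, persons, text in extract_events(EAD_SNIPPET):
    print(date, persons, text)
```

The second event shows the problem named above: without tagging, "Departmental reorganization" yields a date and a string, but no agents to link to.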
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture, based on a slogan by Roy Tennant (2002):
marc-must-die.info
FutureLib: in-depth discussion wiki
Facebook group
MARC is dead (is it really?): Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
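The "no nesting bias" point can be illustrated with a toy sketch: a record that nests its creator and a pair of records that reference each other by ID both flatten to the same set of subject-predicate-object triples. The identifiers (ex:painting1, ex:hals) and keys are invented for the example, and the sketch ignores the IRI/literal distinction that real RDF makes.

```python
def triples_from_nested(record):
    """Flatten a nested (XML-like) record into subject-predicate-object triples."""
    triples = set()
    subject = record["id"]
    for key, value in record.items():
        if key == "id":
            continue
        if isinstance(value, dict):
            # Nested sub-record: link to it, then flatten it too.
            triples.add((subject, key, value["id"]))
            triples |= triples_from_nested(value)
        else:
            triples.add((subject, key, value))
    return triples


def triples_from_flat(records):
    """Same, for records that point to each other by ID instead of nesting."""
    return {
        (record["id"], key, value)
        for record in records
        for key, value in record.items()
        if key != "id"
    }


nested = {
    "id": "ex:painting1",
    "title": "The Laughing Cavalier",
    "creator": {"id": "ex:hals", "name": "Frans Hals"},
}
flat = [
    {"id": "ex:painting1", "title": "The Laughing Cavalier", "creator": "ex:hals"},
    {"id": "ex:hals", "name": "Frans Hals"},
]
print(triples_from_nested(nested) == triples_from_flat(flat))
```

Two XML schemas could disagree on which of the two shapes is "right"; at the triple level the question does not arise.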
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
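The short-cut vs long-path construct can be sketched over plain triples: whenever a thing was measured by a measurement that observed a dimension, the short-cut triple follows. The property names are the CRM ones quoted above; the ex: identifiers and the time-span triple are invented for the example, and this is an illustration rather than how any particular reasoner implements it.

```python
def infer_shortcuts(triples):
    """Derive P43_has_dimension from the P39i_was_measured_by / P40_observed_dimension path."""
    measured_by = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = [(s, o) for s, p, o in triples if p == "crm:P40_observed_dimension"]
    return {
        (thing, "crm:P43_has_dimension", dimension)
        for thing, measurement in measured_by
        for other_measurement, dimension in observed
        if other_measurement == measurement
    }


long_path = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    # Extra detail that only the long path can carry: when the measuring happened.
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan2015"),
}
print(infer_shortcuts(long_path))
```

The short-cut keeps what most consumers need (the painting has this height), while the intermediate Measurement node is where who measured, when, and how can be recorded.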
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
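As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD dictionary with a text body and a rectangular media-fragment selector on the target. The annotation id, image URL and comment are placeholders; the structure follows the W3C Web Annotation Data Model as I understand it and is not copied from the Complete Example in the spec.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation attaching a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Placeholder identifiers, for illustration only.
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting1.jpg",
    (10, 20, 300, 150),
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other cases listed above: only the body type and the selector change.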
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
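What makes deep zoom work is a predictable URL pattern: the IIIF Image API addresses a region of an image at a given size as {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of a URL builder follows; the server base URL and image identifier are placeholders, and the default size "max" assumes a server on Image API 2.1 or later (older servers expect "full").

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL; the identifier is percent-encoded."""
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """URL of the image information document (sizes, tiles, supported features)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and identifier: a 512x512 tile from the top-left corner, scaled to 256 px wide.
print(iiif_image_url("https://example.org/iiif", "book1/page 1",
                     region="0,0,512,512", size="256,"))
print(iiif_info_url("https://example.org/iiif", "book1/page 1"))
```

A viewer reads info.json once, then requests only the tiles it needs for the current zoom level and viewport.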
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After, by K. Coyle (ALA 2016), is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes Have Been Made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, with RDF in the core and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation ;
    dc:title "About time" ;
    d:titleURLized "about_time" ;
    fabio:hasSubtitle "Einstein's unfinished revolution" ;
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
    dc:language lexvo:eng ;
    d:bibliofilID "931138" ;
    dc:format <http://data.deichman.no/format/Book> ;
    d:location_signature "Dav" ;
    dc:publisher d_org:penguin ;
    bibo:numPages "316" ;
    d:physicalDescription "fig." ;
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
    fabio:isManifestationOf d_work:x24918900_about_time ;
    d:signatureNote "07x0619gq" ;
    d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
    d:bsID "0181541" ;
    dc:description "Bibliografi: s. 293-294"@no ;
    d:priceInfo "Nkr 170.00" ;
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
    dc:identifier "749919" ;
    d:dewey "115", "530.11" ;
    d:location_dewey "530.11" ;
    bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 10 (Sep 2016), Mlist for comments: documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné), e.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M, but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but it is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
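A minimal sketch of one way to jitter map markers: move each point by a random offset, uniformly distributed over a small disc around the original coordinates. This is not necessarily how the EFD app does it; the 50 km radius, the approximate centre of Bulgaria (42.7, 25.5) and the flat-earth conversion of kilometres to degrees are all assumptions for illustration.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # rough length of one degree of latitude


def jitter(lat, lon, max_km=50.0, rng=random):
    """Move a point by a random offset of at most max_km, uniformly over a disc."""
    # sqrt() keeps the density uniform over the disc instead of crowding the centre.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = distance_km * math.cos(bearing) / KM_PER_DEGREE_LAT
    dlon = distance_km * math.sin(bearing) / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Seeded generator so the example is repeatable.
rng = random.Random(42)
flags = [jitter(42.7, 25.5, max_km=50.0, rng=rng) for _ in range(9000)]
print(flags[:3])
```

A fixed seed (or a hash of the object id used as the seed) keeps each flag in the same place between page loads, which matters if users are expected to find an object again on the map.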
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
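The idea of generating property declarations from a spreadsheet can be sketched in a few lines: each row names a relation and its inverse, and a row whose inverse is itself becomes a symmetric property. The CSV layout and the ex: property names are invented for the example; they are not the actual GVP spreadsheet or relation codes.

```python
import csv
import io

# Hypothetical sheet exported as CSV: one row per associative relation.
SHEET = """relation,inverse
ex:distinguished_from,ex:distinguished_from
ex:locus_setting_for,ex:activity_performed_in
"""


def relations_to_turtle(csv_text):
    """Emit Turtle declarations: inverseOf pairs, or SymmetricProperty when self-inverse."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        relation, inverse = row["relation"], row["inverse"]
        if relation == inverse:
            lines.append(f"{relation} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{relation} a owl:ObjectProperty ; owl:inverseOf {inverse} .")
            lines.append(f"{inverse} a owl:ObjectProperty ; owl:inverseOf {relation} .")
    return "\n".join(lines)


print(relations_to_turtle(SHEET))
```

Keeping the table as the source of truth means vocabulary editors can review the relation list without reading OWL, and the ontology can be regenerated whenever the table changes.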
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database
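A minimal sketch of the first two steps (creating Person records and making likely inferences) for the sample record above. The output structure, the relation names (spouseOf, parentOf, siblingOf) and the gender inference rule are assumptions made for illustration, not the EHRI implementation; inferred values are stored under a separate key so they are not mistaken for recorded facts.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation from the main person.
ROLES = {"nameSpouse": "spouseOf", "nameChild": "parentOf", "nameSibling": "siblingOf"}


def derive_persons(rec):
    """Create stub Person records for everyone mentioned, plus relations between them."""
    main_id = f"{rec['id']}/main"
    persons = [{"id": main_id, "name": f"{rec['firstName']} {rec['lastName']}",
                "gender": rec.get("gender")}]
    relations = []
    for field_name, relation in ROLES.items():
        if field_name not in rec:
            continue
        person = {"id": f"{rec['id']}/{field_name}", "name": rec[field_name]}
        if field_name == "nameSpouse":
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("gender") == "Male":
                # Likely for a 1921 marriage record, but an inference, not a recorded fact.
                person["inferred"] = {"gender": "Female"}
        persons.append(person)
        relations.append((main_id, relation, person["id"]))
    return persons, relations


persons, relations = derive_persons(RECORD)
for person in persons:
    print(person)
print(relations)
```

The stub for "Maria Smith" with maiden name "Matienzo" can then be compared against other Person records, where she may appear as "Maria Matienzo" in her own right.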
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
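Both the nearest-word lists and the "semantic differencing" rest on the same arithmetic: cosine similarity between word vectors, and ranking words by similarity to vec(a) - vec(b) + vec(c). The sketch below uses tiny hand-made 3-dimensional vectors on neutral words purely to show the mechanics; it is not the model trained on the interviews, and real word2vec vectors have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(vectors, query, exclude=()):
    """Words ranked from most to least similar to the query vector."""
    scored = [(cosine(vec, query), word)
              for word, vec in vectors.items() if word not in exclude]
    return [word for _, word in sorted(scored, reverse=True)]


def analogy(vectors, a, b, c):
    """Rank words by similarity to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(vectors, query, exclude={a, b, c})


# Toy vectors; dimensions loosely mean (France, Italy, capital city).
TOY = {
    "paris": [1.0, 0.0, 1.0],
    "france": [1.0, 0.0, 0.0],
    "rome": [0.0, 1.0, 1.0],
    "italy": [0.0, 1.0, 0.0],
    "lyon": [1.0, 0.0, 0.2],
}
print(analogy(TOY, "paris", "france", "italy"))
```

In the interview corpus the same operation, applied to vectors learned from the transcripts, is what produces the result quoted above.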
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames, so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
16 WHY LINKED OPEN DATA (LOD) IS IMPORTANT
Culture is naturally cross-institutional, cross-border, multilingual, and interlinked
LOD allows making connections between (and making sense of) the multitude of digitized cultural artifacts available on the net
LOD enables large-scale Digital Humanities research, collaboration and aggregation, technological renewal of CH institutions
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied:
Exception is the rule
Many metadata format variations
Data comes from a variety of systems
Thus professional organizations have found it useful to define content standards:
Describe what data to capture (and sometimes how to go about it)
Before formalizing how to express it in machine-readable form
Examples are extremely useful for data modelers to decide how to map the data
21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow, and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D. Pitti, 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translated to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability, famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
233 FRSAD
Anything can be subject (thema) referred to by various namestitles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards
3 GLAM METADATA SCHEMASHow many of the standards listed in
apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)
Seeing Standards A Visualization of the MetadataUniverse
31 SEEING STANDARDS (2)
32 XML SCHEMASDo you deal with XML I bet you do
XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)
Tools
patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)
RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)
httpsgithubcomEHRIjing-trangtreeEHRI-176
httpsgithubcomVladimirAlexievrnc
33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS

- Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
- Complication: splits info about an object
  - EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  - EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
- Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
- Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
- Europeana Data Quality Committee: formed to push this strategic point (2015-2020)
- Evolving specification (since 2009)
  - Currently considering actual implementation of Events
  - Extensions for manuscripts, music, fashion etc.
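To make the "splits info about an object" point concrete: in EDM Internal, descriptive statements hang off two proxies (provider's and Europeana's), so a consumer has to merge them to get the full picture of one object. A rough sketch over plain tuples; all `ex:` identifiers, the literal values and the `merged_view` helper are hypothetical, only `ore:proxyFor` and `ore:proxyIn` are real ORE properties.

```python
# Hypothetical identifiers and values, for illustration only
CHO = "ex:cho/1"
TRIPLES = [
    # Provider proxy: the original description
    ("ex:proxy/provider/1", "ore:proxyFor", CHO),
    ("ex:proxy/provider/1", "ore:proxyIn", "ex:aggregation/provider/1"),
    ("ex:proxy/provider/1", "dc:title", "Portrait of a man"),
    ("ex:proxy/provider/1", "dc:creator", "Frans Hals"),
    # Europeana proxy: enrichments added by the aggregator
    ("ex:proxy/europeana/1", "ore:proxyFor", CHO),
    ("ex:proxy/europeana/1", "ore:proxyIn", "ex:aggregation/europeana/1"),
    ("ex:proxy/europeana/1", "dc:creator", "ex:agent/frans_hals"),
]


def merged_view(triples, cho):
    """Collect descriptive statements from every proxy that stands for `cho`."""
    proxies = {s for s, p, o in triples if p == "ore:proxyFor" and o == cho}
    view = {}
    for s, p, o in triples:
        # Skip the ORE plumbing, keep only descriptive properties
        if s in proxies and not p.startswith("ore:"):
            view.setdefault(p, []).append(o)
    return view


print(merged_view(TRIPLES, CHO))
```

The merged view contains both the provider's literal creator and the enrichment link, which is what a search or display layer typically wants.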
42 CIDOC CRM
- CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data etc., by CIDOC (ICOM documentation committee)
- Standardized as ISO 21127:2014, still evolving
- About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
- Video Tutorial and Graphical Representation (HTML version or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
- Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
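The short-cut vs long-path construct can be sketched as a rewrite over triples: every crm:P43_has_dimension statement is expanded into an intermediate E16_Measurement node, which can then carry extra details (who measured, when). This is an illustrative sketch, not CRM tooling; the `ex:` identifiers and the node-minting scheme are assumptions.

```python
def expand_dimension_shortcut(triples):
    """Replace each P43 short-cut with the P39i / P40 long path via a new measurement node."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            counter += 1
            measurement = f"ex:measurement/{counter}"  # hypothetical minted URI
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded


# Hypothetical example data
short = [
    ("ex:painting/1", "rdf:type", "crm:E22_Man-Made_Object"),
    ("ex:painting/1", "crm:P43_has_dimension", "ex:painting/1/height"),
]
for triple in expand_dimension_shortcut(short):
    print(triple)
```

The long path costs two extra triples per dimension, but the measurement node is where provenance of the measurement can be attached.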
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary

Diagram of Complete Example from spec (using my rdfpuml)
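For the "Image and region over it" case, an annotation in the W3C Web Annotation JSON-LD serialization can be assembled as a plain dictionary. A minimal sketch: the function name, the example URLs and the comment text are assumptions; the keys, the context URL and the media-fragment selector follow the Web Annotation Data Model.

```python
import json


def make_image_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Hypothetical identifiers, for illustration only
anno = make_image_annotation(
    "http://example.org/anno/1",
    "http://example.org/image/1.jpg",
    (100, 50, 300, 200),
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```

The same body/target pattern covers the other cases (bookmark, translation, commentary); only the selector and body types change.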
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers: http://iiif.io
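The IIIF Image API addresses every image view with one URL pattern: `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. A small sketch of a URL builder; the helper name, the example server and the identifier are assumptions, and the default `size="max"` assumes Image API 2.1 or later (2.0 servers use `full`).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path segments."""
    # Special characters in the identifier (including "/") must be URI-encoded
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 1000x800 pixel region scaled to 500 pixels wide
print(iiif_image_url("https://example.org/iiif", "ms/f1",
                     region="0,0,1000,800", size="500,"))
```

Because region and size are part of the URL, a deep-zoom viewer can request only the tiles it needs, and any view can be bookmarked or cited.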
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:

- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
- BIBFRAME: sponsored by LoC but soundly criticized for modeling mistakes
- RDA (rdaregistry.info): basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
The Resource Description and Access (RDA) Registry (rdaregistry.info) is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo

TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
- FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA

    d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
      dc:title "About time" ;
      d:titleURLized "about_time" ;
      fabio:hasSubtitle "Einstein's unfinished revolution" ;
      ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
      foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
      owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
      dc:language lexvo:eng ;
      d:bibliofilID "931138" ;
      dc:format <http://data.deichman.no/format/Book> ;
      d:location_signature "Dav" ;
      dc:publisher d_org:penguin ;
      bibo:numPages "316" ;
      d:physicalDescription "fig." ;
      d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
      fabio:isManifestationOf d_work:x24918900_about_time ;
      d:signatureNote "07x0619gq" ;
      d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
      d:bsID "0181541" ;
      dc:description "Bibliografi: s. 293-294"@no ;
      d:priceInfo "Nkr 170.00" ;
      foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
      dc:identifier "749919" ;
      d:dewey "115", "530.11" ;
      d:location_dewey "530.11" ;
      bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good

- E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies

Records in Context (RiC): new upcoming semantic standard by ICA

- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015)
- Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them. Mlist for comments
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
- Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet etc.
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
- (Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used for Wikidata Project Sum of All Paintings

Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names

Conclusions:

- The best datasets to use for name enrichment are VIAF and Wikidata
- There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
- VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
- VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
- Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
- A lot can be gained by leveraging coreferencing across VIAF and Wikidata
- Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.

- Publishing Vocabularies/Thesauri as LOD
- Publishing Museum collections and National Bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts, stemmatology (manuscript derivation)
- Historiography
- Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)

Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket etc.
- ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
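Jittering can be sketched as adding a small random offset to each marker, so objects that share the same country-level coordinates spread out instead of stacking. This is not the EFD implementation, just one possible approach; the centre coordinates (approximate), the 50 km radius and the function name are assumptions.

```python
import math
import random

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def jitter(lat, lon, radius_km, rng):
    """Return a point displaced uniformly at random within radius_km of (lat, lon)."""
    # sqrt gives a uniform spread over the disc rather than clustering at the centre
    distance = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0, 2 * math.pi)
    dlat = distance * math.cos(bearing) / KM_PER_DEGREE
    # A degree of longitude shrinks with latitude
    dlon = distance * math.sin(bearing) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


rng = random.Random(42)         # seeded so the map looks the same on every run
centre = (42.7, 25.5)           # rough centre of Bulgaria (assumed)
markers = [jitter(centre[0], centre[1], 50, rng) for _ in range(5)]
for lat, lon in markers:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed (or a seed derived from the object ID) keeps each object's flag in a stable position between page loads.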
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:

- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
- Center of LODLAM cloud (Diagram by J.Cobb 2014)
- GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
- Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group etc.)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk support on twitter and google group (see home page)
- Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014

External Ontologies:

    Prefix   Ontology                               Used for
    bibo     Bibliography Ontology                  Sources
    dc       Dublin Core Elements                   common
    dct      Dublin Core Terms                      common
    foaf     Friend of a Friend ontology            Contributors
    iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
    owl      Web Ontology Language                  Basic RDF representation
    prov     Provenance Ontology                    Revision history
    rdf      Resource Description Framework         Basic RDF representation
    rdfs     RDF Schema                             Basic RDF representation
    schema   Schema.org                             common, geo (TGN), bio (ULAN)
    skos     Simple Knowledge Organization System   Basic vocabulary representation
    skosxl   SKOS Extension for Labels              Rich labels
    wgs      W3C World Geodetic Survey geo          Geo (TGN)
    xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to:

- Custom sub-class
- Custom (sub-)prop
- Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
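What owl:inverseOf and owl:SymmetricProperty buy you can be sketched in a few lines: state a relation once, and the reverse statement follows. In the real service this is done by the repository's inference; the sketch below only imitates the effect, and the `ex:` relation names are hypothetical, not actual GVP property names.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the reverse statement for every inverse-pair or symmetric relation."""
    # owl:inverseOf works in both directions
    both_ways = dict(inverse_of)
    both_ways.update({inv: prop for prop, inv in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in both_ways:
            result.add((o, both_ways[p], s))
        if p in symmetric:
            result.add((o, p, s))
    return result


# Hypothetical relations and entities, for illustration only
INVERSE_OF = {"ex:teacher_of": "ex:student_of"}
SYMMETRIC = {"ex:related_to"}
stated = {
    ("ex:agent/1", "ex:teacher_of", "ex:agent/2"),
    ("ex:concept/1", "ex:related_to", "ex:concept/2"),
}
for triple in sorted(materialize_inverses(stated, INVERSE_OF, SYMMETRIC)):
    print(triple)
```

Queries can then follow a relation from either end without the data provider having to state it twice.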
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb 2014). E.g.

AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts etc.
6712 AAT IN EUROPEANA
- Europeana uses AAT to enrich type/subject/material fields
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping

Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced the chance of survival. Idea:

    Rec 123456
      firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
      additional names:
        nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith"
        nameChild "Mike Smith", nameSibling "Jack Jones"

We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
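The idea can be sketched as follows: one flat record is exploded into several Person records plus relations between them. The field names follow the sample record; the ID scheme, the relation names and the specific inferences (e.g. guessing the spouse's gender from the subject's) are assumptions chosen for illustration, not the EHRI implementation.

```python
def persons_from_record(rec):
    """Create Person records and relations for everyone mentioned in one record."""
    rec_id = rec["id"]
    subject = {
        "id": f"{rec_id}#self",
        "name": f"{rec['firstName']} {rec['lastName']}",
        "gender": rec.get("gender"),
    }
    persons, relations = [subject], []

    def add(role, name, **extra):
        person = {"id": f"{rec_id}#{role}", "name": name, **extra}
        persons.append(person)
        relations.append((subject["id"], role, person["id"]))

    if "nameSpouse" in rec:
        extra = {}
        if "nameSpouseMaiden" in rec:
            extra["maidenName"] = rec["nameSpouseMaiden"]
        # Likely (not certain) inference for historic records
        if rec.get("gender") == "Male":
            extra["gender"] = "Female"
        add("spouse", rec["nameSpouse"], **extra)
    if "nameChild" in rec:
        add("child", rec["nameChild"])
    if "nameSibling" in rec:
        add("sibling", rec["nameSibling"])
    return persons, relations


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons, relations = persons_from_record(record)
for p in persons:
    print(p)
for r in relations:
    print(r)
```

Each derived Person record can then be fed to the same matching step as any other record, and the relations become edges of the person network.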
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for Geonames matching in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews

- ONTO: Place enrichment, Person name recognition
- INRIA: word2vec experiments

    guard      Cos dist    punishment    Cos dist
    guarding   0.593507    punishments   0.668144
    sentry     0.512083    punish        0.601212
    hlinka     0.496201    punishing     0.543213
    gate       0.490032    beatings      0.527033
    watching   0.484647    penalty       0.497262
    rifle      0.484379    deserved      0.490157
    lookout    0.482025    beaten        0.473870
    patrol     0.477233    straf         0.473338
    soldier    0.475982    offense       0.461230
    guarded    0.474689    executing     0.459965
    police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
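The two operations behind the table and the "differencing" line are cosine comparison of word vectors and vector arithmetic (a - b + c, then nearest neighbour). A toy sketch: the 4-dimensional vectors below are invented so the arithmetic is easy to follow and have nothing to do with the vectors INRIA trained on the interviews.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c], excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    scores = {w: cosine(target, vec) for w, vec in vectors.items() if w not in (a, b, c)}
    return max(scores, key=scores.get)


# Invented toy dimensions: (organization, leader, Soviet, German)
TOY = {
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 1.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 0.0, 1.0],
    "guard":  [0.5, 0.5, 0.5, 0.5],
}

print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

With real embeddings the dimensions are not interpretable like this, but the arithmetic is the same: subtracting "Stalin" removes the Soviet-leader direction and adding "Hitler" moves the query towards the German equivalent.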
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)

Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD

- Numishare: Data platform for coins/medals, 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
2 GLAM CONTENT STANDARDS
GLAM data is complex and varied

- Exception is the rule
- Many metadata format variations
- Data comes from a variety of systems

Thus professional organizations have found it useful to define content standards

- Describe what data to capture (and sometimes how to go about it)
- Before formalizing how to express it in machine-readable form

Examples are extremely useful for data modelers to decide how to map the data
21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects: content standard for art, architecture, museums
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
- Defines procedures for museums to follow and the attendant data
- Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
- Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
- ISAD(G): archival materials
- ISAAR(CPF): agents (corporations, people, families)
- ISDF: functions (e.g. Secretary of some society)
- ISDIAH: archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDS
- AACR2 (Anglo-American Cataloging Rules 2)
- International Standard Bibliographic Description (ISBD)
- Resource Description and Access (RDA)

Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:

- Data sharing
- Global availability of resources
- Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Žumer 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles)

- Work: original or derived intellectual work (e.g. Don Quixote)
- Expression: translation or edition (e.g. Don Quixote translation to English)
- Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
- Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
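The WEMI levels nest naturally, which a few dataclasses can illustrate. This is only a sketch of the four-level idea, not FRBR's full attribute set; the class fields, the sample publisher, the placeholder ISBN and the barcodes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """Physical copy: what a library lends and tracks."""
    barcode: str
    available: bool = True


@dataclass
class Manifestation:
    """Publisher's product: ISBNs live at this level."""
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of the work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual creation."""
    title: str
    expressions: List[Expression] = field(default_factory=list)


def available_items(work, language):
    """Find lendable copies of a work in a given language, across all publications."""
    return [
        item
        for expr in work.expressions if expr.language == language
        for man in expr.manifestations
        for item in man.items if item.available
    ]


# Hypothetical sample data
don_quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Publisher A", items=[Item("es-001")])]),
    Expression("en", [Manifestation("Publisher B", isbn="ISBN-PLACEHOLDER",
                                    items=[Item("en-001", available=False), Item("en-002")])]),
])
print([item.barcode for item in available_items(don_quixote, "en")])
```

A user task like "obtain an English copy" walks down all four levels: select the Work, pick the Expression by language, then check Items across every Manifestation.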
233 FRSAD
Anything can be subject (thema) referred to by various namestitles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do

- XML Schema (XSD): most widely used but most unwieldy
- RelaxNG (RNG): new generation schema language
- RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
- Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)

Tools

- https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
- https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATA
- EAD: Encoded Archival Description. Describes archival materials (documentary units)
- EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
- EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
- Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
- Emphasis on documents, not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdftype biboDocument fabioManifestation
  dctitle About time
  dtitleURLized about_time
  fabiohasSubtitle Einsteins unfinished revolution
  ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon
  foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt
  owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt
  dclanguage lexvoeng
  dbibliofilID 931138
  dcformat lthttpdatadeichmannoformatBookgt
  dlocation_signature Dav
  dcpublisher d_orgpenguin
  bibonumPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk
  fabioisManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo lthttpdatadeichmannobindingInfohgt
  dbsID 0181541
  dcdescription Bibliografi s 293‐294no
  dpriceInfo Nkr 17000
  foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt
  dcidentifier 749919
  ddewey 115 53011
  dlocation_dewey 53011
  biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015), Mlist for comments.
Conceptual Model 1.0 (Sep 2016): Document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
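A sketch of how such a hunt could be phrased for the Wikidata query service: list paintings of one collection that have no inventory number. The property and class ids are Wikidata's (P31 instance of, Q3305213 painting, P195 collection, P217 inventory number); the function name and the collection id passed in the usage line are placeholders, to be replaced with the item of the collection being checked.

```python
def missing_inventory_query(collection_qid, limit=100):
    """Return a SPARQL query listing paintings in a collection without an inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""SELECT ?painting ?paintingLabel WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} .
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
LIMIT {int(limit)}"""


# "Q12345" is only a placeholder id, not the id of an actual collection.
query = missing_inventory_query("Q12345", limit=50)
print(query)
```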
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
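The general idea of pruning a category tree can be sketched as a breadth-first walk from a root category that stops at a depth limit and skips branches on an exclusion list. This is only an illustration of the technique, not the procedure used in the project; the category names, the depth limit and the exclusion list are all made up.

```python
from collections import deque


def prune_category_tree(subcats, root, max_depth, excluded=frozenset()):
    """Return {category: depth} for categories reachable from root within max_depth."""
    kept = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        depth = kept[cat]
        if depth == max_depth:
            continue  # do not descend below the depth limit
        for child in subcats.get(cat, ()):
            if child in excluded or child in kept:
                continue  # pruned branch, or already reached by a shorter path
            kept[child] = depth + 1
            queue.append(child)
    return kept


# Hypothetical fragment of a category graph (parent -> sub-categories).
subcats = {
    "Food and drink": ["Beverages", "Cuisine", "Food companies"],
    "Beverages": ["Beer", "Wine"],
    "Beer": ["Beer festivals"],
    "Food companies": ["Restaurant chains"],
}
kept = prune_category_tree(subcats, "Food and drink", max_depth=2,
                           excluded={"Food companies"})
print(kept)
```

Recording the depth of each kept category makes it possible to inspect how relevance drops with category level before settling on a cut-off.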
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
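Jittering can be sketched as adding a small random offset to each marker so that objects sharing one coordinate spread out. The offset size, the uniform distribution and the centre coordinates below are assumptions for illustration; the slide does not say how the offsets were chosen.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the coordinate moved by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


rng = random.Random(42)          # fixed seed so the layout is reproducible
center = (42.75, 25.5)           # rough centre of Bulgaria (assumed value)
points = [jitter(*center, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed keeps markers in the same place between page loads, which avoids flags jumping around when the map is redrawn.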
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk: support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.
External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
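The gist of generating such axioms from a table can be sketched as follows: each row names a relation and its inverse, and a row whose inverse equals itself yields a symmetric property. The rows stand in for spreadsheet lines, and the ex: property names are invented for the example; this is not the actual generation code.

```python
def relation_axioms(rows):
    """Turn (property, inverse) rows into Turtle statements."""
    lines = []
    for prop, inverse in rows:
        if prop == inverse:
            # self-inverse relation: declare it symmetric instead of pairing it
            lines.append(f"{prop} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prop} a owl:ObjectProperty ; owl:inverseOf {inverse} .")
            lines.append(f"{inverse} a owl:ObjectProperty ; owl:inverseOf {prop} .")
    return lines


# Hypothetical rows, as they might be read from a spreadsheet.
rows = [
    ("ex:teacher_of", "ex:student_of"),
    ("ex:sibling_of", "ex:sibling_of"),
]
axioms = relation_axioms(rows)
print("\n".join(axioms))
```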
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now?, J.Cobb, 2014. E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after:
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”.
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
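A sketch of the first step, creating Person records from the names mentioned in one record. The field names are those of the sample record; the output structure, the relation labels and the single inference shown (attaching the maiden name to the spouse) are assumptions for illustration. Matching against other records in the database is not shown.

```python
# Which record fields mention another person, and how that person relates
# to the main person (labels are illustrative).
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec):
    """Create one Person dict for the main person and one per mentioned relative."""
    persons = [{
        "name": f"{rec['firstName']} {rec['lastName']}",
        "relation": "self",
        "gender": rec.get("gender"),
        "source": rec["id"],
    }]
    for field_name, relation in RELATION_FIELDS.items():
        if field_name in rec:
            persons.append({"name": rec[field_name], "relation": relation,
                            "source": rec["id"]})
    # Likely inference: the maiden name belongs to the spouse.
    for person in persons:
        if person["relation"] == "spouse" and "nameSpouseMaiden" in rec:
            person["maidenName"] = rec["nameSpouseMaiden"]
    return persons


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
for person in persons:
    print(person)
```

Keeping the source record id on every derived person lets a later match be traced back to the evidence it came from.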
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
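The two operations behind these results, cosine similarity between word vectors and "semantic differencing" by vector arithmetic, can be sketched with toy data. The two-dimensional vectors below are invented so that the arithmetic is easy to follow; they are not taken from the trained model.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors: first axis ~ "organisation", second axis ~ "regime" (made up).
vectors = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [0.5, 0.0],
}
result = analogy(vectors, "KGB", "Stalin", "Hitler")
print(result)
```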
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
21 MUSEUM CONTENT STANDARDS
Cataloging Cultural Objects (CCO): content standard for art, architecture, museums.
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data.
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation.
Addresses accreditation.
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011).
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
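The four WEMI levels nest naturally, which a few data classes can show. This is only an illustrative sketch: the class fields, the placeholder ISBN, the publisher and the barcodes are invented, and real FRBR implementations carry far more attributes and relations.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    barcode: str                      # one physical copy that a library tracks


@dataclass
class Manifestation:
    isbn: str                         # ISBNs identify this level
    publisher: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str                     # a translation or edition of the work
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str                        # the abstract intellectual work
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count physical copies across all expressions and manifestations of a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# All identifiers below are placeholders, not real ISBNs or barcodes.
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("ISBN-PLACEHOLDER-1", "Publisher A", [Item("copy-1")])]),
    Expression("en", [Manifestation("ISBN-PLACEHOLDER-2", "Publisher B",
                                    [Item("copy-2"), Item("copy-3")])]),
])
print(count_items(quixote))
```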
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
234 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library.)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA, museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements: inherited from CDWA.
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
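To make the "strings, not things" point concrete, the sketch below parses the bioghist example with Python's standard library and pulls out what is machine-readable: the normalized date and the tagged person names. Everything else in an event stays free text, and even the tagged names are plain strings rather than links to authority records. The output structure is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

root = ET.fromstring(BIOGHIST)
events = []
for item in root.iter("chronitem"):
    date = item.find("date")
    event = item.find("event")
    events.append({
        "date": date.get("normal"),
        # flatten mixed content to text and normalize whitespace
        "text": " ".join("".join(event.itertext()).split()),
        # normalized name strings only; there is no identifier to link to
        "persons": [p.get("normal") for p in event.iter("persname")],
    })

for e in events:
    print(e)
```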
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info
FutureLib wiki: in-depth discussion
Facebook group: MARC is dead (is it really?)
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Its data is used for listing works by painter across collections (catalogue raisonné), e.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, and the original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LET'S FIX THE SECOND ONE
Find it on Getty's site and add the info, like this.
516 HISTROPEDIA
Timelines of everything, e.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries and 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, we developed a mapping of BM data (2M objects) together with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, and NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
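To make the clustering idea concrete, here is a minimal sketch in Python. It is not the Cluster Mapper library's algorithm or API: it assumes a simple grid approach where all points falling in the same cell collapse into one marker with a count. The function name, the cell size and the sample coordinates are illustrative assumptions.

```python
import math
from collections import defaultdict

def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells and return one marker per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        # Cell key: integer index of the cell along each axis
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    markers = []
    for members in cells.values():
        n = len(members)
        markers.append({
            "lat": sum(p[0] for p in members) / n,  # centroid of the cell's members
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    # Biggest clusters first
    return sorted(markers, key=lambda m: -m["count"])

# Approximate, illustrative coordinates (Sofia, Plovdiv, Pernik)
sample = [(42.70, 23.32), (42.14, 24.75), (42.60, 23.03)]
markers = grid_clusters(sample, cell_deg=1.0)
```

A real map library would typically re-cluster at every zoom level; a smaller cell_deg plays that role here.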
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
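Jittering can be sketched as adding a small random offset to each marker. The snippet below is only an illustration under stated assumptions: the centroid coordinates, the 50 km radius and the function name are hypothetical, and the EFD application may have used a different scheme.

```python
import math
import random

def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most radius_km."""
    # sqrt() spreads points evenly over the disc instead of bunching them at the center
    distance_km = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # About 111.32 km per degree of latitude; longitude degrees shrink with cos(lat)
    dlat = distance_km * math.cos(bearing) / 111.32
    dlon = distance_km * math.sin(bearing) / (111.32 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Hypothetical centroid used for every object marked only "Bulgaria"
BULGARIA = (42.75, 25.5)
rng = random.Random(42)  # fixed seed so the map looks the same on every run
points = [jitter(BULGARIA[0], BULGARIA[1], radius_km=50.0, rng=rng) for _ in range(5)]
```

Seeding the generator keeps flags from jumping around between page loads.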
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation (V.Alexiev, CIDOC 2014).
External Ontologies:
Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
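What inverse and symmetric declarations buy you can be shown with a small sketch: given one direction of a relation, the other direction is inferred. The property and resource names below (ex:teacherOf, ex:studentOf, ex:siblingOf, ulan:A and so on) are hypothetical placeholders, not the actual GVP relation names, and a real triple store does this with OWL reasoning rather than Python.

```python
# Hypothetical relation declarations, in the spirit of owl:inverseOf and owl:SymmetricProperty
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:siblingOf"}

def add_inverses(triples):
    """Return the input triples plus the triples implied by inverse/symmetric properties."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # inverseOf works both ways
    for prop in SYMMETRIC:
        inverse[prop] = prop                                # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result

asserted = {
    ("ulan:A", "ex:teacherOf", "ulan:B"),
    ("ulan:C", "ex:siblingOf", "ulan:D"),
}
closed = add_inverses(asserted)
```

Only one direction needs to be stored or edited; the other comes for free.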
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
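The idea can be sketched in a few lines of Python. This is an illustration only, not EHRI's implementation: the field names follow the sample record above, while the output structure, the role labels and the single inference rule (spouse of a male is likely female) are assumptions made for the example.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def derive_persons(record):
    """Turn one source record into several person dicts; inferred fields are flagged."""
    persons = [{
        "role": "subject",
        "name": f'{record["firstName"]} {record["lastName"]}',
        "gender": record.get("gender"),
        "inferred": [],
    }]
    if "nameSpouse" in record:
        spouse = {"role": "spouse", "name": record["nameSpouse"], "inferred": []}
        if "nameSpouseMaiden" in record:
            spouse["maidenName"] = record["nameSpouseMaiden"]
        # A likely (not certain) inference, so it is recorded as such
        if record.get("gender") == "Male":
            spouse["gender"] = "Female"
            spouse["inferred"].append("gender")
        persons.append(spouse)
    for key, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in record:
            persons.append({"role": role, "name": record[key], "inferred": []})
    return persons

persons = derive_persons(RECORD)
```

Keeping track of which fields were inferred matters later, when these derived records are matched against other Person records.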
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
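Why matching to a gazetteer also deduplicates can be seen in a toy sketch: spelling variants that normalize to the same key land on the same gazetteer entry. This is not the EHRI pipeline, just exact matching on normalized names; the identifiers geo:1 and geo:2 are made-up placeholders, not real Geonames IDs.

```python
import unicodedata
from collections import defaultdict

def normalize(name):
    """Strip diacritics, case and extra whitespace so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())

def match_places(source_names, gazetteer):
    """Map each source name to a gazetteer id; names sharing an id are duplicates."""
    index = {normalize(label): place_id for label, place_id in gazetteer.items()}
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        place_id = index.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched[place_id].append(name)
    return dict(matched), unmatched

# Hypothetical gazetteer extract with placeholder ids
GAZETTEER = {"Kraków": "geo:1", "Terezín": "geo:2"}
matched, unmatched = match_places(["Krakow", "KRAKÓW", "Terezin ", "Atlantis"], GAZETTEER)
```

A production pipeline would add alternate names, place hierarchy and fuzzy matching; the unmatched list is what goes to manual review.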
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
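How such word lists and "semantic differencing" are computed can be sketched with plain vector arithmetic. The 4-dimensional vectors below are invented for illustration (real word2vec vectors have hundreds of learned dimensions and come from a trained model), so only the mechanics carry over, not the numbers.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(target, vectors, exclude=()):
    """Word whose vector is closest to target, skipping the words in exclude."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

# Toy vectors with made-up dimensions: (leader, agency, soviet, german)
TOY = {
    "KGB":    [0.0, 1.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 1.0, 0.0],
    "Hitler": [1.0, 0.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 0.0, 1.0],
    "guard":  [0.0, 0.5, 0.5, 0.5],
}

# KGB - Stalin + Hitler, computed component by component
target = [k - s + h for k, s, h in zip(TOY["KGB"], TOY["Stalin"], TOY["Hitler"])]
answer = nearest(target, TOY, exclude=("KGB", "Stalin", "Hitler"))
```

The input words are excluded from the candidates, as is usual for analogy queries, since they tend to be close to the result vector themselves.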
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics, e.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
211 CCO EXAMPLE ARTWORK AND CREATOR RECORD
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D.Pitti, 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Žumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability and famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
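The four WEMI levels form a simple chain, which a few Python dataclasses can make concrete. This is only a sketch: the class fields, the publisher, the ISBN and the library name are invented for illustration and are not taken from any FRBR schema or real catalogue.

```python
from dataclasses import dataclass

@dataclass
class Work:
    title: str                    # the abstract intellectual creation

@dataclass
class Expression:
    work: Work
    language: str                 # a translation or edition of the Work

@dataclass
class Manifestation:
    expression: Expression
    publisher: str
    isbn: str                     # ISBNs identify Manifestations

@dataclass
class Item:
    manifestation: Manifestation
    holder: str                   # the library holding this physical copy
    on_loan: bool = False

def work_of(item):
    """Walk up from a physical copy to the Work it ultimately embodies."""
    return item.manifestation.expression.work

quixote = Work("Don Quixote")
english = Expression(quixote, "en")
# Hypothetical publisher and ISBN, for illustration only
edition = Manifestation(english, "Example Press", "978-0-00-000000-2")
copy1 = Item(edition, "Example City Library")
copy2 = Item(edition, "Example City Library", on_loan=True)
```

Loan status lives on the Item, the ISBN on the Manifestation, and both copies resolve to the same Work, which is what lets a catalogue group all editions and translations together.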
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
234 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library) apply to your work?
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC, Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
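When person names are tagged, as in this example, events and their participants can be pulled out mechanically. A minimal sketch with Python's standard xml.etree.ElementTree follows; it assumes un-namespaced EAD 2002 markup like the EADiva example, and the output dict layout is an arbitrary choice for illustration.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def chron_events(xml_text):
    """Extract date, plain text and tagged person names from each <chronitem>."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Flatten mixed content and collapse whitespace
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = chron_events(EAD_SNIPPET)
```

The second event has no tagged names, so its persons list is empty: exactly the situation where the information stays locked in free text.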
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate the new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant (2002):
marc-must-die.info: wiki
FutureLib: Facebook group
MARC is dead (is it really?): in-depth discussion
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
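The relation between a long path and its short-cut can be sketched as a small inference over triples: wherever the two-step path exists, the one-step property follows. The CRM property names are the ones mentioned above; the ex: resources and the helper function are hypothetical, and real systems would do this with rules in the triple store.

```python
from collections import defaultdict

def infer_shortcuts(triples, first, second, shortcut):
    """For every s -first-> x -second-> o, derive the short-cut triple s -shortcut-> o."""
    second_step = defaultdict(list)
    for s, p, o in triples:
        if p == second:
            second_step[s].append(o)
    return {
        (s, shortcut, o2)
        for s, p, o in triples
        if p == first
        for o2 in second_step.get(o, [])
    }

# Hypothetical instance data: the measurement node can carry extra detail (e.g. when it was made)
data = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan1"),
}
derived = infer_shortcuts(
    data, "crm:P39i_was_measured_by", "crm:P40_observed_dimension", "crm:P43_has_dimension"
)
```

The short-cut is convenient for search, while the long path keeps the intermediate measurement event where details such as date or actor can hang.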
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry: info is well organized.
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made (K.Coyle, SWIB 2015)
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk.
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
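The jittering idea can be sketched in a few lines of Python. This is only an illustration, not the EFD implementation: the centroid coordinates, the 30 km radius and the function name `jitter` are assumptions made for the example.

```python
import math
import random

def jitter(lat, lon, max_km=30.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most max_km."""
    # Random direction and distance; sqrt spreads points evenly over the disc.
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist_km = max_km * math.sqrt(rng.random())
    # Roughly 111 km per degree of latitude; longitude degrees shrink with cos(lat).
    dlat = (dist_km * math.sin(angle)) / 111.0
    dlon = (dist_km * math.cos(angle)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Hypothetical centroid shared by every object tagged only "Bulgaria".
BULGARIA = (42.75, 25.5)
rng = random.Random(42)  # fixed seed, so markers do not move between renders
points = [jitter(*BULGARIA, max_km=30.0, rng=rng) for _ in range(5)]
```

A fixed seed (or a seed derived from the object id) keeps each flag in the same place every time the map is drawn.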
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
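What the inverse/symmetric declarations buy you can be shown with a minimal Python sketch. The relation names (`ex:teacher_of`, `ex:student_of`, `ex:distinguished_from`) and the table layout are hypothetical stand-ins for rows of such a spreadsheet, not actual GVP properties.

```python
# Hypothetical relation table: (property, inverse property).
# A property listed as its own inverse is symmetric.
INVERSES = [
    ("ex:distinguished_from", "ex:distinguished_from"),  # symmetric, self-inverse
    ("ex:teacher_of", "ex:student_of"),                  # inverse pair
]

def inverse_map(pairs):
    """Build a lookup that works in both directions of each pair."""
    inv = {}
    for p, q in pairs:
        inv[p] = q
        inv[q] = p
    return inv

def add_inverses(triples, pairs):
    """Return the triples plus the inverse statement of each one that has an inverse."""
    inv = inverse_map(pairs)
    out = set(triples)
    for s, p, o in triples:
        if p in inv:
            out.add((o, inv[p], s))
    return out

asserted = {
    ("ex:A", "ex:teacher_of", "ex:B"),
    ("ex:X", "ex:distinguished_from", "ex:Y"),
}
inferred = add_inverses(asserted, INVERSES)
```

In a real repository this inference would normally be left to the reasoner (via owl:inverseOf and owl:SymmetricProperty); the sketch only shows the effect.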
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456
firstName "John"
lastName "Smith"
gender Male
dateMarriage 1921-01-05
additional names:
nameSpouseMaiden "Matienzo"
nameSpouse "Maria Smith"
nameChild "Mike Smith"
nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
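A rough Python sketch of that idea, using the sample record above. The `Person` class, the relation names (`spouseOf`, `parentOf`, ...) and the crude `match_key` are assumptions for illustration; they are not the EHRI data model or matching pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    # (relation, name of the other person) pairs, e.g. ("childOf", "John Smith")
    relations: List[Tuple[str, str]] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

def derive_persons(rec):
    """Turn one source record into Person records for everyone it mentions."""
    main_name = f"{rec['firstName']} {rec['lastName']}"
    main = Person(main_name, rec.get("gender"))
    people = [main]
    # Source field -> (relation from the main person, inverse relation)
    kin_fields = {
        "nameSpouse": ("spouseOf", "spouseOf"),
        "nameChild": ("parentOf", "childOf"),
        "nameSibling": ("siblingOf", "siblingOf"),
    }
    for fld, (rel, inv) in kin_fields.items():
        if fld in rec:
            other = Person(rec[fld], relations=[(inv, main_name)])
            main.relations.append((rel, other.name))
            people.append(other)
    # Likely (not certain) inferences are kept as notes on the spouse, not as facts.
    for p in people:
        if ("spouseOf", main_name) in p.relations:
            if "nameSpouseMaiden" in rec:
                p.notes.append(f"maiden name {rec['nameSpouseMaiden']}")
            if "dateMarriage" in rec:
                p.notes.append(f"married {rec['dateMarriage']}")
    return people

def match_key(name):
    """Crude blocking key for matching: lower-cased name tokens, order-insensitive."""
    return tuple(sorted(name.lower().split()))

rec = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = derive_persons(rec)
```

Records sharing a `match_key` would only be candidates; real matching needs dates, places and the relations themselves as further evidence.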
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
semantic differencing (interesting): KGB - Stalin + Hitler = SS
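The two operations behind these results, nearest neighbours by cosine and vector arithmetic, can be sketched in plain Python. The three-dimensional vectors below are made up for the example (they are not taken from the interview corpus or from any trained model), and the function names are mine.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(target, vectors, exclude=()):
    """Word whose vector has the highest cosine with target."""
    candidates = (w for w in vectors if w not in exclude)
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

def analogy(a, b, c, vectors):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})

# Toy vectors with hand-picked axes: (organization, soviet, german).
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.1, 0.1],
}
result = analogy("KGB", "Stalin", "Hitler", TOY)
```

Note that the "Cos dist" columns above appear to rank by similarity (higher means closer), which is what `cosine` returns here.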
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
212 CCO EXAMPLE HIERARCHICAL LINK BETWEEN 2 ARTWORKS
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation
Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D. Pitti, 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But they sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Zumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
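The four WEMI levels nest naturally, which a small Python sketch makes concrete. The class and field names are my own simplification (FRBR defines many more attributes), and the publisher and holder values are placeholders, not real bibliographic data.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:                      # one physical copy
    holder: str
    note: str = ""

@dataclass
class Manifestation:             # one published embodiment; ISBNs live here
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:                # a translation or edition
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:                      # the abstract intellectual work
    title: str
    expressions: List[Expression] = field(default_factory=list)

def items_of(work):
    """Walk Work -> Expression -> Manifestation -> Item."""
    for expr in work.expressions:
        for manif in expr.manifestations:
            yield from manif.items

# Placeholder data: publisher and holders are hypothetical.
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Press", items=[Item("Library A")])]),
    Expression("en", [Manifestation("Example Press", items=[Item("Library A"),
                                                             Item("Library B", "signed copy")])]),
])
copies = list(items_of(quixote))
```

A library that tracks loans works at the Item level, while "find all translations" is a question about Expressions of one Work.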
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer. Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA, museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema, e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements, inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
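When dates and person names are tagged like this, pulling out structured events is straightforward. A minimal sketch with Python's standard `xml.etree.ElementTree`, assuming a namespace-free fragment like the EADiva example (the function name `chron_events` is mine):

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def chron_events(xml_text):
    """Extract normalized date, flat text and tagged persons from each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # itertext() flattens mixed content; split/join normalizes whitespace
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = chron_events(BIOGHIST)
```

The second event has no tagged persons, so nothing can be linked from it without NLP; that is the "names could be tagged but often are not" problem.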
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-die.info, FutureLib, Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments are added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
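The short-cut vs long-path construct can be illustrated with plain triples in Python. The `ex:` resources are hypothetical; the three `crm:` properties are the ones named above, plus crm:P4_has_time-span as an example of the extra detail the long path can carry.

```python
P39I = "crm:P39i_was_measured_by"     # thing -> measurement activity
P40 = "crm:P40_observed_dimension"    # measurement activity -> dimension
P43 = "crm:P43_has_dimension"         # thing -> dimension (the short-cut)

# Long path: the measurement is a node of its own, so it can carry details.
long_path = {
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan2016"),
}

def derive_shortcuts(triples):
    """Infer thing --P43--> dimension wherever thing --P39i--> m --P40--> dimension."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    return {
        (thing, P43, dim)
        for thing, p, measurement in triples if p == P39I
        for dim in observed.get(measurement, [])
    }

shortcuts = derive_shortcuts(long_path)
```

The short-cut can always be derived from the long path, but not the other way round: the measurement date is lost once only P43 is kept.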
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdftype biboDocument fabioManifestation
  dctitle About time
  dtitleURLized about_time
  fabiohasSubtitle Einsteins unfinished revolution
  ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon
  foafdepiction <httpcoversopenlibraryorgbid96714‐Mjpg> <httpcoversopenlibraryorgbid96715‐Mjpg> <httpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081>
  owlsameAs <httppurlorgNETbookisbn0140174613book> <httpwww4wiwissfu‐berlindebookmashupbooks0140174613>
  dclanguage lexvoeng
  dbibliofilID 931138
  dcformat <httpdatadeichmannoformatBook>
  dlocation_signature Dav
  dcpublisher d_orgpenguin
  bibonumPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk
  fabioisManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo <httpdatadeichmannobindingInfoh>
  dbsID 0181541
  dcdescription Bibliografi s 293‐294no
  dpriceInfo Nkr 17000
  foafisPrimaryTopicOf <httpwwwgoodreadscombookshow286461> <httpwwwlibrarythingcomwork23493>
  dcidentifier 749919
  ddewey 115 53011
  dlocation_dewey 53011
  biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016; Mlist for comments): documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné), e.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
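One way to see what "leveraging coreferencing across VIAF and Wikidata" means in practice: every coreference link is an edge, and the connected components are the merged authority clusters. Below is a minimal union-find sketch in Python; all identifiers are hypothetical placeholders, not real VIAF, Wikidata, GND, RKD or ULAN ids.

```python
def clusters(pairs):
    """Group identifiers into coreference clusters from pairwise same-as links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical coreference links gathered from different sources.
links = [
    ("wd:QA", "viaf:V1"),   # Wikidata item linked to a VIAF cluster
    ("wd:QA", "rkd:R1"),    # the same item linked to a non-VIAF authority
    ("viaf:V1", "gnd:G1"),  # VIAF cluster including a national library record
    ("wd:QB", "ulan:U2"),
]
merged = clusters(links)
```

Here the RKD id ends up in the same cluster as the VIAF and GND ids only because Wikidata links to both, which is the sense in which Wikidata coreferencing can enlarge VIAF.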
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
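What owl:inverseOf and owl:SymmetricProperty declarations buy can be sketched as a tiny inference step. The property and resource names below (`ex:produces`, `ex:producedBy`, `ex:relatedTo`, and so on) are hypothetical placeholders, not actual GVP relations, and a real repository would derive these triples with its own reasoner.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus the ones implied by inverse and symmetric declarations.

    inverse_of: dict mapping a property to its owl:inverseOf partner (one direction is enough).
    symmetric:  set of properties declared owl:SymmetricProperty.
    """
    inv = dict(inverse_of)
    inv.update({v: k for k, v in inverse_of.items()})  # make the lookup bidirectional
    out = set(triples)
    for s, p, o in triples:
        if p in inv:
            out.add((o, inv[p], s))
        if p in symmetric:
            out.add((o, p, s))
    return out

# Hypothetical sample data
asserted = {
    ("ex:workshop1", "ex:produces", "ex:vase1"),
    ("ex:conceptA", "ex:relatedTo", "ex:conceptB"),
}
inferred = add_inverses(asserted,
                        inverse_of={"ex:produces": "ex:producedBy"},
                        symmetric={"ex:relatedTo"})
```

Only one direction of each relation needs to be asserted in the source data; the other direction becomes queryable after inference.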
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
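The first step (deriving Person records from one source record) might look like the sketch below. The field names follow the sample record; the output structure, the relation labels and the gender inference rule are assumptions for illustration, not EHRI's actual pipeline.

```python
def derive_persons(rec):
    """Create Person records and relations for everyone named in one source record."""
    main = {"name": f"{rec['firstName']} {rec['lastName']}",
            "gender": rec.get("gender"),
            "source": rec["id"]}
    persons, relations = [main], []
    for field, relation in (("nameSpouse", "spouse"),
                            ("nameChild", "child"),
                            ("nameSibling", "sibling")):
        if field not in rec:
            continue
        person = {"name": rec[field], "source": rec["id"]}
        if relation == "spouse":
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            # A likely (not certain) inference for records of this period
            if rec.get("gender") == "Male":
                person["gender"] = "Female (inferred)"
        persons.append(person)
        relations.append((main["name"], relation, person["name"]))
    return persons, relations

record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons, relations = derive_persons(record)
```

Each derived record keeps a pointer to its source record, so that a later match against other Person records can be traced back and reviewed.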
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
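Both operations (ranking neighbours by cosine, and "semantic differencing" by vector arithmetic) can be sketched with plain Python. The 3-dimensional toy vectors and the words below are made up for illustration; real word2vec vectors have hundreds of dimensions and are learned from the interview corpus. The scores in the table appear to be cosine similarities (higher means closer).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# Toy vectors, invented so that the arithmetic works out exactly
TOY = {
    "king":  (1, 0, 1),
    "man":   (1, 0, 0),
    "woman": (0, 1, 0),
    "queen": (0, 1, 1),
    "apple": (0, 0, -1),
}
result = analogy(TOY, "king", "man", "woman")
```

With trained vectors, the same `analogy` call over the words in the slide would correspond to KGB - Stalin + Hitler.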
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
213 CCO EXAMPLE CREATOR EXTENT
How to describe one aspect of the data
214 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry, Object entry, Loans in, Acquisition, Inventory control, Location and movement control, Transport, Cataloguing, Object condition checking and technical assessment, Conservation and collections care, Risk management, Insurance and indemnity management, Valuation control, Audit, Rights management, Use of collections, Object exit, Loans out, Loss and damage, Deaccession and disposal, Retrospective documentation
Addresses accreditation
215 SPECTRUM EXAMPLE OBJECT ENTRY
22 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D. Pitti, 2015
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
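The four WEMI levels nest naturally, which a few dataclasses can illustrate. This is a teaching sketch, not an implementation of the FRBR specification; the attribute names, the publisher and the barcodes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:
    barcode: str          # a physical copy that a library tracks
    note: str = ""

@dataclass
class Manifestation:
    publisher: str
    isbn: Optional[str] = None   # ISBNs belong at this level
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:
    language: str
    translator: Optional[str] = None
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)

def count_items(work: Work) -> int:
    """Count physical copies across all expressions and manifestations of a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)

# Hypothetical holdings for one English edition
edition = Manifestation(publisher="Example Press",
                        items=[Item("COPY-001"), Item("COPY-002", note="reference only")])
quixote = Work("Don Quixote", "Miguel de Cervantes",
               expressions=[Expression(language="en", manifestations=[edition])])
copies = count_items(quixote)
```

Cataloguing at the Work level once and hanging translations, editions and copies beneath it is what lets libraries share the description while tracking their own items.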
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAD: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info, FutureLib wiki (in-depth discussion), Facebook group
MARC is dead (is it really?): Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies
(RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
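The short-cut versus long-path construct can be shown on plain triples. The CRM property and class names are the standard ones for measurement; the `ex:` identifiers and the two helper functions are hypothetical and only illustrate the idea.

```python
def expand_shortcut(thing, dimension, measurement):
    """Replace thing P43_has_dimension dimension with the long path through an E16 Measurement."""
    return [
        (thing, "crm:P39i_was_measured_by", measurement),
        (measurement, "rdf:type", "crm:E16_Measurement"),
        (measurement, "crm:P40_observed_dimension", dimension),
    ]

def contract_long_path(triples):
    """Derive the short-cut triples implied by long-path triples."""
    measured = {(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"}
    observed = {(s, o) for s, p, o in triples if p == "crm:P40_observed_dimension"}
    return {(thing, "crm:P43_has_dimension", dim)
            for thing, m in measured
            for m2, dim in observed if m == m2}

long_path = expand_shortcut("ex:painting1", "ex:height1", "ex:measurement1")
shortcut = contract_long_path(long_path)
```

The intermediate measurement node is where the extra detail goes (who measured, when, with what method), which the short-cut has no place for.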
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919 rdftype biboDocument fabioManifestation
dctitle About time
dtitleURLized about_time
fabiohasSubtitle Einsteins unfinished revolution
ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon
foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt
owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt
dclanguage lexvoeng
dbibliofilID 931138
dcformat lthttpdatadeichmannoformatBookgt
dlocation_signature Dav
dcpublisher d_orgpenguin
bibonumPages 316
dphysicalDescription fig
dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk
fabioisManifestationOf d_workx24918900_about_time
dsignatureNote 07x0619gq
dbindingInfo lthttpdatadeichmannobindingInfohgt
dbsID 0181541
dcdescription Bibliografi s 293‐294no
dpriceInfo Nkr 17000
foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt
dcidentifier 749919
ddewey 115 53011
dlocation_dewey 53011
biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Conceptual Model 1.0 (Sep 2016; Mlist for comments): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchiveSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
French was selected as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content, and NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia and Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
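The idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Person` class, the `persons_from_record` helper and the single inference rule (a spouse may also be known under the maiden surname) are hypothetical and are not the EHRI implementation. The field names follow the sample record.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str                                   # relation to the record's main person
    alt_names: list = field(default_factory=list)   # inferred name variants


def persons_from_record(rec):
    """Create one Person per name mentioned in a record like the sample one."""
    people = [Person(f'{rec["firstName"]} {rec["lastName"]}', "self")]
    for key, relation in (("nameSpouse", "spouse"),
                          ("nameChild", "child"),
                          ("nameSibling", "sibling")):
        if key in rec:
            people.append(Person(rec[key], relation))

    # Likely inference: the spouse may also appear under the maiden surname.
    maiden = rec.get("nameSpouseMaiden")
    if maiden:
        for person in people:
            if person.relation == "spouse":
                first_name = person.name.split()[0]
                person.alt_names.append(f"{first_name} {maiden}")
    return people


record = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
for person in people:
    print(person)
```

The extra name variants widen the set of candidate matches against other Person records, at the cost of some false positives that later matching steps have to filter out.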
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
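For readers unfamiliar with the technique, the following Python sketch shows how such cosine scores and "semantic differencing" are computed. The 3-dimensional vectors are invented so that the arithmetic is easy to follow. Real word2vec vectors have hundreds of dimensions and are learned from the corpus, so the numbers here say nothing about the EHRI results.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query, vectors, exclude=()):
    """Rank vocabulary words by cosine similarity to the query vector."""
    scored = [(word, cosine(query, vec))
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (invented): axes could be read as
# "security organization", "Soviet context", "Nazi context".
vectors = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}

# Semantic differencing: KGB - Stalin + Hitler
query = [k - s + h for k, s, h in
         zip(vectors["KGB"], vectors["Stalin"], vectors["Hitler"])]
ranking = most_similar(query, vectors, exclude={"KGB", "Stalin", "Hitler"})
print(ranking[0][0])
```

The input words are excluded from the ranking because they are usually the nearest neighbours of the combined vector.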
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing them to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed (Blog, Wiki)
2.1.4 SPECTRUM
UK Museum Collections Management Standard
Defines procedures for museums to follow and the attendant data
Covers 21 procedures: Pre-entry; Object entry; Loans in; Acquisition; Inventory control; Location and movement control; Transport; Cataloguing; Object condition checking and technical assessment; Conservation and collections care; Risk management; Insurance and indemnity management; Valuation control; Audit; Rights management; Use of collections; Object exit; Loans out; Loss and damage; Deaccession and disposal; Retrospective documentation
Addresses accreditation
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (e.g. Secretary of some society)
ISDIAH: archival holding institutions
Image by D. Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
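The four WEMI levels nest naturally, which the following Python sketch tries to show. The class and attribute names, the publisher, the ISBN placeholder and the copy identifiers are all invented for illustration. FRBR itself is a conceptual model and prescribes no such data structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """A single physical copy; loan status is tracked at this level."""
    copy_id: str
    on_loan: bool = False


@dataclass
class Manifestation:
    """A publisher's edition; ISBNs are attached at this level."""
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A translation or edition of the work."""
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract intellectual work."""
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def available_items(self):
        """All copies of any edition of any expression that are not on loan."""
        return [item
                for expression in self.expressions
                for manifestation in expression.manifestations
                for item in manifestation.items
                if not item.on_loan]


# Invented sample data: one English translation, one edition, two copies.
edition = Manifestation(
    publisher="Example Press",          # hypothetical publisher
    isbn="000-0-00-000000-0",           # placeholder, not a real ISBN
    items=[Item("copy-1", on_loan=True), Item("copy-2")],
)
work = Work("Don Quixote", [Expression("en", [edition])])
available = work.available_items()
print([item.copy_id for item in available])
```

A user task such as "obtain" then becomes a walk from the Work down to an available Item, regardless of which translation or edition the copy belongs to.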
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen)
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC; Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements)
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
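To make the "strings, not things" problem concrete, here is a small Python sketch that pulls the chronology out of such a bioghist element. It repeats the EADiva example with its markup restored (the punctuation inside the `normal` attributes is reconstructed) and assumes a snippet without XML namespaces, which real EAD files often do declare. The function name `chron_events` is made up for the example.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract (normalized date, event text, tagged person names) per chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that the archivist happened to tag are recoverable.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = chron_events(BIOGHIST)
for event in events:
    print(event)
```

Even in this well-tagged example the persons come out as name strings, so linking them to authority records still needs a separate matching step.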
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles
MARC-XML is not much better
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: in-depth discussion wiki
FutureLib, Facebook group
MARC is dead (is it really?): Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
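The short-cut versus long-path construct can be sketched as a small triple rewrite in Python. The property and class names are CIDOC CRM's; the `ex:` identifiers and the `long_path` helper are invented for illustration, and the triples are plain tuples rather than a real RDF library's objects.

```python
def long_path(shortcut, measurement_id):
    """Expand one crm:P43_has_dimension statement into the equivalent
    long path through an explicit crm:E16_Measurement node."""
    subject, prop, dimension = shortcut
    if prop != "crm:P43_has_dimension":
        raise ValueError("not a P43_has_dimension short-cut")
    return [
        (subject, "crm:P39i_was_measured_by", measurement_id),
        (measurement_id, "rdf:type", "crm:E16_Measurement"),
        (measurement_id, "crm:P40_observed_dimension", dimension),
    ]


# Hypothetical identifiers for a painting and its height.
shortcut = ("ex:painting/1", "crm:P43_has_dimension", "ex:painting/1/height")
triples = long_path(shortcut, "ex:painting/1/measurement")

# The measurement node is where the extra detail attaches,
# for example who carried out the measurement.
triples.append(("ex:painting/1/measurement",
                "crm:P14_carried_out_by", "ex:conservator/7"))

for triple in triples:
    print(triple)
```

The short-cut is enough when only the value matters; the long path is needed as soon as the act of measuring has its own date, actor or method.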
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a Kiosk
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments
6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
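Jittering can be as simple as adding a small random offset to each marker. The Python sketch below is one way to do it and makes no claim about how EFD implemented it: the `jitter` function, the half-degree radius and the approximate centre coordinates are all assumptions made for the example.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Move a point by a random offset of at most radius_deg on each axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


# Approximate centre of Bulgaria, used here only as an example value.
BULGARIA_CENTER = (42.7, 25.5)

rng = random.Random(42)   # fixed seed so the map looks the same on every run
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(9000)]

print(markers[:3])
```

A fixed seed (or an offset derived from the object identifier) keeps each marker in the same place between page loads, which avoids flags jumping around when the map is redrawn.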
6.6.10 GLAMS WORKING WITH WIKIDATA
GLAMs Working with Wikidata: Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. Eg a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
2.1.5 SPECTRUM EXAMPLE: OBJECT ENTRY
2.2 ARCHIVAL CONTENT STANDARDS
ISAD(G): archival materials
ISAAR(CPF): agents (corporations, people, families)
ISDF: functions (eg Secretary of some society)
ISDIAH: archival holding institutions
Image by D.Pitti, 2015
2.3 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
2.3.1 FRBR, FRSAD, FRAD
Functional Requirements for: Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J.Mitchell, M.Zeng, M.Zumer, 2011)
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (eg Don Quixote)
Expression: translation or edition (eg Don Quixote translation to English)
Manifestation: publisher's work (eg with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (eg Lincoln's Bible); manuscripts are singleton items
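The four WEMI levels nest naturally as a containment hierarchy. A minimal sketch, assuming hypothetical field names (`barcode`, `on_loan`, `publisher`) and placeholder identifiers that are not part of FRBR itself; it only illustrates that loan/availability lives at the Item level while the title lives at the Work level.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    barcode: str              # hypothetical local identifier of a physical copy
    on_loan: bool = False

@dataclass
class Manifestation:
    publisher: str
    isbn: str                 # ISBNs attach at this level
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:
    language: str             # eg a translation
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def available_items(self):
        """Walk Work > Expression > Manifestation > Item and keep copies not on loan."""
        return [item
                for expr in self.expressions
                for man in expr.manifestations
                for item in man.items
                if not item.on_loan]

# Placeholder data, for illustration only
quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Example Press", "ISBN-PLACEHOLDER-1",
                                    [Item("copy-1")])]),
    Expression("en", [Manifestation("Example Press", "ISBN-PLACEHOLDER-2",
                                    [Item("copy-2", on_loan=True), Item("copy-3")])]),
])
available = [item.barcode for item in quixote.available_items()]
```

A user asking for "Don Quixote" is served at the Work level, while the answer (which copies are on the shelf) comes from the Items.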
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. Eg EAD3 is mastered in RNC, then RNG and XSD produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (eg cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
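Cross-field validation is the kind of rule that grammar-based schemas cannot express: the check relates the values of two different elements. A minimal sketch in Python (a stand-in for what a Schematron assertion would state, not Schematron itself), assuming hypothetical element names `record`, `fromDate` and `toDate`:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: r2 violates the rule "fromDate must not be after toDate"
SAMPLE = """
<records>
  <record id="r1"><fromDate>1921</fromDate><toDate>1945</toDate></record>
  <record id="r2"><fromDate>1950</fromDate><toDate>1939</toDate></record>
</records>
"""

def check_date_order(xml_text):
    """Return the ids of records whose fromDate is later than their toDate."""
    root = ET.fromstring(xml_text)
    errors = []
    for rec in root.findall("./record"):
        start = int(rec.findtext("fromDate"))
        end = int(rec.findtext("toDate"))
        if start > end:
            errors.append(rec.get("id"))
    return errors

bad_records = check_date_order(SAMPLE)
```

Each element on its own is valid here (both are well-formed years); only comparing the two fields reveals the problem.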
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, eg for Dimension.
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop, 2010)
3.3.5 LIDO SCHEMA
Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements: inherited from CDWA.
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable, and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged, but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
    <bioghist>
      <head>Chronological Events</head>
      <chronlist>
        <chronitem>
          <date normal="19781028">October 28, 1978</date>
          <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
        </chronitem>
        <chronitem>
          <date normal="19790315">March 15, 1979</date>
          <event>Departmental reorganization</event>
        </chronitem>
      </chronlist>
    </bioghist>
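When person names are tagged, as in the first chronitem of this example, the event list can at least be turned into structured records. A minimal sketch using only the Python standard library; the output dictionary keys (`date`, `persons`, `text`) are my own choice, and the result is still strings, not links: the `normal` attribute is a normalized name, not an identifier.

```python
import xml.etree.ElementTree as ET

# The EADiva bioghist example, with attribute quoting restored
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, tagged person names, event text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date").get("normal")
        event = item.find("event")
        persons = [p.get("normal") for p in event.iter("persname")]
        # Collapse whitespace in the mixed-content event description
        text = " ".join("".join(event.itertext()).split())
        events.append({"date": date, "persons": persons, "text": text})
    return events

events = extract_events(BIOGHIST)
```

The second event yields an empty person list: whoever was involved in the reorganization is simply not recoverable from the markup.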
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture, based on a slogan by Roy Tennant, 2002:
MARC is dead (is it really?)
In-depth discussion and wiki: marc-must-die.info, FutureLib, Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
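The "no nesting bias" point can be illustrated without any RDF library: a nested record and a flat, referenced-by-ID pair of records "lift" to the same set of triples. A minimal sketch, assuming made-up `ex:` identifiers and plain Python tuples as triples (it ignores the literal vs IRI distinction that real RDF makes):

```python
def triples_from_nested(doc):
    """Flatten a nested record: each child dict becomes a node linked by one triple."""
    out = set()

    def walk(node):
        for prop, value in node.items():
            if prop == "id":
                continue
            if isinstance(value, dict):
                out.add((node["id"], prop, value["id"]))
                walk(value)
            else:
                out.add((node["id"], prop, value))

    walk(doc)
    return out

def triples_from_referenced(records):
    """Flatten flat records whose values are literals or ids of other records."""
    out = set()
    for rec in records:
        for prop, value in rec.items():
            if prop != "id":
                out.add((rec["id"], prop, value))
    return out

# Two serializations of the same information (hypothetical ex: identifiers)
nested = {"id": "ex:painting1", "dc:title": "Portrait",
          "dc:creator": {"id": "ex:artist1", "foaf:name": "Frans Hals"}}
referenced = [
    {"id": "ex:painting1", "dc:title": "Portrait", "dc:creator": "ex:artist1"},
    {"id": "ex:artist1", "foaf:name": "Frans Hals"},
]
same_graph = triples_from_nested(nested) == triples_from_referenced(referenced)
```

Both inputs produce the same three triples, so the choice between nesting and referencing disappears once the data is lifted.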
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM: inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
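The short-cut vs long-path construct can be shown as a rewrite over triples: each short-cut statement is replaced by a path through an intermediate measurement node, which can then carry extra details (who measured, when). A minimal sketch with triples as plain tuples; the `ex:` URIs and the `ex:measurementN` naming pattern are hypothetical.

```python
def expand_dimension_shortcut(triples):
    """Rewrite each P43_has_dimension short-cut into the long path via an E16_Measurement node."""
    expanded = []
    n = 1
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            measurement = f"ex:measurement{n}"   # hypothetical URI pattern for the new node
            n += 1
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))           # everything else passes through unchanged
    return expanded

short_cut = [
    ("ex:coin1", "crm:P43_has_dimension", "ex:weight1"),
    ("ex:coin1", "crm:P2_has_type", "ex:denarius"),
]
long_path = expand_dimension_shortcut(short_cut)
```

The measurement node is the place where details such as the measuring agent or date would be attached; the short-cut has nowhere to put them.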
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See eg Rob Sanderson presentations: IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry: info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked: what to add to EDM to better fit FRBRoo?
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant:
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant:
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant:
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
    d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
      dc:title "About time";
      d:titleURLized "about_time";
      fabio:hasSubtitle "Einstein's unfinished revolution";
      ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time,
        d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
      foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
      owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
      dc:language lexvo:eng;
      d:bibliofilID "931138";
      dc:format <http://data.deichman.no/format/Book>;
      d:location_signature "Dav";
      dc:publisher d_org:penguin;
      bibo:numPages "316";
      d:physicalDescription "fig.";
      d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
      fabio:isManifestationOf d_work:x24918900_about_time;
      d:signatureNote "07x0619gq";
      d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
      d:bsID "0181541";
      dc:description "Bibliografi: s. 293-294"@no;
      d:priceInfo "Nkr 170.00";
      foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493>;
      dc:identifier "749919";
      d:dewey "115", "530.11";
      d:location_dewey "530.11";
      bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for: works by painter across collections (catalogue raisonné). Eg Frans Hals.
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on: Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
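Jittering simply adds a small random offset to each marker, so objects that share one coarse location do not pile up on a single point. A minimal sketch, assuming a uniform offset in degrees and an approximate centre point; the actual EFD app may use a different distribution or radius.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most max_offset_deg per axis."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))

BULGARIA_CENTER = (42.7, 25.5)   # approximate coordinates, for illustration only
rng = random.Random(42)          # fixed seed so the sketch is reproducible
markers = [jitter(*BULGARIA_CENTER, rng=rng) for _ in range(5)]
```

The offset should stay small relative to the country's size, otherwise markers start landing in neighbouring countries.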
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk/support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
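Declaring relations as inverse pairs means each asserted relation implies a second one in the opposite direction. A minimal sketch of that inference over plain tuples, assuming hypothetical property names (`ex:teacher_of`, `ex:student_of`, `ex:related_to`) rather than the actual GVP relation URIs; a triple store with OWL reasoning would do this for you.

```python
# Hypothetical property declarations, standing in for owl:inverseOf / owl:SymmetricProperty
INVERSE_OF = {"ex:teacher_of": "ex:student_of"}
SYMMETRIC = {"ex:related_to"}

def with_inverses(triples):
    """Return the input triples plus those implied by inverse and symmetric properties."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})   # inverseOf works both ways
    inferred = set(triples)
    for s, p, o in triples:
        if p in inverse:
            inferred.add((o, inverse[p], s))
        elif p in SYMMETRIC:
            inferred.add((o, p, s))                           # self-inverse
    return inferred

triples = [
    ("ex:a", "ex:teacher_of", "ex:b"),
    ("ex:b", "ex:related_to", "ex:c"),
]
closed = with_inverses(triples)
```

Only one direction needs to be recorded in the source spreadsheet; the other is derived.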
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, eg: Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress, 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
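The idea can be sketched as a function that turns one such record into a main Person plus related Persons. This is a minimal sketch under my own assumptions: the `Person` class, the relation labels, and the spouse-gender heuristic are illustrative, not the EHRI implementation, and every inference is only "likely" and would need a confidence score in practice.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None

def persons_from_record(rec):
    """Create the main Person and a list of (relation, Person) for the people mentioned."""
    main = Person(f'{rec["firstName"]} {rec["lastName"]}', rec.get("gender"))
    related = []
    if "nameSpouse" in rec:
        # Likely inference (heuristic, not a fact): spouse has the opposite recorded gender
        spouse_gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
        related.append(("spouse", Person(rec["nameSpouse"], spouse_gender,
                                         rec.get("nameSpouseMaiden"))))
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            related.append((relation, Person(rec[key])))
    return main, related

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
main, related = persons_from_record(record)
```

The resulting Person records can then be compared against existing ones; the maiden name gives the spouse a second name form to match on.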
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
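Matching to a shared gazetteer deduplicates as a side effect: spelling variants that resolve to the same gazetteer entry collapse into one place. A minimal sketch using fuzzy string matching from the Python standard library; the mini-gazetteer and its `gn:` ids are made up, and a real pipeline would also use coordinates, country, feature type and alternate names rather than string similarity alone.

```python
import difflib

# Hypothetical mini-gazetteer: name -> made-up identifier (NOT real Geonames ids)
GAZETTEER = {
    "Krakow": "gn:1",
    "Warszawa": "gn:2",
    "Lodz": "gn:3",
}

def match_places(names, gazetteer, cutoff=0.8):
    """Map each source name to the id of its closest gazetteer name, or None if too dissimilar."""
    keys = {k.lower(): k for k in gazetteer}
    matches = {}
    for name in names:
        hit = difflib.get_close_matches(name.strip().lower(), list(keys), n=1, cutoff=cutoff)
        matches[name] = gazetteer[keys[hit[0]]] if hit else None
    return matches

source_names = ["Krakow", "Krakov", "Warszawa ", "Atlantis"]
matches = match_places(source_names, GAZETTEER)
```

"Krakow" and "Krakov" end up with the same id, so the two source records are recognized as one place, while the unmatched name is left for manual review.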
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
22 ARCHIVAL CONTENT STANDARDSISAD(G) archival materialsISAAR(CPF) agents (corporations people families)ISDF functions (eg Secretary of some society)ISDIAH archival holding institutions
Image by DPitti 2015
23 LIBRARY CONTENT STANDARDSAACR2 (Anglo-American Cataloging Rules 2)International Standard Bibliographic Description (ISBD)Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later) But sometimes pay moreattention where to put the commas than to
Data sharingGlobal availability of resourcesSharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR) Subject Authority Data(FRSAD) Authority Data (FRAD) (JMitchell MZeng MZumer 2011)
232 FRBR
Starts from user tasks ( nd identify select obtain explore) Introduces the important 4-level WEMI model (relates to Uniform Titles)
Work original or derived intellectual work (eg Don Quixote)Expression translation or edition (eg Don Quixote translation to English)Manifestation publishers work (eg with illustrations foreword by compilationhellip)ISBNs are hereItem physical copy libraries track loanavailability famous copies (eg Lincolns Bible)manuscripts are singleton items
233 FRSAD
Anything can be subject (thema) referred to by various namestitles (nomen)
234 FRBR-LRM
FRBR-Library Reference Model (PRiva PLe Bœuf MŽumer Draft for World-WideReview 2016-02) Merges the previous standards
3 GLAM METADATA SCHEMASHow many of the standards listed in
apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)
Seeing Standards A Visualization of the MetadataUniverse
31 SEEING STANDARDS (2)
32 XML SCHEMASDo you deal with XML I bet you do
XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)
Tools
patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)
RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)
httpsgithubcomEHRIjing-trangtreeEHRI-176
httpsgithubcomVladimirAlexievrnc
33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles.
MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info: wiki
FutureLib: in-depth discussion
Facebook group
Presentation: MARC is dead (is it really?) by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model.
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID).
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info.
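The "no nesting bias" point can be illustrated with a small Python sketch. It is not an RDF library, just plain tuples; the `ex:` identifiers and the `flatten` helper are invented for the example, and for brevity it does not distinguish IRIs from literals.

```python
def flatten(node, triples=None):
    """Turn a nested dict (with an 'id' key) into a set of (s, p, o) triples."""
    if triples is None:
        triples = set()
    subject = node["id"]
    for prop, value in node.items():
        if prop == "id":
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                # nested sub-record: link to its id, then describe it separately
                triples.add((subject, prop, v["id"]))
                flatten(v, triples)
            else:
                triples.add((subject, prop, v))
    return triples

# XML-like style 1: the creator is nested inside the painting
nested = {"id": "ex:painting1", "dc:title": "Portrait",
          "dc:creator": {"id": "ex:hals", "foaf:name": "Frans Hals"}}

# XML-like style 2: the creator is a separate record referenced by id
by_reference = [
    {"id": "ex:painting1", "dc:title": "Portrait", "dc:creator": "ex:hals"},
    {"id": "ex:hals", "foaf:name": "Frans Hals"},
]

nested_triples = flatten(nested)
referenced_triples = set()
for record in by_reference:
    flatten(record, referenced_triples)

print(nested_triples == referenced_triples)
```

Both serializations "lift" to the same three statements, which is why two providers that nest differently can still be merged at the graph level.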
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving.
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type.
135 props (plus their inverses), prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle), Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
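The short-cut vs long-path construct can be sketched with plain Python tuples. This is only an illustration, not a reasoner: the `ex:` resources are hypothetical, and `infer_shortcuts` is a made-up helper that derives the short-cut wherever the long path is present.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"

def infer_shortcuts(triples):
    """Add (thing, SHORTCUT, dim) wherever thing -P39i-> measurement -P40-> dim."""
    first, second = LONG_PATH
    measured = [(s, o) for s, p, o in triples if p == first]
    observed = {}
    for s, p, o in triples:
        if p == second:
            observed.setdefault(s, []).append(o)
    inferred = {(thing, SHORTCUT, dim)
                for thing, measurement in measured
                for dim in observed.get(measurement, [])}
    return set(triples) | inferred

# Long path: the intermediate measurement node can carry extra detail
# (here a time-span), which the short-cut has no place for.
data = {
    ("ex:painting1", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:timespan1"),
}

expanded = infer_shortcuts(data)
print(("ex:painting1", SHORTCUT, "ex:height1") in expanded)
```

Going the other way (from short-cut to long path) is not possible without inventing the intermediate node, which is why the long path is the richer form.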
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
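As a rough illustration of what "standard API" means here: the IIIF Image API addresses any region of an image through a fixed URL pattern, `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The Python sketch below builds such URLs; the server `https://example.org/iiif` and the identifier are made up, and the default size `full` follows Image API 2.x (version 3 uses `max` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL (2.x-style defaults)."""
    # The identifier is a single path segment, so "/" inside it must be encoded
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Whole image at full size
full_url = iiif_image_url("https://example.org/iiif", "ms/folio 1")

# Top-left 1024x1024 pixel region, scaled to 512 pixels wide
detail_url = iiif_image_url("https://example.org/iiif", "ms/folio 1",
                            region="0,0,1024,1024", size="512,")
print(full_url)
print(detail_url)
```

Because every compliant server understands the same pattern, a viewer can deep-zoom into images hosted by any institution without custom integration.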
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and rdf2marc and marc2rdf conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected.
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for works by painter across collections (catalogue raisonné), e.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
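Why coreferencing pays off for name enrichment can be shown with a toy Python sketch: once records in different authorities are known to describe the same person, their name forms can be pooled. All identifiers and name lists below are invented for illustration; `cluster` is a simple union-find over "same as" pairs, not the method of the study.

```python
def cluster(coref_pairs):
    """Group ids connected by coreference links (union-find with path halving)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in coref_pairs:
        parent[find(a)] = find(b)
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

# Hypothetical records: library-style inverted names vs multilingual labels
NAMES = {
    "viaf:A": {"Cranach, Lucas", "Cranach, Lucas, the Elder"},
    "wd:B": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    "ulan:C": {"Cranach, Lucas", "Lucas Cranach"},
}
COREFS = [("viaf:A", "wd:B"), ("wd:B", "ulan:C")]

clusters = cluster(COREFS)
pooled = [set().union(*(NAMES[i] for i in group)) for group in clusters]
print(len(clusters), sorted(pooled[0]))
```

With little overlap between the name forms of each source, the pooled set is almost the sum of the parts, which is the gain the conclusions point to.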
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
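Jittering simply means adding a small random offset to each marker so that objects sharing one coordinate spread out instead of stacking. A minimal Python sketch follows; the center coordinate is approximate, and the offset size and function name are arbitrary choices for illustration, not what the EFD app actually uses.

```python
import random

def jitter(points, max_offset_deg=0.5, seed=42):
    """Shift each (lat, lon) by a random offset of at most max_offset_deg."""
    rng = random.Random(seed)  # fixed seed keeps the map stable between reloads
    return [(lat + rng.uniform(-max_offset_deg, max_offset_deg),
             lon + rng.uniform(-max_offset_deg, max_offset_deg))
            for lat, lon in points]

BULGARIA_CENTER = (42.7, 25.5)  # approximate centroid, for illustration only

# Nine objects that all carry the same country-level coordinate
flags = jitter([BULGARIA_CENTER] * 9)
for lat, lon in flags:
    print(f"{lat:.3f}, {lon:.3f}")
```

The trade-off is that jittered positions are no longer meaningful at street level, so the offset should stay well below the size of the place being represented.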
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J. Cuno, head of the Getty Trust.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
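What declaring inverse pairs buys you can be sketched in a few lines of Python: state a relation in one direction and the other direction follows. The property and resource names below are hypothetical stand-ins, not actual GVP relations, and `add_inverses` imitates only this one piece of OWL inference.

```python
# Hypothetical declarations, as they might come out of a spreadsheet
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}   # owl:inverseOf pair
SYMMETRIC = {"ex:collaboratedWith"}             # owl:SymmetricProperty

def add_inverses(triples):
    """Return the triples plus the statements implied by inverse/symmetric props."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # pairs work both ways
    inverse.update({p: p for p in SYMMETRIC})               # self-inverse
    inferred = {(o, inverse[p], s) for s, p, o in triples if p in inverse}
    return set(triples) | inferred

asserted = {
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:artistB", "ex:collaboratedWith", "ex:artistC"),
}
closed = add_inverses(asserted)
for t in sorted(closed):
    print(t)
```

Editors enter each relation once; queries can then navigate from either end without knowing which direction was originally recorded.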
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
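The idea can be sketched in Python as follows. The field names come from the sample record above; everything else (the output structure, the helpers `derive_persons` and `match`, and the exact-name matching rule) is an assumption made for illustration, far simpler than real record linkage.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Which "additional name" fields imply which relation to the main person
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(rec):
    """Create one Person dict for the main person and one per mentioned relative."""
    persons = [{"name": f"{rec['firstName']} {rec['lastName']}",
                "gender": rec.get("gender"), "relation": "self", "source": rec["id"]}]
    for field_name, relation in RELATION_FIELDS.items():
        if field_name not in rec:
            continue
        person = {"name": rec[field_name], "relation": relation, "source": rec["id"]}
        if relation == "spouse":
            # likely inferences: maiden name and marriage date apply to the spouse too
            person["maidenName"] = rec.get("nameSpouseMaiden")
            person["dateMarriage"] = rec.get("dateMarriage")
        persons.append(person)
    return persons

def normalize(name):
    return " ".join(name.lower().split())

def match(person, database):
    """Naive matching: exact name equality after normalization."""
    return [c for c in database if normalize(c["name"]) == normalize(person["name"])]

persons = derive_persons(RECORD)
hits = match(persons[1], [{"name": "maria  SMITH"}, {"name": "Mary Smith"}])
print(len(persons), hits)
```

Each derived person keeps a pointer to the source record, so the relations themselves (spouse, child, sibling) become edges of the person network being studied.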
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
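Both the nearest-neighbour lists and the "semantic differencing" boil down to vector arithmetic plus a cosine score (the scores in the table behave like cosine similarities: higher means closer). The Python sketch below shows the mechanics on hand-made three-dimensional toy vectors; the numbers are invented so that the analogy works, whereas real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy axes (assumed): [leader-ness, organization-ness, regime]
TOY = {
    "Stalin": (1.0, 0.0, 1.0),
    "Hitler": (1.0, 0.0, -1.0),
    "KGB":    (0.0, 1.0, 1.0),
    "SS":     (0.0, 1.0, -1.0),
    "guard":  (0.2, 0.5, 0.0),
}

result = analogy(TOY, "KGB", "Stalin", "Hitler")
print(result)
```

Subtracting "Stalin" removes the regime component, adding "Hitler" puts the other regime in, and the organization-like word nearest to the result is returned.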
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
23 LIBRARY CONTENT STANDARDS
AACR2 (Anglo-American Cataloging Rules 2)
International Standard Bibliographic Description (ISBD)
Resource Description and Access (RDA)
Extremely detailed and comprehensive (see RDA later). But sometimes they pay more attention to where to put the commas than to:
Data sharing
Global availability of resources
Sharing the cataloging burden
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Zumer, 2011)
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy; libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
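The four WEMI levels map naturally onto nested records. The Python sketch below is one possible way to model them, not a standard schema: the class and field names are chosen for the example, and the publisher, ISBN and barcodes are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:                      # a physical copy on a shelf
    barcode: str
    on_loan: bool = False

@dataclass
class Manifestation:             # a publisher's product; ISBNs live here
    publisher: str
    isbn: Optional[str] = None
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:                # a translation or edition
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:                      # the abstract intellectual creation
    title: str
    expressions: List[Expression] = field(default_factory=list)

def available_items(work, language):
    """User task 'obtain': copies not on loan, in the requested language."""
    return [item
            for expr in work.expressions if expr.language == language
            for man in expr.manifestations
            for item in man.items if not item.on_loan]

quixote = Work("Don Quixote", [
    Expression("es", [Manifestation("Publisher A (placeholder)",
                                    items=[Item("copy-1", on_loan=True)])]),
    Expression("en", [Manifestation("Publisher B (placeholder)", isbn="placeholder",
                                    items=[Item("copy-2"), Item("copy-3", on_loan=True)])]),
])

print([i.barcode for i in available_items(quixote, "en")])
```

A reader asks for the Work, picks an Expression by language, and the library answers at the Item level, which is exactly the chain the user tasks walk through.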
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
331 CDWA LITE
XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force: asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant:
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes Have Been Made, K. Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015), Mlist for comments.
Conceptual Model 10 (Sep 2016): document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected.
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné), e.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
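One way to picture "leveraging coreferencing across VIAF and Wikidata" is to merge pairwise coreference links from several sources into clusters. The sketch below uses a small union-find; it is not how VIAF or Wikidata actually cluster, and the identifiers are invented placeholders.

```python
def coreference_clusters(pairs):
    """Merge pairwise 'same entity' links into clusters using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical coreference links (placeholder ids, not real records)
links = [
    ("wikidata:Q_A", "viaf:111"),   # link recorded on Wikidata
    ("viaf:111", "gnd:AAA"),        # link recorded inside a VIAF cluster
    ("wikidata:Q_B", "rkd:222"),    # non-VIAF authority known only to Wikidata
]
clusters = coreference_clusters(links)
```

Here the first Wikidata entry gains a GND id transitively via VIAF, which is the kind of gain the conclusions point at.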
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev, A., Tolosi, L. and Alexiev, V. LNCS 9398, p. 182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev, V. DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
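The slide does not describe how the Cluster Mapper library works internally, so the following is only a generic illustration of map clustering: points are bucketed into a lat/lon grid and each bucket is drawn as one marker with a count. The function name and the sample coordinates are assumptions.

```python
import math
from collections import defaultdict

def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into grid cells; return (mean_lat, mean_lon, count) per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((sum(p[0] for p in members) / n,
                         sum(p[1] for p in members) / n,
                         n))
    # Biggest clusters first, so they can be drawn as the largest markers
    return sorted(clusters, key=lambda c: -c[2])

# Made-up sample: two nearby objects and one far away
sample = [(42.1, 25.1), (42.2, 25.3), (48.8, 2.3)]
clusters = grid_clusters(sample)
```

A real map widget would recompute the grid size per zoom level, so clusters split apart as the user zooms in.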
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
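Jittering can be sketched as adding a small random offset to each flag. The version below is an assumption about how one might do it, not the project's actual code: it spreads points uniformly over a disc around the country centroid, using approximate coordinates for the center of Bulgaria and a fixed seed for repeatability.

```python
import math
import random

def jitter(lat, lon, n, radius_km=50.0, seed=0):
    """Return n points scattered uniformly within radius_km of (lat, lon)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # sqrt makes the density uniform over the disc area, not bunched at the center
        r = radius_km * math.sqrt(rng.random())
        theta = rng.uniform(0, 2 * math.pi)
        dlat = (r * math.cos(theta)) / 111.0  # roughly 111 km per degree of latitude
        dlon = (r * math.sin(theta)) / (111.0 * math.cos(math.radians(lat)))
        points.append((lat + dlat, lon + dlon))
    return points

# Approximate center of Bulgaria (an assumption for the example)
flags = jitter(42.7, 25.5, 9000)
```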
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J. and Isaac, A. International Journal on Digital Libraries, August 2015, Springer.
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
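What inverse pairs and symmetric properties buy you can be shown with a tiny materialization step: for every asserted relation, the reverse direction is added too. This is a sketch over plain tuples; the property and instance names are hypothetical and are not the actual GVP associative relations.

```python
# Hypothetical relation declarations (in the real ontology these would be
# owl:inverseOf pairs and owl:SymmetricProperty declarations)
INVERSE_OF = {
    "ex:teacher_of": "ex:student_of",
    "ex:student_of": "ex:teacher_of",
}
SYMMETRIC = {"ex:related_to"}

def add_inverses(triples):
    """Return the triples plus the reverse direction of every known relation."""
    out = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            out.add((o, INVERSE_OF[p], s))
        elif p in SYMMETRIC:
            out.add((o, p, s))  # a symmetric property is its own inverse
    return out

asserted = {
    ("ex:master", "ex:teacher_of", "ex:pupil"),
    ("ex:conceptA", "ex:related_to", "ex:conceptB"),
}
closed = add_inverses(asserted)
```

With this in place, a query can start from either end of a relation and still find it.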
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after":
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
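The idea can be sketched as follows: expand one source record into several Person records, attaching the relation to the main person and some likely (not certain) inferences. The `Person` class, the `expand_record` helper and the inference rule are assumptions for illustration; the field names are the ones in the sample record.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    relations: Dict[str, str] = field(default_factory=dict)  # role -> related person's name
    notes: List[str] = field(default_factory=list)

def expand_record(rec):
    """Create Person records for the main person and everyone mentioned in the record."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    people = [Person(main_name, rec.get("gender"))]
    if "nameSpouse" in rec:
        spouse = Person(rec["nameSpouse"], relations={"spouse": main_name})
        # Likely inference only (for records of this period): spouse has the other gender
        spouse.gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
        if "nameSpouseMaiden" in rec:
            spouse.notes.append(f'maiden name {rec["nameSpouseMaiden"]}')
        if "dateMarriage" in rec:
            spouse.notes.append(f'married {rec["dateMarriage"]}')
        people.append(spouse)
    if "nameChild" in rec:
        people.append(Person(rec["nameChild"], relations={"parent": main_name}))
    if "nameSibling" in rec:
        people.append(Person(rec["nameSibling"], relations={"sibling": main_name}))
    return people

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand_record(record)
```

Each derived Person can then be compared against existing Person records (by name, dates and relations) to find candidate matches.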
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews.
ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments:

guard       Cos dist    punishment    Cos dist
guarding    0.593507    punishments   0.668144
sentry      0.512083    punish        0.601212
hlinka      0.496201    punishing     0.543213
gate        0.490032    beatings      0.527033
watching    0.484647    penalty       0.497262
rifle       0.484379    deserved      0.490157
lookout     0.482025    beaten        0.473870
patrol      0.477233    straf         0.473338
soldier     0.475982    offense       0.461230
guarded     0.474689    executing     0.459965
police      0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
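The two operations behind these results, cosine similarity between word vectors and vector arithmetic for "semantic differencing", can be sketched in plain Python. The 3-dimensional vectors below are invented toy values chosen to make the arithmetic visible; they are not taken from the INRIA model, which would use vectors with hundreds of dimensions learned from the interviews.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (1 = same direction, 0 = orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Word whose vector is closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors (made up): dimensions ~ (security service, Soviet, German)
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.6, 0.2, 0.2],
}
answer = analogy(TOY, "KGB", "Stalin", "Hitler")
```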
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint:
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
231 FRBR FRSAD FRAD
Functional Requirements for Bibliographic Records (FRBR), Subject Authority Data (FRSAD), Authority Data (FRAD) (J. Mitchell, M. Zeng, M. Žumer, 2011).
232 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
Work: original or derived intellectual work (e.g. Don Quixote)
Expression: translation or edition (e.g. Don Quixote translation to English)
Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
Item: physical copy. Libraries track loan/availability; famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
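The four WEMI levels nest naturally, which a few dataclasses can make concrete. This is a minimal sketch, not a FRBR implementation: the class fields, the helper `all_items` and the sample values (including the ISBN placeholders) are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:                      # a physical copy that a library tracks
    copy_note: str

@dataclass
class Manifestation:             # a publisher's product; ISBNs live here
    publisher_note: str
    isbn: str
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:                # a translation or edition
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:                      # the intellectual work itself
    title: str
    expressions: List[Expression] = field(default_factory=list)

def all_items(work):
    """Walk Work -> Expression -> Manifestation -> Item and collect the physical copies."""
    return [i for e in work.expressions for m in e.manifestations for i in m.items]

quixote = Work("Don Quixote", [
    Expression("translation to English", [
        Manifestation("illustrated, with foreword", isbn="(hypothetical ISBN 1)",
                      items=[Item("library copy 1"), Item("library copy 2")]),
    ]),
    Expression("Spanish original", [
        Manifestation("plain edition", isbn="(hypothetical ISBN 2)",
                      items=[Item("library copy 3")]),
    ]),
])
copies = all_items(quixote)
```

A user asking for "Don Quixote" is asking at Work level, while loan and availability are only answerable at Item level, hence the walk down the hierarchy.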
233 FRSAD
Anything can be subject (thema), referred to by various names/titles (nomen).
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer, Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library) apply to your work?
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
334 LIDO
Lightweight Information Describing Objects (LIDO): evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline that's also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
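To show what "semanticizing" such a fragment involves, the sketch below pulls the structured bits (normalized dates and tagged person names) out of the bioghist example with the Python standard library. The function name `chron_events` and the output shape are assumptions; the XML is the EADiva example above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

EAD_FRAGMENT = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def chron_events(xml_text):
    """Extract ISO date, tagged person names and plain text for each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date_el = item.find("date")
        event_el = item.find("event")
        iso = datetime.strptime(date_el.get("normal"), "%Y%m%d").date().isoformat()
        persons = [p.get("normal") for p in event_el.iter("persname")]
        text = " ".join("".join(event_el.itertext()).split())
        events.append({"date": iso, "persons": persons, "text": text})
    return events

events = chron_events(EAD_FRAGMENT)
```

Note that the second event yields no persons at all: anything not explicitly tagged stays a string, which is exactly the problem described above.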
35 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info
FutureLib: "MARC is dead" (is it really?), in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple abstracted data model.
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info.
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
  - The best datasets to use for name enrichment are VIAF and Wikidata.
  - There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
  - VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
  - Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
  - A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
  - Wikidata has great tools for crowd-sourced coreferencing.
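What "leveraging coreferencing" can buy for name enrichment may be easier to see in a small sketch. It clusters records that are linked as the same person and pools their name forms. The record identifiers, the name forms and the merge_names helper are all invented for illustration; nothing here is taken from VIAF or Wikidata, and the Europeana study did not necessarily proceed this way.

```python
from collections import defaultdict

# Hypothetical records: (dataset, local id) -> name forms. All values are made up.
NAMES = {
    ("viaf", "example-1"): {"Cranach, Lucas", "Lucas Cranach the Elder"},
    ("wikidata", "example-Q1"): {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    ("wikidata", "example-Q2"): {"Frans Hals"},
}
# Hypothetical coreference links between records describing the same person.
COREFS = [(("wikidata", "example-Q1"), ("viaf", "example-1"))]


def merge_names(names, corefs):
    """Cluster coreferenced records (union-find) and pool their name forms."""
    parent = {record: record for record in names}

    def find(record):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[record] != record:
            parent[record] = parent[parent[record]]
            record = parent[record]
        return record

    for left, right in corefs:
        parent[find(left)] = find(right)

    clusters = defaultdict(set)
    for record, forms in names.items():
        clusters[find(record)] |= forms
    return list(clusters.values())


clusters = merge_names(NAMES, COREFS)
```

After merging, the coreferenced pair contributes one cluster holding both the library-style permutation and the translated form, while the unlinked record stays on its own.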
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences across 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy.
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
- Publishing Vocabularies/Thesauri as LOD
- Publishing Museum collections and National Bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts: stemmatology (manuscript derivation)
- Historiography
- Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
- ConservationSpace: line-of-business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
- As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
- This mapping was followed by the Yale Center for British Art (YCBA).
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, and NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
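A minimal sketch of jittering, assuming a uniform random offset (in degrees) around the shared point. The jitter helper, the radius and the centre coordinates are illustrative assumptions, not the EFD app's actual code.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a small random offset in each direction."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


# Rough centre of Bulgaria, for illustration only.
CENTRE = (42.75, 25.5)

# A seeded generator keeps the map stable between page loads.
rng = random.Random(42)
points = [jitter(CENTRE[0], CENTRE[1], rng=rng) for _ in range(5)]
```

Seeding the generator (or deriving the offset from the object id) is a design choice: without it, flags would move every time the map is redrawn.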
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
- Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud. (Diagram by J.Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.
External Ontologies:
Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
- Custom sub-class
- Custom (sub-)prop
- Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
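To illustrate what inverse pairs and self-inverse (symmetric) relations imply for the data, here is a small sketch that starts from spreadsheet-like rows and materializes the inverse statements. The relation names and helper functions are hypothetical; they are not Getty's actual relation types or generation scripts.

```python
# Hypothetical rows, as they might appear in a spreadsheet of relation types:
# (relation, inverse). A symmetric relation lists itself as its own inverse.
RELATION_ROWS = [
    ("teacher_of", "student_of"),
    ("sibling_of", "sibling_of"),
]


def build_inverse_map(rows):
    """Map every relation to its inverse, in both directions."""
    inverse = {}
    for relation, opposite in rows:
        inverse[relation] = opposite
        inverse[opposite] = relation
    return inverse


def add_inverses(triples, inverse):
    """Return the triples plus the inverse statement for each of them."""
    result = set(triples)
    for subject, relation, obj in triples:
        if relation in inverse:
            result.add((obj, inverse[relation], subject))
    return result


INVERSE = build_inverse_map(RELATION_ROWS)
closed = add_inverses({("a", "teacher_of", "b"), ("c", "sibling_of", "d")}, INVERSE)
```

In an RDF store the same effect would normally come from declaring the properties with owl:inverseOf or owl:SymmetricProperty and letting the reasoner infer the rest.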
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.).
6.7.12 AAT IN EUROPEANA
- Europeana uses AAT to enrich type/subject/material fields.
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
- In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
  Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
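The first two steps (creating Person records for the people mentioned and attaching likely inferences) can be sketched as follows. The field names come from the sample record above; the derive_persons helper, the output keys and the particular inferences chosen for the spouse are assumptions made for illustration, not EHRI's actual rules.

```python
# The sample record from the slide, as a plain dict.
RECORD = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation to the main person.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(record):
    """Create one stub Person record per related name found in the record."""
    persons = []
    for field_name, relation in RELATION_FIELDS.items():
        full_name = record.get(field_name)
        if not full_name:
            continue
        # Naive split: everything before the last space is the first name.
        first, _, last = full_name.rpartition(" ")
        person = {
            "firstName": first,
            "lastName": last,
            "relation": relation,
            "relatedTo": record["id"],
        }
        if relation == "spouse":
            # Assumed inferences: the spouse shares the marriage date and
            # carries the recorded maiden name.
            person["dateMarriage"] = record.get("dateMarriage")
            person["maidenName"] = record.get("nameSpouseMaiden")
        persons.append(person)
    return persons


persons = derive_persons(RECORD)
```

The stub records can then be compared against existing Person records; the matching step itself is not shown here.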
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews:
- ONTO: place enrichment, person name recognition
- INRIA: word2vec experiments
guard     Cos dist  punishment   Cos dist
guarding  0.593507  punishments  0.668144
sentry    0.512083  punish       0.601212
hlinka    0.496201  punishing    0.543213
gate      0.490032  beatings     0.527033
watching  0.484647  penalty      0.497262
rifle     0.484379  deserved     0.490157
lookout   0.482025  beaten       0.473870
patrol    0.477233  straf        0.473338
soldier   0.475982  offense      0.461230
guarded   0.474689  executing    0.459965
police    0.474291  merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
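The mechanics behind both the neighbour lists and the "semantic differencing" line can be sketched with plain vector arithmetic. The four-dimensional vectors below are hand-picked toy values chosen so the arithmetic works out; they are not output of the INRIA word2vec models, and the analogy helper is an illustrative name.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors with invented dimensions: (soviet, german, organisation, leader).
VECTORS = {
    "KGB": [1.0, 0.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS": [0.0, 1.0, 1.0, 0.0],
    "guard": [0.2, 0.2, 0.3, 0.1],
}


def analogy(a, b, c, vectors):
    """Return the word closest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(target, vectors[word]))


result = analogy("KGB", "Stalin", "Hitler", VECTORS)
```

With real embeddings the vectors have hundreds of dimensions learned from the interview text, but ranking neighbours by cosine works the same way.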
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
- Art History networks from Wikipedia through VIAF id
- Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
- Numishare: data platform for coins/medals, 100k coin types
- Nomisma: shared authorities for numismatics
- Kerameikos: pottery LOD
- EADitor: EAD editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms; leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
2.3.2 FRBR
Starts from user tasks (find, identify, select, obtain, explore). Introduces the important 4-level WEMI model (relates to Uniform Titles):
- Work: original or derived intellectual work (e.g. Don Quixote)
- Expression: translation or edition (e.g. Don Quixote translation to English)
- Manifestation: publisher's work (e.g. with illustrations, foreword by, compilation…). ISBNs are here
- Item: physical copy. Libraries track loan/availability, famous copies (e.g. Lincoln's Bible); manuscripts are singleton items
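One way to picture the four WEMI levels is as a chain of records, each pointing up to the level above. The class and field names below are a simplification for illustration, not the FRBR attribute set, and the publisher, ISBN and library values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str  # the abstract intellectual creation


@dataclass(frozen=True)
class Expression:
    work: Work
    language: str  # a translation or edition of the work


@dataclass(frozen=True)
class Manifestation:
    expression: Expression
    publisher: str
    isbn: str  # ISBNs identify this level


@dataclass(frozen=True)
class Item:
    manifestation: Manifestation
    holder: str
    on_loan: bool = False  # loan status is tracked per physical copy


def work_of(item):
    """Walk up from a physical copy to the work it ultimately embodies."""
    return item.manifestation.expression.work


quixote = Work("Don Quixote")
english = Expression(quixote, language="en")
edition = Manifestation(english, publisher="Example Press", isbn="ISBN-PLACEHOLDER")
copy = Item(edition, holder="Example Library")
```

Two library copies of different editions of the same translation would share the Expression and Work but have separate Manifestation and Item records.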
2.3.3 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
2.3.4 FRBR-LRM
FRBR-Library Reference Model (P.Riva, P.Le Bœuf, M.Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (By Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library.)
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
- XML Schema (XSD): most widely used but most unwieldy
- RelaxNG (RNG): new generation schema language
- RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
- Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
- https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
- https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
3.4 ARCHIVE METADATA
- EAD: Encoded Archival Description. Describes archival materials (documentary units)
- EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
- EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
- EAG: So-called "controlled access points" are text, and typically not controlled at all
- EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
- EAC: Related persons are names (strings), not links (things)
- EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
- EAC: Family tree is modeled as Outline, which is also used for other purposes (just presentation)
  <bioghist>
    <head>Chronological Events</head>
    <chronlist>
      <chronitem>
        <date normal="19781028">October 28, 1978</date>
        <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
      </chronitem>
      <chronitem>
        <date normal="19790315">March 15, 1979</date>
        <event>Departmental reorganization</event>
      </chronitem>
    </chronlist>
  </bioghist>
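To show why tagged names matter for "semanticizing", the sketch below pulls dates, event text and normalized person names out of a bioghist chronology using Python's standard xml.etree.ElementTree. The snippet mirrors the EADiva example; the chron_events helper and its output keys are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract one dict per chronitem: normalized date, text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # Join mixed content and collapse whitespace into single spaces.
            "text": " ".join("".join(event.itertext()).split()),
            # Only names wrapped in <persname> can be recovered reliably.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


events = chron_events(EAD_SNIPPET)
```

The second event illustrates the problem named above: when nothing is tagged, only a date and a string of text come out.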
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
- marc-must-die.info: MARC is dead (is it really?)
- FutureLib: in-depth discussion wiki
- Facebook group
- Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
- RDF is a simple, abstracted data model
- Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
- The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
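The "no nesting bias" point can be made concrete with a toy lifting function: whether a creator is nested inside the object record or kept as a separate record referenced by ID, the resulting triples are the same. The lift helper, the ex: identifiers and the property names are invented for illustration; real lifting would also distinguish IRIs from literals.

```python
import itertools


def lift(subject, data, counter=None):
    """Flatten a nested dict into (subject, property, value) triples."""
    if counter is None:
        counter = itertools.count(1)
    triples = []
    for prop, value in data.items():
        if prop == "id":
            continue  # the id names the node itself, it is not a property
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Nested record: use its id, or mint a blank-node label.
                node = item.get("id") or "_:b%d" % next(counter)
                triples.append((subject, prop, node))
                triples.extend(lift(node, item, counter))
            else:
                triples.append((subject, prop, item))
    return triples


# Variant 1: the creator is nested inside the painting record.
NESTED = {
    "title": "Laughing Cavalier",
    "creator": {"id": "ex:hals", "name": "Frans Hals"},
}
# Variant 2: the creator is a separate record, referenced by id.
REFERENCED = {"title": "Laughing Cavalier", "creator": "ex:hals"}
PERSON = {"name": "Frans Hals"}

nested_triples = lift("ex:painting", NESTED)
referenced_triples = lift("ex:painting", REFERENCED) + lift("ex:hals", PERSON)
```

Both variants "lift" to the same three statements, which is why merging data from differently structured sources gets easier once it is in RDF.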
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
- OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
- Dublin Core: descriptive metadata
- SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
- CIDOC-CRM inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
- Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
- Complication: splits info about an object
  - EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  - EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
- Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
  - Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
  - Europeana Data Quality Committee formed to push this strategic point (2015-2020)
- Evolving specification (since 2009)
  - Currently considering actual implementation of Events
  - Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
- Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
- Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
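The short-cut vs long-path construct can be sketched as a tiny inference: whenever the two-step measurement path exists, the direct "has dimension" statement can be derived from it. The three property names are the ones quoted above plus crm:P4_has_time-span; the ex: instance identifiers and the add_shortcuts helper are invented for illustration.

```python
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")
SHORTCUT = "crm:P43_has_dimension"

# Hypothetical instance data (the ex: identifiers are invented).
TRIPLES = {
    ("ex:painting", "crm:P39i_was_measured_by", "ex:measurement"),
    ("ex:measurement", "crm:P40_observed_dimension", "ex:height"),
    # Extra detail that only the long path can carry: when it was measured.
    ("ex:measurement", "crm:P4_has_time-span", "ex:measured_in_2015"),
}


def add_shortcuts(triples):
    """Derive the short-cut property wherever the long path is present."""
    first, second = LONG_PATH
    result = set(triples)
    for thing, p1, event in triples:
        if p1 != first:
            continue
        for subject, p2, dimension in triples:
            if subject == event and p2 == second:
                result.add((thing, SHORTCUT, dimension))
    return result


expanded = add_shortcuts(TRIPLES)
```

The derivation only works in this direction: from a bare short-cut there is no way to recover who measured the object or when.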
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry: info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo. TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
- FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
  d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
    dc:title "About time";
    d:titleURLized "about_time";
    fabio:hasSubtitle "Einstein's unfinished revolution";
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
      <http://covers.openlibrary.org/b/id/96715-M.jpg>,
      <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
      <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
    dc:language lexvo:eng;
    d:bibliofilID "931138";
    dc:format <http://data.deichman.no/format/Book>;
    d:location_signature "Dav";
    dc:publisher d_org:penguin;
    bibo:numPages "316";
    d:physicalDescription "fig.";
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
    fabio:isManifestationOf d_work:x24918900_about_time;
    d:signatureNote "07x0619gq";
    d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
    d:bsID "0181541";
    dc:description "Bibliografi: s. 293-294"@no;
    d:priceInfo "Nkr 170.00";
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
      <http://www.librarything.com/work/23493>;
    dc:identifier "749919";
    d:dewey "115", "530.11";
    d:location_dewey "530.11";
    bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
- E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Contexts (RiC): new upcoming semantic standard by ICA.
- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015), Mlist for comments
- Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on Twitter and Google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basis vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
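The effect of declaring relations as inverse pairs or symmetric properties can be sketched in a few lines. This is only an illustration of the inference, not how GraphDB or the GVP ontology implements it; the `ex:` property names are hypothetical and do not come from the Getty vocabularies.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Add the inverse statement for every triple whose property has a declared inverse.

    inverse_of maps a property to its inverse (applied in both directions);
    symmetric lists properties that are their own inverse.
    """
    lookup = dict(inverse_of)
    lookup.update({q: p for p, q in inverse_of.items()})
    lookup.update({p: p for p in symmetric})
    result = set(triples)
    for s, p, o in triples:
        if p in lookup:
            result.add((o, lookup[p], s))
    return result

# Hypothetical associative relations, for illustration only
triples = {
    ("ex:a", "ex:producedBy", "ex:b"),
    ("ex:c", "ex:relatedTo", "ex:d"),
}
closed = materialize_inverses(
    triples,
    inverse_of={"ex:producedBy": "ex:produces"},
    symmetric=("ex:relatedTo",),
)
```

With the inverses materialized, a query can follow a relation from either end without knowing in which direction it was originally recorded.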
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
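The first step (creating Person records for the people mentioned) might look roughly like the sketch below. The field names come from the sample record; the function name `derive_persons`, the output keys and the relation labels are hypothetical, and the matching step against the rest of the database is left out.

```python
def derive_persons(rec):
    """Create stub Person records for the relatives named in one source record."""
    persons = []
    if rec.get("nameSpouse"):
        spouse = {"name": rec["nameSpouse"], "relation": "spouseOf", "relatedTo": rec["id"]}
        # Likely inferences: the spouse shares the marriage date and has the stated maiden name
        if rec.get("nameSpouseMaiden"):
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        if rec.get("dateMarriage"):
            spouse["dateMarriage"] = rec["dateMarriage"]
        persons.append(spouse)
    for field, relation in (("nameChild", "childOf"), ("nameSibling", "siblingOf")):
        if rec.get(field):
            persons.append({"name": rec[field], "relation": relation, "relatedTo": rec["id"]})
    return persons

record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
```

Each stub can then be compared with existing Person records (by name, dates, places), and every confirmed match adds an edge to the person network.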
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews:
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
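"Semantic differencing" is plain vector arithmetic followed by a nearest-neighbour lookup by cosine similarity. The sketch below shows the mechanics on hand-made three-dimensional toy vectors; the words and numbers are invented for illustration and have nothing to do with the EHRI word2vec model.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors (dimensions: royalty, male, female); purely illustrative
TOY = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "child": [0.0, 0.5, 0.5],
}
result = analogy(TOY, "king", "man", "woman")
```

In a trained model the vectors have hundreds of dimensions and the vocabulary has many thousands of words, but the lookup works the same way.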
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. See Blog, Wiki.
233 FRSAD
Anything can be a subject (thema), referred to by various names/titles (nomen).
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer. Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe apply to your work? (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library)
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new-generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
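To show what a cross-field rule means, here is the same kind of check written in Python rather than Schematron: one field is validated against another, which a grammar-based schema cannot express. The element names (`dateRange`, `fromDate`, `toDate`) are hypothetical and not taken from EAD or any other schema mentioned here.

```python
import xml.etree.ElementTree as ET

def check_date_ranges(xml_text):
    """Report every dateRange whose fromDate is later than its toDate.

    Assumes ISO 8601 dates (YYYY-MM-DD), which compare correctly as strings.
    """
    root = ET.fromstring(xml_text)
    errors = []
    for i, rng in enumerate(root.iter("dateRange"), start=1):
        start, end = rng.findtext("fromDate"), rng.findtext("toDate")
        if start and end and start > end:
            errors.append(f"dateRange[{i}]: fromDate {start} is after toDate {end}")
    return errors

SAMPLE = """<record>
  <dateRange><fromDate>1921-01-05</fromDate><toDate>1945-05-08</toDate></dateRange>
  <dateRange><fromDate>1950-01-01</fromDate><toDate>1949-12-31</toDate></dateRange>
</record>"""
errors = check_date_ranges(SAMPLE)
```

A Schematron rule would state the same condition as an XPath assertion and report the failing element's location, which is what makes its error messages easy to act on.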
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements, inherited from CDWA.
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
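When person names inside a bioghist are tagged, dated events and the people involved can be pulled out mechanically. The sketch below uses only the Python standard library on a reconstruction of the example (the commas inside the `normal` attributes are assumed); the function name `chron_events` is hypothetical.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def chron_events(xml_text):
    """Return (normalized date, [normalized person names]) for each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        persons = [p.get("normal") for p in item.iter("persname")]
        events.append((date.get("normal"), persons))
    return events

events = chron_events(BIOGHIST)
```

The second event yields no persons because nothing in it is tagged. Untagged text of this kind is exactly what makes such descriptions hard to semanticize.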
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant (2002).
MARC is dead (is it really?). In-depth discussion: marc-must-die.info wiki, FutureLib Facebook group, presentation by Sally Chambers (ELAG 2011).
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSONLD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info
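The point about nesting biases can be made concrete: whether a record nests a sub-object or refers to it by ID, lifting yields the same set of triples. The sketch below is a toy lifter over Python dicts, not a real RDF library; the `@id` convention is borrowed from JSON-LD and the `ex:` identifiers are hypothetical.

```python
from itertools import count

def lift(record, subject, _ids=None):
    """Flatten a (possibly nested) dict into (subject, property, object) triples."""
    ids = count(1) if _ids is None else _ids
    triples = []
    for key, value in record.items():
        for v in (value if isinstance(value, list) else [value]):
            if isinstance(v, dict):
                # A nested object becomes a node of its own, named by @id or a blank-node label
                node = v.get("@id") or f"_:b{next(ids)}"
                triples.append((subject, key, node))
                nested = {k: x for k, x in v.items() if k != "@id"}
                triples.extend(lift(nested, node, ids))
            else:
                triples.append((subject, key, v))
    return triples

# The same information, once nested and once referenced by ID
nested = {"dc:title": "Portrait", "dc:creator": {"@id": "ex:artist1", "foaf:name": "A. Painter"}}
flat_object = {"dc:title": "Portrait", "dc:creator": "ex:artist1"}
flat_artist = {"foaf:name": "A. Painter"}

from_nested = set(lift(nested, "ex:painting1"))
from_flat = set(lift(flat_object, "ex:painting1")) | set(lift(flat_artist, "ex:artist1"))
```

Both inputs produce the same three triples, which is why data from differently structured sources can be merged after lifting. For brevity the sketch does not distinguish literals from IRIs, which a real RDF model does.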
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by/crm:P40_observed_dimension), which allows more details.
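The relationship between a long path and its short-cut can be sketched as a small inference: wherever the two-step measurement path exists, the direct property may be derived. The property names are the CRM ones quoted above; the function name and the `ex:` instance identifiers are hypothetical.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

def infer_shortcuts(triples):
    """Derive short-cut P43 statements from the P39i/P40 long path via a measurement node."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    inferred = set()
    for s, p, o in triples:
        if p == P39I:
            # o is the measurement event; attach each dimension it observed to the object
            for dimension in observed.get(o, []):
                inferred.add((s, P43, dimension))
    return inferred

triples = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", P40, "ex:width1"),
]
shortcuts = infer_shortcuts(triples)
```

The long path keeps a place for details about the measurement itself (who measured, and when), while the short-cut serves simple queries. Going the other way, from short-cut to long path, would require inventing a measurement node that carries no real information.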
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSONLD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSONLD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made: K. Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdf:type bibo:Document fabio:Manifestation
  dc:title About time
  dtitleURLized about_time
  fabio:hasSubtitle Einsteins unfinished revolution
  ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon
  foaf:depiction <httpcoversopenlibraryorgbid96714‐Mjpg> <httpcoversopenlibraryorgbid96715‐Mjpg> <httpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081>
  owl:sameAs <httppurlorgNETbookisbn0140174613book> <httpwww4wiwissfu‐berlindebookmashupbooks0140174613>
  dc:language lexvoeng
  dbibliofilID 931138
  dc:format <httpdatadeichmannoformatBook>
  dlocation_signature Dav
  dc:publisher d_orgpenguin
  bibo:numPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk
  fabio:isManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo <httpdatadeichmannobindingInfoh>
  dbsID 0181541
  dc:description Bibliografi s 293‐294no
  dpriceInfo Nkr 17000
  foaf:isPrimaryTopicOf <httpwwwgoodreadscombookshow286461> <httpwwwlibrarythingcomwork23493>
  dc:identifier 749919
  ddewey 115 53011
  dlocation_dewey 53011
  bibo:isbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected:
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
234 FRBR-LRM
FRBR-Library Reference Model (P. Riva, P. Le Bœuf, M. Žumer; Draft for World-Wide Review, 2016-02). Merges the previous standards.
3 GLAM METADATA SCHEMAS
How many of the standards listed in Seeing Standards: A Visualization of the Metadata Universe (by Jenn Riley, Associate Dean for Digital Initiatives at McGill University Library) apply to your work?
31 SEEING STANDARDS (2)
32 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used but most unwieldy
RelaxNG (RNG): new-generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch of the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
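To make "cross-field validation" concrete, here is a minimal sketch of the kind of rule Schematron is used for, written with Python's standard library rather than Schematron itself. The record and its element names (`object`, `date`, `earliest`, `latest`) are hypothetical and not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names are illustrative only.
RECORD = """
<object>
  <title>Portrait of a Man</title>
  <date><earliest>1630</earliest><latest>1625</latest></date>
</object>
"""

def check_date_range(xml_text):
    """Return error messages for dates whose earliest year is after the latest year."""
    root = ET.fromstring(xml_text)
    errors = []
    # ElementTree supports a limited XPath subset, enough to locate each date element
    for date in root.findall(".//date"):
        earliest = int(date.findtext("earliest"))
        latest = int(date.findtext("latest"))
        # The rule compares two sibling fields, which a grammar-based schema cannot express
        if earliest > latest:
            errors.append(f"date: earliest ({earliest}) is after latest ({latest})")
    return errors

if __name__ == "__main__":
    for message in check_date_range(RECORD):
        print(message)
```

A grammar-based schema can check that both fields are present and are numbers, but only a rule of this kind can relate one field's value to another's.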
33 MUSEUM METADATA CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements: inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context for Corporate Bodies, Persons, and Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
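When person names are tagged as in this example, turning the chronology into structured events is straightforward. The following is a small sketch using Python's standard library; the output dictionary keys (`date`, `text`, `persons`) are my own illustrative choice, not part of EAD.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged person names."""
    events = []
    for item in ET.fromstring(xml_text).iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Collapse whitespace left over from the mixed content
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that were explicitly tagged can be recovered reliably
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

if __name__ == "__main__":
    for e in extract_events(BIOGHIST):
        print(e)
```

The second event shows the limitation named above: with no tagged names, nothing but the date and free text can be extracted.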
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant (2002)
MARC is dead (is it really?)
marc-must-die.info: in-depth discussion wiki
FutureLib: Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
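As an illustration of "lifting", here is a minimal sketch that turns a small XML record into N-Triples lines. The namespace `http://example.org/` and the element names are hypothetical; a real conversion would map to an established ontology.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace, for illustration only

XML = """<artwork id="obj1">
  <title>Self-portrait</title>
  <creator ref="agent7"/>
</artwork>"""

def lift(xml_text):
    """Convert each child element of the record into one N-Triples statement."""
    root = ET.fromstring(xml_text)
    subject = f"<{BASE}{root.get('id')}>"
    triples = []
    for child in root:
        predicate = f"<{BASE}{child.tag}>"
        if child.get("ref"):
            # A reference by ID becomes a link to another resource
            obj = f"<{BASE}{child.get('ref')}>"
        else:
            # Plain text becomes a literal (escape backslashes and quotes)
            text = (child.text or "").replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{text}"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples

if __name__ == "__main__":
    print("\n".join(lift(XML)))
```

Whether the creator was nested in the XML or referenced by ID, the resulting graph would contain the same kind of statement, which is the sense in which RDF has no nesting bias.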
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC CRM inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle), Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
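The short-cut vs long-path construct can be sketched with plain tuples standing in for RDF triples. The `ex:` identifiers and the node-naming scheme are hypothetical; the property and class names are the CRM ones mentioned above plus crm:P90_has_value and crm:P91_has_unit.

```python
def long_path(thing, value, unit, n=1):
    """Describe a dimension through an explicit measurement activity."""
    measurement = f"{thing}/measurement/{n}"  # hypothetical node naming
    dimension = f"{thing}/dimension/{n}"
    return [
        (thing, "crm:P39i_was_measured_by", measurement),
        (measurement, "rdf:type", "crm:E16_Measurement"),
        (measurement, "crm:P40_observed_dimension", dimension),
        (dimension, "rdf:type", "crm:E54_Dimension"),
        (dimension, "crm:P90_has_value", value),
        (dimension, "crm:P91_has_unit", unit),
    ]

def short_cut(triples):
    """Derive the direct P43_has_dimension links implied by the long path."""
    measured = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = [(s, o) for s, p, o in triples if p == "crm:P40_observed_dimension"]
    return sorted(
        (thing, "crm:P43_has_dimension", dimension)
        for thing, measurement in measured
        for measurement2, dimension in observed
        if measurement == measurement2
    )

if __name__ == "__main__":
    path = long_path("ex:painting1", 65.5, "ex:cm")
    print(short_cut(path))
```

The long path leaves room to say who measured the object and when, by attaching more statements to the measurement node; the short-cut drops that detail.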
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml)
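For the "Image and region over it" case, an annotation might look roughly like the JSON-LD built below. This is a sketch following the W3C Web Annotation Data Model; the identifiers, URLs and comment text are hypothetical.

```python
import json

def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            # The region is expressed as a Media Fragment (x, y, width, height in pixels)
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

if __name__ == "__main__":
    anno = image_region_annotation(
        "http://example.org/anno/1",          # hypothetical annotation URL
        "http://example.org/images/page1.jpg",  # hypothetical image URL
        (100, 150, 300, 200),
        "Signature of the artist",
    )
    print(json.dumps(anno, indent=2))
```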
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
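The IIIF Image API addresses image regions through a fixed URL pattern, `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The helper below is a small sketch that assembles such URLs; the server base and identifier are hypothetical, and the defaults assume Image API 2.x syntax (`full` size).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL from its path parameters."""
    # The identifier must be URL-encoded, including any slashes it contains
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"

if __name__ == "__main__":
    # Top-left 1024x1024 pixel region, scaled to 512 pixels wide
    print(iiif_image_url("https://example.org/iiif", "ms 1/f1",
                         region="0,0,1024,1024", size="512,"))
```

Because zooming is just a matter of requesting different regions and sizes, any compliant viewer can work with any compliant server.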
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Contexts (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for works by painter across collections (catalogue raisonné), e.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
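To show what leveraging coreferencing means in practice, here is a minimal sketch that groups coreferenced identifiers into clusters (union-find) and pools their name forms. All identifiers below are made up, and the name forms are only examples of the kinds of variants each tradition contributes.

```python
def clusters(pairs):
    """Group identifiers connected by coreference pairs into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups fast
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

def pooled_names(pairs, names):
    """For each cluster, return the union of name forms of all its members."""
    return [set().union(*(names.get(i, set()) for i in group)) for group in clusters(pairs)]

if __name__ == "__main__":
    # Hypothetical identifiers: not real VIAF, Wikidata or GND IDs
    same_as = [("wd:Q_x", "viaf:111"), ("viaf:111", "gnd:222")]
    name_forms = {
        "viaf:111": {"Cranach, Lucas", "Cranach, Lucas, the Elder"},
        "wd:Q_x": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    }
    print(pooled_names(same_as, name_forms))
```

Since the two traditions share few name forms, pooling across a coreference cluster yields noticeably more variants than either source alone.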
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: librarians' dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
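The idea behind a Fundamental Relation such as "Thing from Place" can be sketched as a tiny inference over triples. This is not the actual GraphDB rule set: it covers only one path (production event, its place, and broader places), and the output property name `ex:FR_from_place` is hypothetical.

```python
def thing_from_place(triples):
    """Infer (thing, ex:FR_from_place, place) from production events and the place hierarchy."""
    by_pred = {}
    for s, p, o in triples:
        by_pred.setdefault(p, []).append((s, o))

    within = {}
    for narrower, broader in by_pred.get("crm:P89_falls_within", []):
        within.setdefault(narrower, set()).add(broader)

    def broader_places(place):
        # Transitive closure over P89_falls_within
        seen, stack = set(), [place]
        while stack:
            for parent in within.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    event_places = {}
    for event, place in by_pred.get("crm:P7_took_place_at", []):
        event_places.setdefault(event, set()).add(place)

    inferred = set()
    for thing, event in by_pred.get("crm:P108i_was_produced_by", []):
        for place in event_places.get(event, ()):
            for p in {place} | broader_places(place):
                inferred.add((thing, "ex:FR_from_place", p))
    return inferred

if __name__ == "__main__":
    data = [
        ("ex:drawing1", "crm:P108i_was_produced_by", "ex:production1"),
        ("ex:production1", "crm:P7_took_place_at", "ex:Amsterdam"),
        ("ex:Amsterdam", "crm:P89_falls_within", "ex:Netherlands"),
    ]
    print(sorted(thing_from_place(data)))
```

A search for things "from the Netherlands" then only needs the single inferred property instead of the whole CRM path.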
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
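Jittering can be as simple as giving each marker a small random offset around the shared coordinate. The sketch below is one possible approach, not the code used in the project; the radius, the seed and the approximate centre coordinates are assumptions chosen for illustration.

```python
import math
import random

def jitter(lat, lon, count, radius_km=50.0, seed=0):
    """Return `count` points scattered uniformly within `radius_km` of (lat, lon)."""
    rng = random.Random(seed)  # fixed seed keeps markers stable between page loads
    points = []
    for _ in range(count):
        # sqrt gives a uniform spread over the disc instead of clumping at the centre
        distance = radius_km * math.sqrt(rng.random())
        angle = rng.uniform(0, 2 * math.pi)
        dlat = distance * math.cos(angle) / 111.0  # roughly 111 km per degree of latitude
        dlon = distance * math.sin(angle) / (111.0 * math.cos(math.radians(lat)))
        points.append((lat + dlat, lon + dlon))
    return points

if __name__ == "__main__":
    # Approximate centre of Bulgaria (assumed coordinates)
    for point in jitter(42.7, 25.5, 5):
        print(point)
```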
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
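A rough sketch of table-driven ontology generation: each row of a spreadsheet (here a CSV export) names a relation and its inverse, and a script emits the property declarations. The relation codes, names and the `ex:` prefix are hypothetical, not the actual GVP relation list or generation script.

```python
import csv
import io

# Hypothetical relation table: a row whose inverse is its own code is self-inverse
TABLE = """code,name,inverse
R1,related_to,R1
R2,teacher_of,R3
R3,student_of,R2
"""

def generate_properties(table_text, prefix="ex"):
    """Emit one Turtle statement per relation, pairing inverses and marking symmetric ones."""
    rows = list(csv.DictReader(io.StringIO(table_text)))
    names = {row["code"]: f"{prefix}:{row['code']}_{row['name']}" for row in rows}
    lines = []
    for row in rows:
        prop = names[row["code"]]
        if row["inverse"] == row["code"]:
            lines.append(f"{prop} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            inverse = names[row["inverse"]]
            lines.append(f"{prop} a owl:ObjectProperty ; owl:inverseOf {inverse} .")
    return lines

if __name__ == "__main__":
    print("\n".join(generate_properties(TABLE)))
```

Keeping the relation list in a table means domain experts can review and extend it without touching the ontology files directly.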
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc)
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
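The idea can be sketched as follows: derive one Person record per name mentioned, attach the relation to the main person, and add inferences that are likely but not certain. The output field names and the specific inference (spouse's gender) are my assumptions for illustration, not the EHRI implementation.

```python
RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def derive_persons(rec):
    """Create Person records for the main person and each relative mentioned."""
    main = f"{rec['firstName']} {rec['lastName']}"
    persons = [{"name": main, "gender": rec.get("gender"), "source": rec["id"]}]
    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "spouseOf": main, "source": rec["id"]}
        # Likely inference (an assumption, to be confirmed by matching): opposite gender
        if rec.get("gender") in ("Male", "Female"):
            spouse["gender"] = "Female" if rec["gender"] == "Male" else "Male"
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        if "dateMarriage" in rec:
            spouse["dateMarriage"] = rec["dateMarriage"]
        persons.append(spouse)
    for field, relation in (("nameChild", "childOf"), ("nameSibling", "siblingOf")):
        if field in rec:
            persons.append({"name": rec[field], relation: main, "source": rec["id"]})
    return persons

if __name__ == "__main__":
    for person in derive_persons(RECORD):
        print(person)
```

Each derived record keeps the source record id, so a later match against the database can be traced back to the evidence it came from.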
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
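Semantic differencing is vector arithmetic followed by a nearest-neighbour lookup by cosine similarity. The sketch below uses made-up 3-dimensional toy vectors purely to show the mechanics; real word2vec vectors have hundreds of dimensions and are learned from the corpus.

```python
import math

# Toy vectors (hypothetical): axes loosely mean [security organ, Soviet, Nazi]
VECTORS = {
    "KGB": (1.0, 1.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0),
    "Hitler": (0.0, 0.0, 1.0),
    "SS": (1.0, 0.0, 1.0),
    "guard": (1.0, 0.0, 0.0),
    "Gulag": (0.2, 1.0, 0.0),
}

def cosine(u, v):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

if __name__ == "__main__":
    print(analogy(VECTORS, "KGB", "Stalin", "Hitler"))
```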
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
3 GLAM METADATA SCHEMASHow many of the standards listed in
apply to your work (by Jenn Riley Associate Dean for Digital Initiatives atMcGill University Library)
Seeing Standards A Visualization of the MetadataUniverse
31 SEEING STANDARDS (2)
32 XML SCHEMASDo you deal with XML I bet you do
XML Schema (XSD) most widely used but most unwieldyRelaxNG (RNG) new generation schema languageRNG Compact (RNC) non-XML notation most readable Eg EAD3 is mastered inRNC then RNG and XSD producedSchematron express rules in XPath that cant be captured in XSDRNGRNC (egcross- eld validation)
Tools
patch the jing RNG validator toemit errors like Schematron (SVRL with XPath error location)
RNC tools and CH schemas in RNC Emacswith code highlighting and syntax checking ( ycheck)
httpsgithubcomEHRIjing-trangtreeEHRI-176
httpsgithubcomVladimirAlexievrnc
33 MUSEUM METADATA CDWACategories for the Description of Works of Art (CDWA) realization of CCO 532categories (data elements)
331 CDWA LITE
XML schema implementing part of CDWA Moderate complexity about 300 elementsDisplay vs Indexing (structured) elements eg for Dimension
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate the new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant (2002).
MARC is dead (is it really?):
marc-must-die.info: in-depth discussion wiki
FutureLib Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and conversion back to some other format "lowering"?
RDF is a simple, abstracted data model.
It doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID).
It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info.
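The "nesting bias" point can be illustrated with a small Python sketch. The two XML snippets and the element names (`object`, `creator`, `person`) are made up for this illustration and do not come from any GLAM schema; triples are modeled as plain tuples rather than with an RDF library. Both serializations "lift" to the same set of triples.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: the person is nested inside the object.
NESTED = """
<object id="obj1">
  <title>Portrait</title>
  <creator><person id="p1"><name>Frans Hals</name></person></creator>
</object>
"""

# Hypothetical XML: the same person is referenced by ID.
REFERENCED = """
<data>
  <object id="obj1"><title>Portrait</title><creator ref="p1"/></object>
  <person id="p1"><name>Frans Hals</name></person>
</data>
"""


def lift(xml_text):
    """Lift either serialization to a set of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = set()
    for obj in root.iter("object"):
        triples.add((obj.get("id"), "title", obj.findtext("title")))
        creator = obj.find("creator")
        nested_person = creator.find("person")
        # Nested or referenced: in the graph it is the same edge.
        person_id = nested_person.get("id") if nested_person is not None else creator.get("ref")
        triples.add((obj.get("id"), "creator", person_id))
    for person in root.iter("person"):
        triples.add((person.get("id"), "name", person.findtext("name")))
    return triples


same_graph = lift(NESTED) == lift(REFERENCED)
```

The design choice that disappears in RDF is exactly the one an XML schema author has to make up front: nest the person, or point to it.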
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. it can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
  EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
The Europeana Data Quality Committee was formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
  Currently considering actual implementation of Events
  Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model, used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving.
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type.
135 props (plus their inverses), prop hierarchy (see the - - - lines at the bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
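The short-cut vs long-path construct can be sketched in Python as a rewrite over triples. This is only an illustration under stated assumptions: triples are plain tuples, the `ex:` names and blank-node labels are invented, and the intermediate node is typed crm:E16_Measurement (the CRM domain of P40_observed_dimension). A real conversion would use an RDF library or SPARQL Update.

```python
from itertools import count

SHORTCUT = "crm:P43_has_dimension"
LONG_PATH = ("crm:P39i_was_measured_by", "crm:P40_observed_dimension")


def expand_shortcut(triples):
    """Replace each P43 short-cut triple by the long path through a new Measurement node."""
    ids = count(1)
    expanded = []
    for s, p, o in triples:
        if p == SHORTCUT:
            # The new node is where extra details (who measured, when) can be attached.
            measurement = f"_:measurement{next(ids)}"
            expanded.append((s, LONG_PATH[0], measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, LONG_PATH[1], o))
        else:
            expanded.append((s, p, o))
    return expanded


# Hypothetical data: a painting with one dimension and a type.
data = [
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
]
long_form = expand_shortcut(data)
```

The long form costs two extra triples per dimension, which is the price of being able to say more about the measurement itself.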
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary.
Diagram of the Complete Example from the spec (made using my rdfpuml).
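As a rough sketch of the "Image and region over it" case, the Python function below builds a Web Annotation as a JSON-LD dictionary, using the `FragmentSelector` with a media-fragment `xywh` value from the W3C Web Annotation Data Model. The function name and all URLs are placeholders invented for this example.

```python
import json


def region_annotation(anno_id, image_url, x, y, w, h, comment):
    """Build a Web Annotation (JSON-LD) that attaches a comment to a rectangle of an image."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Placeholder identifiers, for illustration only.
anno = region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting1.jpg",
    10, 20, 300, 150,
    "Signature of the artist",
)
serialized = json.dumps(anno, indent=2)
```

The other cases on the slide (bookmark, translation, commentary) follow the same body/target pattern with different selectors or none at all.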
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
http://iiif.io: Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
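The core of the IIIF Image API is a URL pattern: `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The Python helper below is a small sketch of that pattern; the function name, the server base URL and the identifier are assumptions for illustration, and which parameter values a given server accepts depends on its API version and compliance level.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL from its path segments."""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 512x512 pixel detail, scaled to 256 pixels wide.
detail_url = iiif_image_url(
    "https://example.org/iiif", "page 1", region="0,0,512,512", size="256,"
)
```

Because the region and size travel in the URL, a deep-zoom viewer can request just the tiles it needs without any server-specific API.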
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to the JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, include new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDARegistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA). The Registry info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds its own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds its own props. Enables a number of agile apps, e.g. search, related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
There have been 3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Contexts (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): documents the key components of archival description, the properties of each, and the relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
  The best datasets to use for name enrichment are VIAF and Wikidata.
  There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
  VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
  VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
  Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
  A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
  Wikidata has great tools for crowd-sourced coreferencing.
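Why does coreferencing across datasets pay off? Links compose: if a Wikidata item is linked to a GND record, and that GND record is already in a VIAF cluster, the Wikidata item joins the cluster too. The Python sketch below shows this with a small union-find over link pairs. All identifiers are invented placeholders, and this is only an illustration of the idea, not how VIAF or Wikidata compute their clusters.

```python
def coreference_clusters(links):
    """Group identifiers into clusters, treating each link as 'same entity' (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    # Sort for a stable, readable result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Placeholder identifiers: one Wikidata item reaches VIAF only through GND.
links = [
    ("wikidata:Q1", "gnd:200"),
    ("viaf:100", "gnd:200"),
    ("wikidata:Q2", "rkd:300"),
]
clusters = coreference_clusters(links)
```

In this toy data, `wikidata:Q1` and `viaf:100` end up together without any direct link between them, while `wikidata:Q2` stays outside VIAF because RKDartists is not a VIAF constituent.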
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertoires
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
Watch the video (D. Oldman). (Not Ontotext work)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add GeoNames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on GeoNames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them.
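Jittering can be as simple as adding a small random offset to each coordinate. The Python sketch below is one possible version, not the code used in EFD: the function name, the uniform distribution, the 0.5 degree radius and the approximate center of Bulgaria are all assumptions made for illustration.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most radius_deg on each axis."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


# Approximate center of Bulgaria (assumed value); a seeded generator makes the demo repeatable.
CENTER = (42.7, 25.5)
rng = random.Random(42)
flags = [jitter(*CENTER, radius_deg=0.5, rng=rng) for _ in range(100)]
```

A uniform square offset is the crudest choice. Scaling the radius to the size of the place, or keeping points inside the country outline, would be natural refinements.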
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
  Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
  Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
  Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover that the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of the LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to the Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk/support on twitter and google group (see home page)
Presentations, papers:
  On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:
Prefix  Ontology                                 Used for
bibo    Bibliography Ontology                    Sources
dc      Dublin Core Elements                     common
dct     Dublin Core Terms                        common
foaf    Friend of a Friend ontology              Contributors
iso     ISO 25964 (latest on thesauri)           iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                    Basic RDF representation
prov    Provenance Ontology                      Revision history
rdf     Resource Description Framework           Basic RDF representation
rdfs    RDF Schema                               Basic RDF representation
schema  Schema.org                               common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System     Basic vocabulary representation
skosxl  SKOS Extension for Labels                Rich labels
wgs     W3C World Geodetic System (WGS84) geo    Geo (TGN)
xsd     XML Schema Datatypes                     Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key val can be mapped to:
  Custom sub-class
  Custom (sub-)prop
  Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse).
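What declaring inverse pairs buys you is that every asserted relation can be read from both ends. The Python sketch below materializes the inverse triples from a lookup table. The `ex:` property names are invented for the example (they are not the actual GVP relation URIs), and in a real triplestore this would be handled by OWL inference rather than application code.

```python
# Hypothetical relation pairs; a symmetric property is listed as its own inverse.
INVERSES = {
    "ex:teacher_of": "ex:student_of",
    "ex:student_of": "ex:teacher_of",
    "ex:sibling_of": "ex:sibling_of",
}


def add_inverses(triples):
    """Return the triples plus the inverse of every triple whose property has a declared inverse."""
    result = set(triples)
    for s, p, o in triples:
        if p in INVERSES:
            result.add((o, INVERSES[p], s))
    return result


asserted = {
    ("ex:artistA", "ex:teacher_of", "ex:artistB"),
    ("ex:artistB", "ex:sibling_of", "ex:artistC"),
}
closed = add_inverses(asserted)
```

Generating the table from a spreadsheet, as the slide suggests, keeps the pairs in one place where domain experts can review them.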
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
Europeana uses AAT to enrich type/subject/material fields.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced the chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database.
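The idea can be sketched in a few lines of Python. This is a hypothetical illustration, not EHRI code: the record is the example above typed in as a dictionary, the derived identifiers (`123456-1` etc.) are invented, and the only "likely inferences" made here are that the maiden name and the marriage date also apply to the spouse.

```python
# The flat record from the slide, as a dictionary.
RECORD = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Which name field implies which relation to the main person (assumed mapping).
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def derive_persons(record):
    """Return (persons, relations) derived from one flat record."""
    main_id = record["id"]
    persons = [{
        "id": main_id,
        "name": f'{record["firstName"]} {record["lastName"]}',
        "gender": record.get("gender"),
    }]
    relations = []
    for n, (field, relation) in enumerate(RELATION_FIELDS.items(), start=1):
        name = record.get(field)
        if not name:
            continue
        person = {"id": f"{main_id}-{n}", "name": name}
        if relation == "spouse":
            # Likely inferences, not facts: maiden name and marriage date carry over.
            if record.get("nameSpouseMaiden"):
                person["maidenName"] = record["nameSpouseMaiden"]
            if record.get("dateMarriage"):
                person["dateMarriage"] = record["dateMarriage"]
        persons.append(person)
        relations.append((main_id, relation, person["id"]))
    return persons, relations


persons, relations = derive_persons(RECORD)
```

The derived Person records are thin, but each carries a name plus at least one relation, which gives a matching step more to work with than a bare name string.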
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to GeoNames, also achieving deduplication. A GeoNames matching pipeline for free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews:
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
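Both the neighbor lists and the "semantic differencing" come down to cosine comparisons between word vectors. The "Cos dist" columns appear to be cosine similarities (higher means closer). The Python sketch below shows the arithmetic on hand-made 3-dimensional toy vectors, which are pure assumptions chosen so that the analogy works; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors with made-up axes: (Soviet context, German context, is an organization).
TOY = {
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "KGB": [1.0, 0.0, 1.0],
    "SS": [0.0, 1.0, 1.0],
    "guard": [0.2, 0.2, 0.5],
}
answer = analogy(TOY, "KGB", "Stalin", "Hitler")
```

Subtracting "Stalin" removes the Soviet component and adding "Hitler" puts in the German one, so what is left points at the organization that sits in the German context.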
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to GeoNames, so we can get coordinates.
6.11 OTHERS' PROJECTS: WIKI ART HISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals. 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
3.1 SEEING STANDARDS (2)
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch of the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
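To show what "cross-field validation" means, here is a Python sketch of the kind of rule that grammar-based schemas cannot express but Schematron can: a start date must not be later than its end date. The element and attribute names are modeled on EAD3 date ranges but should be treated as assumptions, the sample data is invented, and a real setup would state the rule as an XPath assertion in Schematron rather than in Python.

```python
import xml.etree.ElementTree as ET

# Invented sample: the second range is reversed.
SAMPLE = """
<dates>
  <daterange>
    <fromdate standarddate="1939-09-01"/>
    <todate standarddate="1945-05-08"/>
  </daterange>
  <daterange>
    <fromdate standarddate="1950-01-01"/>
    <todate standarddate="1949-12-31"/>
  </daterange>
</dates>
"""


def check_date_ranges(xml_text):
    """Return one message per <daterange> whose start date is after its end date."""
    errors = []
    root = ET.fromstring(xml_text)
    for i, date_range in enumerate(root.iter("daterange"), start=1):
        start = date_range.find("fromdate").get("standarddate")
        end = date_range.find("todate").get("standarddate")
        # ISO 8601 dates in the same format compare correctly as strings.
        if start > end:
            errors.append(f"daterange[{i}]: fromdate {start} is after todate {end}")
    return errors


errors = check_date_ranges(SAMPLE)
```

Each message carries a position, in the same spirit as the patched jing validator mentioned above, which reports the XPath location of every error.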
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements.
Display vs Indexing (structured) elements, e.g. for Dimension.
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
4.6 ARCHIVAL ONTOLOGIES
There have been 3 attempts to represent EAD as RDF, but IMHO none of them is very good.
E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mailing list for comments.
Conceptual Model 10 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: to follow after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers are available as LOD; some are interconnected:
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Data used for Wikidata Project Sum of All Paintings.
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries and 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists and works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAPDRGN project)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: "Drawings by Rembrandt that are about Mammals".
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, we developed a mapping of BM data (2M objects) together with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
EFD Semantic App
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
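Jittering can be sketched in a few lines of Python. This is only an illustration of the idea, not the EFD implementation: the centre coordinates, the 50 km radius and the function name `jitter` are assumptions made for the example.

```python
import math
import random

def jitter(lat, lon, radius_km, rng):
    """Return a point displaced at random within radius_km of (lat, lon)."""
    # sqrt() spreads points evenly over the disc instead of clumping them at the centre
    distance_km = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # About 111 km per degree of latitude; a degree of longitude shrinks with cos(latitude)
    dlat = distance_km * math.cos(bearing) / 111.0
    dlon = distance_km * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

rng = random.Random(42)              # fixed seed so the map looks the same on every render
BULGARIA_CENTER = (42.75, 25.5)      # approximate centre, assumed for illustration
markers = [jitter(*BULGARIA_CENTER, radius_km=50.0, rng=rng) for _ in range(5)]
```

A fixed seed keeps markers stable between page loads; a per-object seed (e.g. derived from the object id) would work as well.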
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM (GVP Training Materials). Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk / support on twitter and google group (see home page).
Presentations, papers.
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty if self-inverse).
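The table-driven generation of relation axioms can be sketched as follows. This is a hypothetical illustration in Python, not the actual GVP tooling: the rows, the `ex:` prefix and the property names are invented for the example, and the real spreadsheet has more columns.

```python
# Hypothetical rows, as they might be read from a spreadsheet of associative relations:
# (relation name, name of its inverse)
RELATIONS = [
    ("teacher_of", "student_of"),
    ("sibling_of", "sibling_of"),   # self-inverse
]

def relation_axioms(rows, prefix="ex:"):
    """Emit one Turtle statement per property: inverse pairs or symmetric properties."""
    lines = []
    for name, inverse in rows:
        if name == inverse:
            # A self-inverse relation is declared symmetric instead of paired
            lines.append(f"{prefix}{name} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prefix}{name} a owl:ObjectProperty ; owl:inverseOf {prefix}{inverse} .")
            lines.append(f"{prefix}{inverse} a owl:ObjectProperty ; owl:inverseOf {prefix}{name} .")
    return lines

axioms = relation_axioms(RELATIONS)
print("\n".join(axioms))
```

Keeping the relations in a table means domain experts can review them without reading OWL, and the ontology is regenerated whenever the table changes.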
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
Europeana uses AAT to enrich type/subject/material fields.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
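A minimal sketch of that idea in Python, using the sample record above. The output layout (`key`, `name`, relation triples) and the function name `derive_persons` are assumptions for illustration; the inference about the spouse is deliberately simple and would need review against real data.

```python
record = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Field in the source record -> relation of the mentioned person to the main person
MENTIONS = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))

def derive_persons(rec):
    """Create a Person for the record itself and one for each person it mentions."""
    main = {"key": rec["id"],
            "name": f'{rec["firstName"]} {rec["lastName"]}',
            "gender": rec["gender"]}
    persons, relations = [main], []
    for field, relation in MENTIONS:
        if field not in rec:
            continue
        person = {"key": f'{rec["id"]}/{relation}', "name": rec[field]}
        if relation == "spouse":
            # Likely inferences only: maiden name, and gender opposite to the main person
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("gender") == "Male":
                person["gender"] = "Female"
        persons.append(person)
        relations.append((main["key"], relation, person["key"]))
    return persons, relations

persons, relations = derive_persons(record)
```

The derived records can then be fed to whatever matching step is used for ordinary Person records, e.g. comparing `name` and `maidenName` against existing entries.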
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
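How such neighbour lists and "semantic differencing" work can be shown with a toy example in Python. The 3-dimensional vectors below are made up so that the arithmetic is easy to follow; they are not EHRI data, and real word2vec vectors have hundreds of dimensions. The score computed here is cosine similarity (higher means closer), which is what the "Cos dist" columns appear to list.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, invented for illustration only
VECTORS = {
    "KGB":    [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 1.0],
    "guard":  [0.2, 0.2, 0.9],
}

def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("KGB", "Stalin", "Hitler", VECTORS)
```

With trained embeddings the same function ranks the whole vocabulary, and sorting all candidates by `cosine` yields neighbour lists like the ones for "guard" and "punishment".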
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames so we can get coordinates.
6.11 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverage the Getty thesauri and VIAF, import data as needed. Blog, Wiki.
3.2 XML SCHEMAS
Do you deal with XML? I bet you do.
XML Schema (XSD): most widely used, but most unwieldy
RelaxNG (RNG): new generation schema language
RNG Compact (RNC): non-XML notation, most readable. E.g. EAD3 is mastered in RNC, then RNG and XSD are produced
Schematron: express rules in XPath that can't be captured in XSD/RNG/RNC (e.g. cross-field validation)
Tools:
https://github.com/EHRI/jing-trang/tree/EHRI-176: patch the jing RNG validator to emit errors like Schematron (SVRL with XPath error location)
https://github.com/VladimirAlexiev/rnc: RNC tools and CH schemas in RNC. Emacs with code highlighting and syntax checking (flycheck)
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO. 532 categories (data elements).
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension.
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements.
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this.
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
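To see what can (and cannot) be recovered from such a chronology, here is a small Python sketch using only the standard library. It parses the EADiva example above; the output layout and the function name `extract_events` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

EAD_SNIPPET = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, event text, tagged person names."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # itertext() flattens mixed content; split/join normalizes whitespace
            "text": " ".join("".join(event.itertext()).split()),
            # Only names that the archivist tagged with <persname> are recoverable
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = extract_events(EAD_SNIPPET)
```

Note that the persons come out as name strings: nothing in the markup links "Wossname, Samuel" to an authority record, which is exactly the "strings, not things" problem.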
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate the new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model.
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info.
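A toy illustration of "lifting" in Python (standard library only): a nested record is flattened into triples, and the nesting choice disappears. The base URL `http://example.org/`, the vocabulary namespace and the record layout are assumptions made for the example, not part of any GLAM schema.

```python
def lift(record, base="http://example.org/", vocab="http://example.org/vocab/"):
    """Turn a nested dict into a flat list of (subject, predicate, object) triples."""
    subject = f"<{base}{record['id']}>"
    triples = []
    for key, value in record.items():
        if key == "id":
            continue
        predicate = f"<{vocab}{key}>"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # A nested object becomes its own node, referenced by IRI
                triples.append((subject, predicate, f"<{base}{item['id']}>"))
                triples.extend(lift(item, base, vocab))
            else:
                escaped = str(item).replace("\\", "\\\\").replace('"', '\\"')
                triples.append((subject, predicate, f'"{escaped}"'))
    return triples

def to_ntriples(triples):
    """Serialize triples one per line, N-Triples style."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

painting = {
    "id": "painting/1",
    "title": "Portrait",
    "creator": {"id": "person/1", "name": "Frans Hals"},
}
triples = lift(painting)
print(to_ntriples(triples))
```

Whether the creator was nested inside the painting or stored separately and referenced by id, the resulting triples are the same, which is the "no nesting bias" point above.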
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM-inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009).
Currently considering actual implementation of Events.
Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
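The short-cut vs long-path construct can be illustrated with a small Python sketch that expands every crm:P43_has_dimension statement into the longer path through a measurement event. The example resources (`ex:painting1`, `ex:height1`) and the blank-node naming are assumptions for illustration.

```python
import itertools

def expand_has_dimension(triples):
    """Replace each P43 short-cut with the P39i / P40 long path via an E16_Measurement."""
    counter = itertools.count(1)
    expanded = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            measurement = f"_:measurement{next(counter)}"
            expanded.append((s, "crm:P39i_was_measured_by", measurement))
            expanded.append((measurement, "rdf:type", "crm:E16_Measurement"))
            expanded.append((measurement, "crm:P40_observed_dimension", o))
        else:
            expanded.append((s, p, o))
    return expanded

shortcut = [
    ("ex:painting1", "rdf:type", "crm:E22_Man-Made_Object"),
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
]
long_path = expand_has_dimension(shortcut)
```

The intermediate measurement node is what "allows more details": it can carry who measured the object and when, which the short-cut has no place for.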
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint, one of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) to CIDOC CRM, together with BM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and its ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract an FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them
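Jittering can be sketched in a few lines. This is only an illustration, not the EFD implementation: the centroid coordinates, the 0.5-degree offset, and the function names are all assumptions made for the example.

```python
import random

# Approximate, hypothetical centroid shared by every object tagged "Bulgaria"
BULGARIA_CENTER = (42.75, 25.5)


def jitter(lat, lon, max_offset_deg, rng):
    """Displace a point by a uniform random offset on each axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


def jitter_all(n, center=BULGARIA_CENTER, max_offset_deg=0.5, seed=42):
    """Return n jittered copies of the same centroid (seeded, so repeatable)."""
    rng = random.Random(seed)
    return [jitter(center[0], center[1], max_offset_deg, rng) for _ in range(n)]


points = jitter_all(5)
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed keeps each flag in the same place between page loads; a uniform offset is the simplest choice, and a real map might prefer to keep the jitter inside the country outline.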
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class; a custom (sub-)prop; an ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
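What inverse and symmetric declarations buy you can be sketched as follows. This is a toy illustration over plain tuples, not the GVP pipeline (where such inference would be done by the triple store); the property names `ex:teacherOf`, `ex:studentOf` and `ex:relatedTo` are hypothetical.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus the triples implied by
    owl:inverseOf pairs and owl:SymmetricProperty declarations."""
    # owl:inverseOf holds in both directions, so index the pairs both ways
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))   # inverse property, swapped ends
        if p in symmetric:
            result.add((o, p, s))        # same property, swapped ends
    return result


# Hypothetical data: one inverse pair and one symmetric property
triples = {
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:C", "ex:relatedTo", "ex:D"),
}
closed = materialize_inverses(
    triples,
    inverse_of={"ex:teacherOf": "ex:studentOf"},
    symmetric={"ex:relatedTo"},
)
for t in sorted(closed):
    print(t)
```

Asserting a relation in one direction is then enough: a query in either direction finds it.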
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
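The idea can be sketched as below. This is a minimal illustration, not the EHRI code: the `Person` class, the relation names, and the single inference shown (the spouse's birth name is her given name plus the maiden surname) are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    # (relation, name of the other person)
    relations: List[Tuple[str, str]] = field(default_factory=list)


# Record fields that mention another person, and the role that person plays
ROLES = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def persons_from_record(rec: dict) -> List[Person]:
    """Create one Person for the record subject and one per person mentioned."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [main]
    for key, role in ROLES.items():
        if key not in rec:
            continue
        other = Person(rec[key])
        other.relations.append((role + "Of", main.name))
        main.relations.append(("has" + role.capitalize(), other.name))
        # Likely inference (assumption): spouse's birth name uses the maiden surname
        if role == "spouse" and "nameSpouseMaiden" in rec:
            given = rec[key].split()[0]
            other.alt_names.append(f"{given} {rec['nameSpouseMaiden']}")
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
for p in people:
    print(p)
```

The extra name forms (here "Maria Matienzo") give the later matching step more to compare against other Person records.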
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
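The arithmetic behind such word-vector analogies can be sketched in plain Python. The 3-dimensional vectors below are invented toy values chosen so the example works; they are not the INRIA model, whose real vectors have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Toy vectors (assumption): axes loosely mean [security force, Soviet, German]
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "sentry": [0.6, 0.1, 0.1],
    "gate":   [0.2, 0.3, 0.3],
}

print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

Excluding the three input words from the candidates is the usual convention, since the target vector otherwise tends to land closest to one of its own inputs.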
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing them to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
3.3 MUSEUM METADATA: CDWA
Categories for the Description of Works of Art (CDWA): realization of CCO, 532 categories (data elements)
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements. Display vs Indexing (structured) elements, e.g. for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
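To see what a consumer can actually recover from such a chronology, here is a small sketch using Python's standard `xml.etree.ElementTree`. The embedded snippet is a shortened, namespace-free version of the example (real EAD files may declare a namespace), and the `chron_events` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, namespace-free copy of the bioghist example (assumption)
EAD_SNIPPET = """
<bioghist>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""


def chron_events(xml_text):
    """Extract date, flattened text, and tagged person names per chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Collapse mixed content into one whitespace-normalized string
            "text": " ".join("".join(event.itertext()).split()),
            # Persons come out as normalized name strings, not identifiers
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events


for e in chron_events(EAD_SNIPPET):
    print(e)
```

The date is machine-readable, but the persons are only name strings and the relation between them ("succeeds") stays buried in free text, which is the linking problem described above.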
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID. Has fewer syntactic idiosyncrasies
(RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM inspired: events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
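The relation between a long path and its short-cut can be sketched over plain triples. The property names come from the slide; the `ex:` resources and the `infer_shortcut` helper are hypothetical, and a real system would typically express this as an inference rule in the triple store.

```python
def infer_shortcut(triples, path, shortcut):
    """Derive (s, shortcut, o) wherever s --path[0]--> mid --path[1]--> o."""
    first, second = path
    # Index the second hop: intermediate node -> list of end nodes
    second_hop = {}
    for s, p, o in triples:
        if p == second:
            second_hop.setdefault(s, []).append(o)
    inferred = set()
    for s, p, mid in triples:
        if p == first:
            for o in second_hop.get(mid, []):
                inferred.add((s, shortcut, o))
    return inferred


# Long path: the intermediate Measurement node can carry more details
triples = [
    ("ex:painting", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height"),
    ("ex:measurement1", "ex:carriedOutBy", "ex:curator"),  # hypothetical extra detail
]
short = infer_shortcut(
    triples,
    path=("crm:P39i_was_measured_by", "crm:P40_observed_dimension"),
    shortcut="crm:P43_has_dimension",
)
print(short)
```

The short-cut is easier to query, while the long path keeps the measurement event, so who measured and when can still be recorded.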
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers: http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows you to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keywordimaginary, d_keyworddilation, d_keywordtime, d_keywordtidsreiser, d_keywordtidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_orgpenguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subjecteinstein_albert, d_subjecttid_metafysikk ;
  fabio:isManifestationOf d_workx24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs; Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
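The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Person` class, the `derive_persons` and `match_key` helpers and the dict layout are hypothetical, and the "likely inferences" shown (attaching the maiden name and marriage date to the spouse) are just examples, not EHRI's actual rules.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    relation: str        # relation to the main person of the source record
    source_rec: int      # id of the record the person was derived from
    inferred: dict = field(default_factory=dict)

# The sample record from the slide, as a plain dict (field names as given there).
rec = {
    "id": 123456,
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def derive_persons(rec):
    """Create one Person per individual mentioned in the record."""
    persons = [Person(f"{rec['firstName']} {rec['lastName']}", "self",
                      rec["id"], {"gender": rec["gender"]})]
    if "nameSpouse" in rec:
        inferred = {}
        if "dateMarriage" in rec:
            inferred["dateMarriage"] = rec["dateMarriage"]
        if "nameSpouseMaiden" in rec:
            inferred["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(Person(rec["nameSpouse"], "spouse", rec["id"], inferred))
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            persons.append(Person(rec[key], relation, rec["id"]))
    return persons

def match_key(person):
    """Crude blocking key for candidate matching: lower-cased last name token."""
    return person.name.split()[-1].lower()

persons = derive_persons(rec)
```

Real matching would compare candidates that share a blocking key on more evidence (dates, places, relatives); the key only narrows down the search.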
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
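For readers unfamiliar with how such neighbour lists and "semantic differencing" are computed, here is a minimal pure-Python sketch. The two-dimensional toy vectors are invented for illustration (real word2vec vectors have hundreds of dimensions and are learned from the interview text); only the arithmetic is the point.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# Toy vectors (assumption): axis 0 ~ "which regime", axis 1 ~ "leader vs organization".
toy = {
    "stalin": [1.0, 1.0],
    "hitler": [-1.0, 1.0],
    "kgb":    [1.0, -1.0],
    "ss":     [-1.0, -1.0],
    "gate":   [0.1, 0.2],
}

result = analogy(toy, "kgb", "stalin", "hitler")
```

With these toy vectors the target is exactly the vector of "ss", which mirrors the KGB - Stalin + Hitler example on the slide.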
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames so we can get coordinates
6.11 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia
Through VIAF id: time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
3.3.1 CDWA LITE
XML schema implementing part of CDWA. Moderate complexity: about 300 elements
Display vs Indexing (structured) elements, e.g. for Dimension
3.3.2 CONA SCHEMA
Cultural Objects Name Authority (CONA): Getty museum data aggregation. Moderate complexity: about 280 elements
3.3.3 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R. Stein and A. Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called controlled access points are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's <bioghist> element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
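To show why tagging matters for semanticizing, here is a small sketch that pulls dated events and tagged persons out of such a fragment using Python's standard `xml.etree.ElementTree`. The `extract_events` helper and the output dict layout are assumptions for illustration; real EAD files are namespaced and far messier.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per <chronitem>: normalized date, event text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        date = item.find("date")
        event = item.find("event")
        events.append({
            "date": date.get("normal"),
            # flatten mixed content and collapse whitespace
            "text": " ".join("".join(event.itertext()).split()),
            # only persons that were explicitly tagged can be recovered
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = extract_events(BIOGHIST)
```

The second event yields no persons at all: whatever is not tagged stays locked in free text.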
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable and doesn't accommodate new FRBR principles
MARC-XML is not much better
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: MARC is dead (is it really?)
FutureLib wiki: in-depth discussion
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability, or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc)
CIDOC-CRM-inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type
135 props (plus their inverses), prop hierarchy (see - - - at bottom)
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version including Kindle) and Graphical Representation (or continuous HTML version including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
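The short-cut vs long-path construct can be illustrated with a tiny sketch that derives the short-cut statement from the long path. Triples are plain tuples and the `ex:` identifiers are hypothetical; a real system would do this with repository rules or a SPARQL update rather than Python.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

def infer_shortcuts(triples):
    """Add (thing, P43, dim) wherever thing -P39i-> measurement -P40-> dim."""
    triples = set(triples)
    # index: measurement event -> dimensions it observed
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, set()).add(o)
    inferred = {
        (thing, P43, dim)
        for thing, p, measurement in triples if p == P39I
        for dim in observed.get(measurement, ())
    }
    return triples | inferred

long_path = [
    ("ex:painting1", P39I, "ex:measurement1"),   # who/when measured can hang off this node
    ("ex:measurement1", P40, "ex:height1"),
]
graph = infer_shortcuts(long_path)
```

The long path keeps the measurement event as a node of its own, which is where the extra details (who measured, when, how) attach; the short-cut is convenient for search.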
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA). Registry info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no; marc2rdf since 2014) uses Koha open source software, RDF in the core and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Mlist for comments
Conceptual Model 10 (Sep 2016): documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
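A rough sketch of what jittering can look like: each object gets a small random offset around the place's coordinates so that markers don't stack. The function name, the 50 km radius, the fixed seed and the approximate centre of Bulgaria (42.7, 25.5) are all assumptions for illustration, not the EFD implementation.

```python
import math
import random

def jitter(lat, lon, n, max_km=50.0, seed=0):
    """Return n points scattered uniformly in a disc of radius max_km around (lat, lon)."""
    rng = random.Random(seed)              # seeded so the map is stable between renders
    points = []
    for _ in range(n):
        r = max_km * math.sqrt(rng.random())   # sqrt gives uniform density over the disc
        theta = rng.uniform(0, 2 * math.pi)
        dlat = (r * math.cos(theta)) / 111.0   # roughly 111 km per degree of latitude
        dlon = (r * math.sin(theta)) / (111.0 * math.cos(math.radians(lat)))
        points.append((lat + dlat, lon + dlon))
    return points

# Nine objects that all carry the same country-level place "Bulgaria"
markers = jitter(42.7, 25.5, 9)
```

Seeding the generator is a design choice: the same object lands on the same spot every time the map is drawn.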
6.6.10 GLAMS WORKING WITH WIKIDATA
GLAMs Working with Wikidata: why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka, in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
332 CONA SCHEMA
Cultural Objects Name Authority (CONA) Getty museum data aggregation Moderatecomplexity about 280 elements
333 SPECTRUM XML
has 10 entities and 592 elds of which 490 are Object(artwork) elds I am not aware of any systems producing thisSPECTRUM Schema 40b
334 LIDO
Lightweight Information Describing Objects (LIDO) Evolved from CDWA museumdatwith inspiration from CIDOC CRM (Images by RStein and AVitzthum ATHENAworkshop 2010)
335 LIDO SCHEMA
Complex schema eg when referring to a related object you can provide almost asmuch detail as for the main object Could leverage opportunities for linking moreDisplay vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATAEAD Encoded Archival Description Describes archival materials (documentary units)EACCPF Encoded Archival Context Corporations Persons FamiliesEAG Encoded Archival Guide Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use only the minimal features and make mistakes; Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc, by CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial and Graphical Representation (HTML version, continuous HTML version, or including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
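The short-cut vs long-path construct can be sketched with plain triples: the long path goes through a measurement event (which can carry extra details such as when it happened), and the short-cut can be derived from it. This is a minimal illustration, not how any particular CRM store implements it; the ex: nodes and the function infer_shortcut are hypothetical, while the crm: property names are the ones named above plus crm:P4_has_time-span and crm:P90_has_value.

```python
# Long path: painting -> measurement event -> dimension, with room for details
LONG_PATH = [
    ("ex:painting", "crm:P39i_was_measured_by", "ex:measurement1"),
    ("ex:measurement1", "crm:P40_observed_dimension", "ex:height"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:year2010"),  # detail only the long path can hold
    ("ex:height", "crm:P90_has_value", 77),
]

def infer_shortcut(triples):
    """Derive crm:P43_has_dimension from P39i_was_measured_by / P40_observed_dimension."""
    measured = [(s, o) for s, p, o in triples if p == "crm:P39i_was_measured_by"]
    observed = {}
    for s, p, o in triples:
        if p == "crm:P40_observed_dimension":
            observed.setdefault(s, []).append(o)
    return [(thing, "crm:P43_has_dimension", dim)
            for thing, measurement in measured
            for dim in observed.get(measurement, [])]

shortcut = infer_shortcut(LONG_PATH)
print(shortcut)
```

The derivation only works in one direction: given just the short-cut, the measurement event and its details cannot be recovered.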
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
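As a rough illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as JSON-LD in the shape defined by the W3C Web Annotation Data Model (context anno.jsonld, a TextualBody, and a FragmentSelector using media fragments). The function name and the example.org URLs are assumptions made for the example.

```python
import json

def image_region_annotation(anno_id, image_url, x, y, w, h, comment):
    """Build a Web Annotation that attaches a text comment to a rectangular image region."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # media-fragment region in pixels
            },
        },
    }

anno = image_region_annotation(
    "http://example.org/anno/1",          # hypothetical annotation IRI
    "http://example.org/images/page1.jpg",  # hypothetical image IRI
    10, 20, 300, 400,
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```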
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
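What makes the IIIF Image API easy for viewers is that every image request follows one URL template: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of building such URLs follows; the helper name and the example.org server are assumptions, and the defaults follow Image API 2.x (where size "full" is valid).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL from its path parameters."""
    # The identifier must be URL-encoded, including any slashes it contains
    return (f"{base.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")

# Whole image at full size
whole = iiif_image_url("https://example.org/iiif", "ms/f1.jp2")

# Top-left 1024x1024 pixel tile, scaled to 512 pixels wide (the building block of deep zoom)
tile = iiif_image_url("https://example.org/iiif", "ms/f1.jp2",
                      region="0,0,1024,1024", size="512,")
print(whole)
print(tile)
```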
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations on IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
"FRBR, Before and After" by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
"Mistakes have been made", K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and rdf2marc conversions (see also marc2rdf). Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
Eg "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings
Data used for works by painter across collections (catalogue raisonné). Eg Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true
Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
"Domain-specific modeling: Towards a Food and Drink Gazetteer", Tagarev A., Tolosi L., and Alexiev V., LNCS 9398, p182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
"Using DBPedia in Europeana Food and Drink", Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
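The idea behind map clustering can be sketched without any mapping library: bucket points into grid cells and show one marker per cell with a count. This is a simplified stand-in, not how the Cluster Mapper library works internally; the function grid_clusters, the cell size and the sample coordinates are all assumptions.

```python
import math
from collections import defaultdict

def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells; return one cluster marker per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        clusters.append({
            "count": len(members),
            # Place the marker at the centroid of the cell's members
            "lat": sum(p[0] for p in members) / len(members),
            "lon": sum(p[1] for p in members) / len(members),
        })
    return clusters

# Two nearby points fall into one cell, a distant point gets its own marker
sample = [(42.1, 25.1), (42.2, 25.3), (48.8, 2.3)]
clusters = grid_clusters(sample)
for c in clusters:
    print(c)
```

A real map would recompute the cells with a smaller cell size at each zoom level, so clusters split apart as the user zooms in.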
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
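Jittering can be sketched as adding a small random offset to each marker, so that objects sharing one coordinate spread out into a disc. This is a minimal illustration, not the project's actual code: the function jitter, the 50 km radius and the approximate country-center coordinates are assumptions, and the degree conversion uses the rough figure of 111 km per degree of latitude.

```python
import math
import random

def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return a point displaced uniformly at random within radius_km of (lat, lon)."""
    # sqrt() makes the points uniform over the disc instead of bunching at the center
    distance = radius_km * math.sqrt(rng.random())
    bearing = 2 * math.pi * rng.random()
    dlat = (distance * math.cos(bearing)) / 111.0
    # A degree of longitude shrinks with the cosine of the latitude
    dlon = (distance * math.sin(bearing)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

CENTER = (42.7, 25.5)        # approximate center of Bulgaria (assumed for the example)
rng = random.Random(42)      # fixed seed so the layout is reproducible
points = [jitter(CENTER[0], CENTER[1], 50.0, rng) for _ in range(100)]
print(points[:3])
```

A fixed seed (or a seed derived from the object ID) keeps each flag in the same place between page loads.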
6610 GLAMS WORKING WITH WIKIDATA
GLAMs Working with Wikidata: Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
Eg easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J.Cuno, head of the Getty Trust
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: "On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)", Alexiev V., Lindenthal J., and Isaac A., International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014
External Ontologies:
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <termkind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
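What inverse pairs and symmetric properties buy you can be sketched as a tiny materialization step: from each asserted relation, the reverse statement follows. The sketch below is an illustration only; the ex: property names are hypothetical (not actual GVP relations), and a real triple store would derive these statements by OWL inference rather than by a Python loop.

```python
# Hypothetical relation declarations, as they might come out of a spreadsheet
INVERSE_OF = {"ex:produced_by": "ex:produces"}   # owl:inverseOf pair
SYMMETRIC = {"ex:related_to"}                    # owl:SymmetricProperty (self-inverse)

def materialize(triples, inverse_of=INVERSE_OF, symmetric=SYMMETRIC):
    """Add the reverse statement for every inverse-pair or symmetric relation."""
    # Make the inverse lookup work in both directions
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in both:
            result.add((o, both[p], s))
        if p in symmetric:
            result.add((o, p, s))
    return result

asserted = [
    ("ex:a", "ex:produced_by", "ex:b"),
    ("ex:c", "ex:related_to", "ex:d"),
]
closed = materialize(asserted)
for t in sorted(closed):
    print(t)
```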
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
"Getty Vocabularies Linked Open Data: Semantic Representation", Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?", J.Cobb, 2014. Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like this
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure", V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
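The idea can be sketched as follows: take the flat record, spin off one Person record per mentioned relative, and attach hedged inferences that can later help matching. This is only a sketch under stated assumptions: the function derive_persons, the generated IDs and the inferred fields (marriedOn, likelyBornAfter) are hypothetical, and the inferences are "likely", not certain.

```python
RECORD = {
    "id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation to the main person
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def derive_persons(record):
    """Create Person records for the main person and everyone mentioned, plus relations."""
    main_id = f"rec{record['id']}"
    persons = [{"id": main_id,
                "name": f"{record['firstName']} {record['lastName']}",
                "gender": record.get("gender")}]
    relations = []
    for field, relation in RELATION_FIELDS.items():
        name = record.get(field)
        if not name:
            continue
        person = {"id": f"{main_id}-{relation}", "name": name}
        if relation == "spouse":
            person["maidenName"] = record.get("nameSpouseMaiden")
            person["marriedOn"] = record.get("dateMarriage")
        if relation == "child" and record.get("dateMarriage"):
            # Likely (not certain) inference: children were born after the marriage
            person["likelyBornAfter"] = record["dateMarriage"]
        persons.append(person)
        relations.append((main_id, relation, person["id"]))
    return persons, relations

persons, relations = derive_persons(RECORD)
for p in persons:
    print(p)
```

The derived records can then be compared by name, maiden name and date constraints against other Person records in the database.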
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
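The two word2vec operations used above, nearest neighbours by cosine score and "semantic differencing" by vector arithmetic, can be sketched with plain lists. The 4-dimensional vectors below are hand-made toys chosen so the arithmetic works out; they are not taken from the EHRI model, and the function names are hypothetical. The scores in the table behave like cosine similarity (higher means closer), which is what this sketch computes.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy dimensions (assumed): soviet, german, organization, leader
TOY = {
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 1.0, 0.0],
    "guard":  [0.3, 0.3, 0.2, 0.1],
}

print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

With real embeddings the target vector never matches a word exactly, so the result is simply the best-scoring candidate, and plausible-looking misses are common.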
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
333 SPECTRUM XML
SPECTRUM Schema 4.0b has 10 entities and 592 fields, of which 490 are Object (artwork) fields. I am not aware of any systems producing this
334 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
335 LIDO SCHEMA
Complex schema: eg when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more
Display vs Indexing (structured) elements inherited from CDWA
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers are published as LOD, and some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for works by painter across collections (catalogue raisonné). E.g. Frans Hals.
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: "Authority Addicts: The New Frontier of Authority Control on Wikidata", Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: "Name Data Sources for Semantic Enrichment", study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
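To make "leveraging coreferencing across VIAF and Wikidata" concrete, here is a minimal sketch that merges pairwise "same entity" links into clusters with a union-find structure. This is an illustration, not the method used in the study; the identifiers are made up.

```python
from collections import defaultdict

def merge_coreferences(links):
    """Group identifiers into clusters, given pairwise 'same entity' links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical identifiers, for illustration only
links = [
    ("wikidata:Q_A", "viaf:111"),
    ("viaf:111", "gnd:AAA"),
    ("wikidata:Q_B", "rkd:222"),
]
clusters = merge_coreferences(links)
```

A Wikidata item linked to VIAF thus inherits the link to a VIAF-constituent dataset such as GND, which is the kind of gain the conclusions point to.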
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
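A rough sketch of what such a per-year count might look like. The query text is an assumption about how dates hang off EDM proxies (no newspaper filter is shown) and is not the query behind the chart; the Python function performs the same GROUP BY locally on made-up dates.

```python
from collections import Counter

# Assumed query shape, not taken from the Europeana endpoint documentation
QUERY = """
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(DISTINCT ?cho) AS ?n) WHERE {
  ?proxy ore:proxyFor ?cho ;
         dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
} GROUP BY ?year ORDER BY ?year
"""

def count_by_year(dates):
    """Local equivalent of the GROUP BY: count ISO-style dates per year."""
    return sorted(Counter(d[:4] for d in dates).items())

# Made-up sample dates
sample = ["1914-08-01", "1914-08-02", "1915-01-10"]
counts = count_by_year(sample)
```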
6.6 EUROPEANA FOOD AND DRINK
Food & Drink content, semantically enriched (place and FD topic). EFD Semantic App: open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
"Domain-specific modeling: Towards a Food and Drink Gazetteer", Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
"Using DBpedia in Europeana Food and Drink", Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
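Jittering can be sketched as adding a small random offset to each marker. This is only an illustration: the function name, the radius and the centre coordinates are assumptions, not the EFD implementation.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point shifted by a random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))

rng = random.Random(42)   # seeded so the layout is reproducible
center = (42.75, 25.5)    # rough centre of Bulgaria, illustrative value
points = [jitter(*center, rng=rng) for _ in range(5)]
```

A uniform square offset is the simplest choice. A real map would probably scale the radius to the size of the place so flags stay inside the country.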
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J.Cuno, head of the Getty Trust.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
"On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI)", Alexiev V., Lindenthal J. and Isaac A., International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See "GVP LOD: Ontologies and Semantic Representation", V.Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key/val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
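What an inverse pair buys you can be shown with a small sketch that materializes the inverse statement for each relation. The `ex:` property names are hypothetical stand-ins, not the actual GVP property IRIs, and a triple store would normally do this by inference.

```python
def add_inverses(triples, inverse_of, symmetric=()):
    """Return the triples plus the statements implied by inverse/symmetric props."""
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})  # inverseOf works both ways
    inv.update({p: p for p in symmetric})              # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result

# Hypothetical property and entity names, for illustration only
inverse_of = {"ex:teacher_of": "ex:student_of"}
symmetric = ["ex:sibling_of"]
data = [("ex:A", "ex:teacher_of", "ex:B"), ("ex:B", "ex:sibling_of", "ex:C")]
closed = add_inverses(data, inverse_of, symmetric)
```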
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
"Getty Vocabularies: Linked Open Data: Semantic Representation", Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in "Getty Vocabs: Why LOD? Why Now?" (J.Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata: WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure", V.Alexiev, L.Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
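A minimal sketch of the first step, turning the mentions in one record into separate Person records plus relations. The output structure, the generated ids and the specific inferences are assumptions for illustration, not the EHRI data model.

```python
ROLES = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

def expand_record(rec):
    """Create a Person for the record subject and one for each mentioned relative."""
    main_id = rec["id"]
    persons = [{"id": main_id,
                "name": f'{rec["firstName"]} {rec["lastName"]}',
                "gender": rec.get("gender")}]
    relations = []
    for field, role in ROLES.items():
        name = rec.get(field)
        if not name:
            continue
        person = {"id": f"{main_id}-{role}", "name": name}  # hypothetical id scheme
        if role == "spouse":
            # Likely inferences only: treat as weak evidence when matching
            if rec.get("gender") == "Male":
                person["gender"] = "Female"
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                person["dateMarriage"] = rec["dateMarriage"]
        persons.append(person)
        relations.append((main_id, role, person["id"]))
    return persons, relations

record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons, relations = expand_record(record)
```

Each derived Person can then go through the same matching process as any other Person record.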
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for Geonames matching in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist   punishment   Cos dist
guarding   0.593507   punishments  0.668144
sentry     0.512083   punish       0.601212
hlinka     0.496201   punishing    0.543213
gate       0.490032   beatings     0.527033
watching   0.484647   penalty      0.497262
rifle      0.484379   deserved     0.490157
lookout    0.482025   beaten       0.473870
patrol     0.477233   straf        0.473338
soldier    0.475982   offense      0.461230
guarded    0.474689   executing    0.459965
police     0.474291   merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
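The arithmetic behind such results can be sketched with cosine similarity over toy vectors. The 3-d vectors below are invented so that the analogy works out; real word2vec vectors have hundreds of dimensions and come from training on the interview text.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to target, skipping excluded words."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

# Toy vectors (organization, Soviet, German), invented for illustration
vectors = {
    "KGB": (1.0, 1.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0),
    "Hitler": (0.0, 0.0, 1.0),
    "SS": (1.0, 0.0, 1.0),
    "guard": (0.9, 0.1, 0.2),
}
query = tuple(k - s + h for k, s, h in
              zip(vectors["KGB"], vectors["Stalin"], vectors["Hitler"]))
answer = nearest(query, vectors, exclude=("KGB", "Stalin", "Hitler"))
```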
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS: POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
3.3.4 LIDO
Lightweight Information Describing Objects (LIDO). Evolved from CDWA and museumdat, with inspiration from CIDOC CRM. (Images by R.Stein and A.Vitzthum, ATHENA workshop 2010)
3.3.5 LIDO SCHEMA
Complex schema, e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more. Display vs Indexing (structured) elements inherited from CDWA.
3.4 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
3.4.1 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize). Emphasis on documents, not historic agents and events.
EAG: So-called "controlled access points" are text and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree modeled as Outline, which is also used for other purposes (just presentation)
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
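A short sketch of what a consumer has to do with the example above: pull dates and tagged names out of the chronology. Even when tagged, persons come out as normalized name strings, not as links to an authority record. The XML is repeated here so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def chron_events(xml_text):
    """Extract date, flattened text and tagged person names from each chronitem."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            "text": " ".join("".join(event.itertext()).split()),
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

events = chron_events(BIOGHIST)
```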
3.5 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
3.5.1 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
MARC is dead (is it really?): Facebook group
marc-must-die.info: in-depth discussion wiki (FutureLib)
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model.
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies. (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience.)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info.
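The "no nesting bias" point can be illustrated with a toy lifting function: whether a sub-record is nested or referenced by ID, the resulting triples are the same. The `lift` helper, the `@id` convention and the `http://example.org/` base are assumptions made for this sketch, not a standard API.

```python
from itertools import count

def lift(record, subject, base="http://example.org/"):
    """Flatten a nested dict into (subject, predicate, object) triples."""
    triples = []
    blank_ids = count(1)

    def walk(node, subj):
        for key, value in node.items():
            pred = base + key
            values = value if isinstance(value, list) else [value]
            for v in values:
                if isinstance(v, dict):
                    # A nested node becomes its own subject (named or blank)
                    obj = v.get("@id") or f"_:b{next(blank_ids)}"
                    triples.append((subj, pred, obj))
                    walk({k: x for k, x in v.items() if k != "@id"}, obj)
                else:
                    triples.append((subj, pred, v))

    walk(record, subject)
    return triples

# Same information, nested vs referenced by ID (illustrative identifiers)
nested = {"creator": {"@id": "ex:hals", "name": "Frans Hals"}}
by_ref_painting = {"creator": {"@id": "ex:hals"}}
by_ref_person = {"name": "Frans Hals"}

lifted_nested = set(lift(nested, "ex:painting"))
lifted_by_ref = set(lift(by_ref_painting, "ex:painting")) | set(lift(by_ref_person, "ex:hals"))
```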
4.1 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
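The short-cut vs long-path construct can be sketched as a rewrite over triples: each short-cut statement is replaced by an intermediate node (here a Measurement) that can carry extra details such as who measured and when. The helper name and the generated node ids are assumptions for this sketch.

```python
# Short-cut property -> (first step, class of intermediate node, second step)
SHORTCUTS = {
    "crm:P43_has_dimension": (
        "crm:P39i_was_measured_by", "crm:E16_Measurement", "crm:P40_observed_dimension"),
}

def expand_shortcuts(triples):
    """Replace each short-cut triple with its long path via a new intermediate node."""
    out = []
    n = 0
    for s, p, o in triples:
        if p in SHORTCUTS:
            first, cls, second = SHORTCUTS[p]
            n += 1
            node = f"_:measurement{n}"  # hypothetical blank node id
            out += [(s, first, node), (node, "rdf:type", cls), (node, second, o)]
        else:
            out.append((s, p, o))
    return out

# Illustrative identifiers
short = [("ex:painting1", "crm:P43_has_dimension", "ex:height1")]
long_path = expand_shortcuts(short)
```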
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary. Diagram of Complete Example from spec (using my rdfpuml).
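As a small illustration of the "Image and region over it" case, the sketch below builds an annotation in the JSON-LD shape of the W3C Web Annotation model, as far as I understand it. The function name and all URLs are placeholders.

```python
import json

def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a comment on a rectangular region of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # media fragment for the region
            },
        },
    }

# Placeholder URLs, for illustration only
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (10, 20, 300, 400),
    "Detail of the sitter's hand",
)
serialized = json.dumps(anno, indent=2)
```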
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers. http://iiif.io
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry: info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
"FRBR, Before and After" by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
"Mistakes have been made", K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk: support on Twitter and Google group (see home page).
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)prop, or an ontology value (e.g. <term/kind/Abbreviation>).
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
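The idea of generating inverse and symmetric property declarations from spreadsheet rows can be sketched like this. It is an illustration under assumptions: the row layout (property, inverse) and the `ex:` property names are hypothetical, and the actual GVP generation scripts are not shown in these slides.

```python
def relation_triples(rows):
    """Turn (property, inverse) rows into OWL declarations.

    A row whose inverse equals the property itself is treated as symmetric.
    """
    triples = []
    for prop, inverse in rows:
        if prop == inverse:
            triples.append((prop, "rdf:type", "owl:SymmetricProperty"))
        else:
            triples.append((prop, "owl:inverseOf", inverse))
            triples.append((inverse, "owl:inverseOf", prop))
    return triples

# Hypothetical spreadsheet rows, for illustration only.
rows = [("ex:teacherOf", "ex:studentOf"), ("ex:relatedTo", "ex:relatedTo")]
for s, p, o in relation_triples(rows):
    print(s, p, o, ".")
```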
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
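The step of creating Person records from the names mentioned in a record could look roughly like this. It is a sketch only: the field names follow the sample record above, while `derive_persons`, the output keys and the single inference shown (spouse's given name plus maiden name as an alternative name) are assumptions, not the EHRI implementation.

```python
def derive_persons(rec):
    """Create one Person dict for the main person and one per mentioned relative."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    persons = [{"name": main_name, "role": "main", "gender": rec.get("gender")}]
    for field, role in (("nameSpouse", "spouse"),
                        ("nameChild", "child"),
                        ("nameSibling", "sibling")):
        if field in rec:
            persons.append({"name": rec[field], "role": role, "relatedTo": main_name})
    # Likely inference: the spouse was also known under her maiden name.
    maiden = rec.get("nameSpouseMaiden")
    if maiden:
        for person in persons:
            if person["role"] == "spouse":
                given = person["name"].split()[0]
                person["altName"] = f"{given} {maiden}"
    return persons

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
```

Each derived record can then be compared against existing Person records, for example on name variants and dates.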
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
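The arithmetic behind such "semantic differencing" can be sketched with toy vectors. Everything numeric here is invented for illustration: real word2vec vectors have hundreds of dimensions and are learned from the interview text, and the three hand-made dimensions below (organization, Soviet, Nazi) merely make the mechanics visible.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy 3-dimensional vectors (hypothetical): [organization, Soviet, Nazi]
toy = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "Moscow": [0.2, 1.0, 0.0],
}
nearest = analogy(toy, "KGB", "Stalin", "Hitler")
```

The nearest-neighbour lists in the table are the same operation without the subtraction: rank all words by cosine against a single query vector.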
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval charters and deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
335 LIDO SCHEMA
Complex schema: e.g. when referring to a related object, you can provide almost as much detail as for the main object. Could leverage opportunities for linking more.
Display vs Indexing (structured) elements inherited from CDWA.
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize).
Emphasis on documents, not historic agents and events.
EAG: so-called "controlled access points" are text and typically not controlled at all.
EAC: many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva).
EAC: related persons are names (strings), not links (things).
EAC: events include lots of info, but only Date is a separate field (person names could be tagged but often are not).
EAC: family tree is modeled as Outline, which is also used for other purposes (just presentation).
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
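To show what can still be recovered when person names are tagged, here is a small sketch that pulls dates and normalized person names out of a bioghist chronology using Python's standard ElementTree. The helper name `extract_events` and the output structure are assumptions made for illustration; the XML is the EADiva sample, with attribute quoting restored.

```python
import xml.etree.ElementTree as ET

BIOGHIST = """<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, tagged persons, plain text."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            "persons": [p.get("normal") for p in event.iter("persname")],
            # Collapse whitespace so the event reads as one sentence.
            "text": " ".join("".join(event.itertext()).split()),
        })
    return events

events = extract_events(BIOGHIST)
```

Note that the second event yields no persons at all: untagged names stay buried in free text, which is exactly the linking problem described above.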
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info: in-depth discussion
FutureLib wiki
Facebook group
MARC is dead (is it really?): presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model.
It doesn't have nesting biases like XML: whether a sub-element is nested or referenced by ID.
It has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return a description and info.
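The absence of nesting bias can be illustrated by "lifting" a nested record into flat triples. This is a toy sketch, not a real converter: the `lift` function, the blank-node naming and the use of bare keys as property names are all assumptions made for brevity.

```python
import itertools

def lift(record, subject, counter=None):
    """Flatten a nested dict into (subject, property, object) triples.

    Nested dicts become blank nodes, so nesting depth leaves no trace in the model.
    """
    if counter is None:
        counter = itertools.count(1)
    triples = []
    for key, value in record.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, dict):
                node = f"_:b{next(counter)}"
                triples.append((subject, key, node))
                triples.extend(lift(v, node, counter))
            else:
                triples.append((subject, key, v))
    return triples

nested = {"title": "Night Watch", "creator": {"name": "Rembrandt"}}
triples = lift(nested, "ex:obj1")
```

Whether the creator was nested inside the object or referenced by ID in the source format, the resulting triples have the same shape.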
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM-inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
The Europeana Data Quality Committee was formed to push this strategic point (2015-2020).
Evolving specification (since 2009):
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
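The relation between a short-cut and its long path can be sketched as a rewrite over triples. This is an illustrative sketch: the function name and the blank-node labels are made up, and the triples are plain tuples rather than a real RDF library's objects. The CRM names used (P43, P39i, P40, E16 Measurement) are the ones the short-cut stands for.

```python
import itertools

def expand_dimension_shortcut(triples):
    """Replace each crm:P43_has_dimension triple with the equivalent long path.

    The intermediate crm:E16_Measurement node is where extra detail
    (who measured, when, how) could later be attached.
    """
    counter = itertools.count(1)
    out = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            measurement = f"_:measurement{next(counter)}"
            out.append((s, "crm:P39i_was_measured_by", measurement))
            out.append((measurement, "rdf:type", "crm:E16_Measurement"))
            out.append((measurement, "crm:P40_observed_dimension", o))
        else:
            out.append((s, p, o))
    return out

short = [("ex:painting1", "crm:P43_has_dimension", "ex:height1")]
long_path = expand_dimension_shortcut(short)
```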
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
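For the "Image and region over it" case, an annotation might be assembled as below. This sketch follows the JSON-LD shape of the W3C Web Annotation Data Model (successor of Open Annotation); the function name and the example.org URLs are placeholders, and it is not the spec's Complete Example.

```python
import json

def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangle of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

annotation = image_region_annotation(
    "http://example.org/anno/1",          # placeholder IRI
    "http://example.org/images/hat.jpg",  # placeholder image
    (10, 20, 300, 400),
    "A hat",
)
print(json.dumps(annotation, indent=2))
```

The same body/target pattern covers the other cases listed: only the target (and selector type) changes.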
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
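The IIIF Image API addresses an image (or a region of it) purely through the URL path: identifier, region, size, rotation, then quality and format. A small sketch of building such a URL follows; the helper name and the example.org server are assumptions, and the `max` size keyword belongs to newer API versions (version 2.0 servers expect `full` instead).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier must be percent-encoded, including any slashes it contains.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"

# A 100x100 pixel detail from the top-left corner, scaled to 200 pixels wide.
url = iiif_image_url("https://example.org/iiif", "abc/1",
                     region="0,0,100,100", size="200,")
```

Because zooming is just a sequence of such requests, any compliant viewer can work with any compliant server.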
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made: K. Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search for related books on a kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA. Progress report, Mlist for comments.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Conceptual Model (2015), 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers are available as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
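How coreferencing lets name forms flow between datasets can be sketched with a small union-find over record identifiers. Everything here is illustrative: the identifiers are placeholders rather than real VIAF, Wikidata or ULAN IDs, and `merge_names` is a hypothetical helper, not the method used in the study.

```python
def merge_names(records, same_as):
    """Group records linked by coreference pairs and pool their name forms.

    records: dict of record id -> set of name forms
    same_as: iterable of (id, id) coreference links
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as:
        parent[find(a)] = find(b)

    clusters = {}
    for rid, names in records.items():
        clusters.setdefault(find(rid), set()).update(names)
    return list(clusters.values())

# Placeholder identifiers, for illustration only.
records = {
    "viaf:A": {"Cranach, Lucas"},
    "wd:B": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    "ulan:C": {"Cranach, Lucas, the elder"},
    "wd:D": {"Frans Hals"},
}
clusters = merge_names(records, [("viaf:A", "wd:B"), ("wd:B", "ulan:C")])
```

The merged cluster combines the permuted library-style forms with the multilingual forms, which is the gain the conclusions above point to.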
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing vocabularies/thesauri as LOD
Publishing museum collections and national bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM: search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
Watch the video (D. Oldman). (Not Ontotext work.)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
34 ARCHIVE METADATA
EAD: Encoded Archival Description. Describes archival materials (documentary units)
EAC-CPF: Encoded Archival Context: Corporations, Persons, Families
EAG: Encoded Archival Guide. Describes institutions
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation, not enough to linking (difficult to semanticize)
Emphasis on documents, not historic agents and events
EAG: So-called "controlled access points" are text, and typically not controlled at all
EAC: Many institutions don't consider EAC very valuable and instead put person info in EAD's bioghist element (example below from EADiva)
EAC: Related persons are names (strings), not links (things)
EAC: Events include lots of info, but only Date is a separate field (person names could be tagged but often are not)
EAC: Family tree is modeled as an Outline that's also used for other purposes (just presentation)

<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
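To illustrate why tagging matters, here is a minimal sketch of pulling dates and tagged person names out of a bioghist chronology like the EADiva sample. It uses only Python's standard xml.etree.ElementTree; the output dictionary layout (date, text, persons) is an assumption made for this example, not part of EAD.

```python
import xml.etree.ElementTree as ET

# The EADiva sample, repeated here so the sketch is self-contained.
BIOGHIST = """
<bioghist>
  <head>Chronological Events</head>
  <chronlist>
    <chronitem>
      <date normal="19781028">October 28, 1978</date>
      <event><persname normal="Wossname, Samuel">Sam Wossname</persname> succeeds
        <persname normal="Othername, John">John Othername</persname> as department head</event>
    </chronitem>
    <chronitem>
      <date normal="19790315">March 15, 1979</date>
      <event>Departmental reorganization</event>
    </chronitem>
  </chronlist>
</bioghist>
"""

def extract_events(xml_text):
    """Return one dict per chronitem: normalized date, plain text, tagged persons."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("chronitem"):
        event = item.find("event")
        events.append({
            "date": item.find("date").get("normal"),
            # Flatten mixed content and collapse whitespace.
            "text": " ".join("".join(event.itertext()).split()),
            # Only persons that were explicitly tagged can be recovered as things.
            "persons": [p.get("normal") for p in event.iter("persname")],
        })
    return events

for ev in extract_events(BIOGHIST):
    print(ev)
```

The second event has no tagged names, so its persons list is empty: untagged names stay locked in free text.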
35 LIBRARY METADATA: MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
marc-must-die.info: MARC is dead (is it really?)
FutureLib: in-depth discussion wiki
Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting", and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID)
Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return its description and info
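The point about nesting bias can be sketched as follows: two XML documents that carry the same facts, one nesting the creator and one referencing it by ID, "lift" to the same set of subject-property-value triples. The element names, the id/ref attribute convention and the lift function are all hypothetical, chosen only for this illustration; real lifting would produce proper RDF with URIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shapes: same facts, different nesting decisions.
NESTED = """
<object id="o1">
  <title>Portrait</title>
  <creator id="p1"><name>Frans Hals</name></creator>
</object>
"""

REFERENCED = """
<data>
  <object id="o1"><title>Portrait</title><creator ref="p1"/></object>
  <person id="p1"><name>Frans Hals</name></person>
</data>
"""

def lift(xml_text):
    """Lift XML to a set of (subject, property, value) triples."""
    triples = set()

    def visit(elem, subject):
        for child in elem:
            if "ref" in child.attrib:
                # Reference by ID: link to an entity described elsewhere.
                triples.add((subject, child.tag, child.get("ref")))
            elif "id" in child.attrib:
                # Nested entity: link to it (if we have a subject), then describe it.
                if subject is not None:
                    triples.add((subject, child.tag, child.get("id")))
                visit(child, child.get("id"))
            elif len(child) == 0 and subject is not None:
                # Leaf element: a literal value of the current subject.
                triples.add((subject, child.tag, (child.text or "").strip()))
            else:
                # Pure container element: keep descending.
                visit(child, subject)

    wrapper = ET.Element("wrapper")
    wrapper.append(ET.fromstring(xml_text))
    visit(wrapper, None)
    return triples

print(sorted(lift(NESTED)))
print(lift(NESTED) == lift(REFERENCED))
```

Once lifted, a consumer no longer needs to know which nesting choice the XML author made.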
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by the Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM-inspired: events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type
135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version including Kindle), Graphical Representation (or continuous HTML version including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
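The short-cut versus long-path construct can be sketched with triples held as plain tuples. The property names are the CRM ones mentioned above (plus crm:P4_has_time-span for the extra detail); the ex: resources and the derive_shortcuts helper are hypothetical and only illustrate how the short-cut follows from the long path.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

# Long path: the measurement event is explicit, so it can carry more details
# (here, when the measurement took place). All ex: names are made up.
LONG_PATH = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P4_has_time-span", "ex:span2015"),
]

def derive_shortcuts(triples):
    """Derive thing P43_has_dimension dim from thing P39i measurement P40 dim."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    return [
        (thing, P43, dim)
        for thing, p, measurement in triples
        if p == P39I
        for dim in observed.get(measurement, [])
    ]

print(derive_shortcuts(LONG_PATH))
```

Going the other way is not possible: the short-cut alone cannot tell you who measured the object or when.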
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
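As a rough sketch of what "standard API" means in practice: an IIIF Image API request is a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below just assembles such a URL; the function name, the server base https://example.org/iiif and the identifier are assumptions for illustration, and the accepted size keywords differ between API versions ("full" in older versions, "max" in newer ones).

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation=0, quality="default", fmt="jpg"):
    """Assemble an IIIF Image API style URL (sketch, no validation of values)."""
    # The identifier must be URL-encoded, including any slashes it contains.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"

# A 300x400 pixel region starting at (100, 200), scaled to 512 pixels wide.
url = iiif_image_url("https://example.org/iiif", "ms 12/f1r",
                     region="100,200,300,400", size="512,")
print(url)
```

Because every compliant server understands the same URL pattern, a viewer can request just the tiles it needs for deep zoom without server-specific code.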
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BIBFRAME: sponsored by LoC, but soundly criticized for modeling mistakes
RDARegistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA). Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core and rdf2marc conversions (marc2rdf). Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s. 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. "The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology" (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 10 (Sep 2016): document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
Mlist for comments
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings
Data used for: works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
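What "leveraging coreferencing" could look like is sketched below: coreference links between records are grouped into clusters with a small union-find, and the name forms of all records in a cluster are pooled. The record identifiers and name forms are invented placeholders, not real Wikidata, VIAF or GND identifiers, and this is not the method used in the study.

```python
def coreference_clusters(links):
    """Group record ids into clusters, given pairwise coreference links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for record in list(parent):
        clusters.setdefault(find(record), set()).add(record)
    return list(clusters.values())

def merged_names(cluster, names_by_id):
    """Pool the name forms of every record in one cluster."""
    names = set()
    for record in cluster:
        names |= names_by_id.get(record, set())
    return names

# Hypothetical identifiers and name forms, for illustration only.
LINKS = [("wikidata:Q_x", "viaf:100"), ("viaf:100", "gnd:200")]
NAMES = {
    "wikidata:Q_x": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    "viaf:100": {"Cranach, Lucas, 1472-1553"},
}

cluster = coreference_clusters(LINKS)[0]
print(sorted(merged_names(cluster, NAMES)))
```

The library-style permuted form and the multilingual forms end up in one pool, which is the gain the conclusions point to.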
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
"A global Authority on everything": a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM: search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
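Jittering can be as simple as adding a small random offset to each point. The sketch below is an assumption about how it might be done, not the project's code: the centroid coordinates for Bulgaria are approximate, and the offset of half a degree is an arbitrary choice.

```python
import random

def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset in each direction."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )

# Approximate centroid of Bulgaria (assumed values for illustration).
BULGARIA = (42.7, 25.5)

rng = random.Random(42)  # seeded so the map looks the same on every run
flags = [jitter(*BULGARIA, rng=rng) for _ in range(9000)]
print(flags[:3])
```

A seeded generator keeps each flag in a stable position between page loads; a fancier version could clip the jittered points to the country outline.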
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty self-inverse)
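The spreadsheet-to-ontology idea for relations can be sketched like this: each row names a relation and its inverse, and a self-inverse row becomes a symmetric property. The row layout, the ex: property names and the relation_axioms helper are hypothetical; the actual GVP generation process is not shown in these slides.

```python
# Hypothetical rows as they might be exported from a spreadsheet:
# (property, inverse property). A property paired with itself is symmetric.
ROWS = [
    ("ex:teacher_of", "ex:student_of"),
    ("ex:student_of", "ex:teacher_of"),
    ("ex:related_to", "ex:related_to"),
]

def relation_axioms(rows):
    """Emit Turtle-like statements for each relation row."""
    lines = []
    for prop, inverse in rows:
        lines.append(f"{prop} a owl:ObjectProperty .")
        if prop == inverse:
            lines.append(f"{prop} a owl:SymmetricProperty .")
        else:
            lines.append(f"{prop} owl:inverseOf {inverse} .")
    return lines

for line in relation_axioms(ROWS):
    print(line)
```

Keeping the relation list in a table means domain experts can review it, while the ontology statements are generated rather than hand-written.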
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb 2014). E.g.
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
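The first two steps (create Person records, make likely inferences) might look like the sketch below, using the sample record above. The output structure, the role names and the particular inferences (spouse gender, child born after the marriage date) are assumptions for illustration; they are heuristics, not facts, and the matching step against the database is left out.

```python
RECORD = {
    "id": "123456",
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

def expand_person_record(rec):
    """Turn one flat record into a main Person plus Persons for named relatives."""
    main_id = f"rec{rec['id']}/main"
    persons = [{
        "id": main_id,
        "name": f"{rec['firstName']} {rec['lastName']}",
        "gender": rec.get("gender"),
    }]
    relations = []

    def add_relative(role, name, **inferred):
        pid = f"rec{rec['id']}/{role}"
        persons.append({"id": pid, "name": name, **inferred})
        relations.append((main_id, role, pid))

    if "nameSpouse" in rec:
        inferred = {}
        if "nameSpouseMaiden" in rec:
            inferred["maidenName"] = rec["nameSpouseMaiden"]
        if rec.get("gender") == "Male":
            inferred["gender"] = "Female"  # likely inference, not a certainty
        add_relative("spouse", rec["nameSpouse"], **inferred)
    if "nameChild" in rec:
        inferred = {}
        if "dateMarriage" in rec:
            inferred["bornAfter"] = rec["dateMarriage"]  # likely, not certain
        add_relative("child", rec["nameChild"], **inferred)
    if "nameSibling" in rec:
        add_relative("sibling", rec["nameSibling"])
    return persons, relations

persons, relations = expand_person_record(RECORD)
for p in persons:
    print(p)
```

Each derived Person can then be compared with existing Person records, and the relations list is the seed of the person network.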
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
341 ARCHIVE METADATA PROBLEMS
Pay a lot of attention to presentation not enough to linking (dif cult to semanticize)Emphasis on documents not historic agents and events
EAG So-called controlled access points are text and typically not controlled at allEAC Many institutions dont consider EAC very valuable and instead put person infoin EADs element (example below from EADiva)EAC Related persons are names (strings) not links (things)EAC Events include lots of info but only Date is separate eld (person names could betagged but often are not)EAC Family tree modeled as Outline thats also used for other purposes (justpresentation)
bioghist
ltbioghistgt ltheadgtChronological Eventsltheadgt ltchronlistgt ltchronitemgt ltdate normal=19781028gtOctober 28 1978ltdategt lteventgt ltpersname normal=Wossname SamuelgtSam Wossnameltpersnamegt succeeds ltpersname normal=Othername JohngtJohn Othernameltpersnamegt as department head lteventgt ltchronitemgt ltchronitemgt ltdate normal=19790315gtMarch 15 1979ltdategt lteventgtDepartmental reorganizationlteventgt ltchronitemgt ltchronlistgt ltbioghistgt
35 LIBRARY METADATA MARCMARC is 50 years old unreadable and doesnt accommodate new FRBR principlesMARC-XML is not much better
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Fielding 2002
MARC is dead (is it really) in-depth discussion wiki
marc-must-dieinfoFutureLibFacebook group
by Sally Chambers ELAG 2011Presentation
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Conceptual Model 10 (Sep 2016). Mlist for comments.
Conceptual Model: documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings
Data used for works by painter across collections (catalogue raisonné), e.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything, e.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
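To make the "easy with SPARQL" point concrete, here is a minimal sketch of the kind of aggregation behind such a chart. The query shape is an assumption: I'm guessing the date sits in dcterms:issued on the proxy, and no newspaper filter is shown. The Python function does the same group-by-year on in-memory rows, with made-up sample data.

```python
from collections import Counter

# Assumed query shape (not the actual query behind the chart):
# count distinct objects per year of issue.
OBJECTS_BY_YEAR = """
PREFIX edm:     <http://www.europeana.eu/schemas/edm/>
PREFIX ore:     <http://www.openarchives.org/ore/terms/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(DISTINCT ?cho) AS ?objects) WHERE {
  ?cho a edm:ProvidedCHO .
  ?proxy ore:proxyFor ?cho ;
         dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""

def count_by_year(rows):
    """Same aggregation in plain Python over (object_id, iso_date) pairs."""
    seen = set()
    counts = Counter()
    for cho, date in rows:
        year = date[:4]
        if (cho, year) not in seen:  # mirrors COUNT(DISTINCT ?cho)
            seen.add((cho, year))
            counts[year] += 1
    return dict(sorted(counts.items()))

# Hypothetical sample rows
sample = [("a", "1914-08-01"), ("b", "1914-09-01"),
          ("a", "1914-08-01"), ("c", "1915-01-01")]
by_year = count_by_year(sample)
```

The point is that a GROUP BY over the whole dataset is one query, whereas a record-oriented API would need paging through every result.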
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
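A minimal sketch of what jittering can look like, assuming a simple random offset within a fixed radius of the country centroid. The centroid, the radius and the function name are illustrative choices, not the values or code used in the EFD app.

```python
import math
import random

def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) moved to a random point within radius_km of the input."""
    # sqrt spreads points evenly over the disc instead of bunching at the centre
    distance_km = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # about 111.32 km per degree of latitude; longitude degrees shrink with cos(lat)
    dlat = distance_km * math.cos(bearing) / 111.32
    dlon = distance_km * math.sin(bearing) / (111.32 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

# Hypothetical centroid shared by every object tagged only "Bulgaria"
BULGARIA = (42.73, 25.49)
rng = random.Random(42)  # seeded so the map looks the same on every render
markers = [jitter(*BULGARIA, radius_km=50.0, rng=rng) for _ in range(5)]
```

Seeding the generator (or deriving the offset from the object id) keeps each flag in the same spot between page loads, which is usually what you want on a map.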
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <termkindAbbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
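A small sketch of what inverse pairs and self-inverse relations mean for the data: every asserted relation implies a second triple in the opposite direction. The relation names below are hypothetical stand-ins, not actual GVP properties, and this is plain Python rather than the repository's inference rules.

```python
# Hypothetical relation names, for illustration only
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}   # owl:inverseOf pair
SYMMETRIC = {"ex:relatedTo"}                    # owl:SymmetricProperty

def with_inverses(triples):
    """Return the input triples plus the triples implied by inverse/symmetric props."""
    both_ways = dict(INVERSE_OF)
    both_ways.update({inv: prop for prop, inv in INVERSE_OF.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in SYMMETRIC:
            result.add((o, p, s))             # self-inverse: same prop, swapped ends
        elif p in both_ways:
            result.add((o, both_ways[p], s))  # paired inverse
    return result

asserted = {("ulan:A", "ex:teacherOf", "ulan:B"),
            ("ulan:A", "ex:relatedTo", "ulan:C")}
closed = with_inverses(asserted)
```

Declaring the pairs once (e.g. in a spreadsheet) and generating both directions avoids hand-maintaining two copies of every relation.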
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
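The idea can be sketched roughly as follows, using the sample record above. The field names come from the record; the output structure, the identifiers and the gender inference rule are my own illustrative assumptions, not the EHRI implementation.

```python
def derive_persons(rec):
    """Create Person records for everyone mentioned in one source record."""
    main_id = f"rec{rec['id']}"
    persons = {main_id: {"name": f"{rec['firstName']} {rec['lastName']}",
                         "gender": rec.get("gender")}}
    relations = []
    # field in the source record -> relation of the mentioned person to the main person
    roles = {"nameSpouse": "spouseOf", "nameChild": "childOf", "nameSibling": "siblingOf"}
    for field, relation in roles.items():
        name = rec.get(field)
        if not name:
            continue
        pid = f"{main_id}/{relation}"  # hypothetical identifier scheme
        person = {"name": name}
        if field == "nameSpouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            # A likely (not certain) inference, flagged as such
            if rec.get("gender") == "Male":
                person["gender"] = "Female (inferred)"
        persons[pid] = person
        relations.append((pid, relation, main_id))
    return persons, relations

record = {"id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons, relations = derive_persons(record)
```

The derived persons can then be matched against other Person records, e.g. "Maria Smith" with maiden name "Matienzo" against a record for "Maria Matienzo".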
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
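For readers new to word2vec, a toy sketch of the two operations above: ranking neighbours by cosine (I read the "Cos dist" column as cosine similarity, where higher means closer, which is an assumption) and the vector arithmetic behind semantic differencing. The 4-dimensional vectors are invented so the arithmetic is easy to follow; real embeddings are learned from the interview corpus and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(vectors, a, b, c):
    """Word whose vector is closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Invented toy vectors; dimensions: [soviet, german, organisation, leader]
TOY = {
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 1.0, 0.0],
    "guard":  [0.2, 0.2, 0.5, 0.1],
}
answer = analogy(TOY, "KGB", "Stalin", "Hitler")
```

Subtracting "Stalin" removes the Soviet and leader components, adding "Hitler" puts back leader plus German, and what remains is "German organisation".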
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
35 LIBRARY METADATA MARC
MARC is 50 years old, unreadable, and doesn't accommodate new FRBR principles. MARC-XML is not much better.
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002:
marc-must-die.info, FutureLib, Facebook group: in-depth discussion, wiki
Presentation MARC is dead (is it really?) by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model.
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience).
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info.
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork.
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation.
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana.
Europeana Data Quality Committee formed to push this strategic point (2015-2020).
Evolving specification (since 2009).
Currently considering actual implementation of Events.
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee).
Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom).
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version including Kindle), Graphical Representation (or continuous HTML version including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
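The short-cut vs long-path construct can be sketched as a rewrite over triples: the direct link from thing to dimension is expanded into a path through an intermediate measurement event, which then has room for extra details (who measured, when). The node identifiers and the naming of the intermediate node below are illustrative assumptions; only the CRM property and class names are taken from the standard.

```python
def expand_dimension_shortcut(triples):
    """Rewrite crm:P43_has_dimension into the long path via an E16 Measurement."""
    expanded = []
    for n, (s, p, o) in enumerate(triples):
        if p == "crm:P43_has_dimension":
            measurement = f"{s}/measurement/{n}"  # hypothetical node naming
            expanded += [
                (s, "crm:P39i_was_measured_by", measurement),
                (measurement, "rdf:type", "crm:E16_Measurement"),
                (measurement, "crm:P40_observed_dimension", o),
            ]
        else:
            expanded.append((s, p, o))
    return expanded

shortcut = [("ex:painting1", "crm:P43_has_dimension", "ex:painting1/height"),
            ("ex:painting1", "crm:P2_has_type", "ex:Painting")]
long_path = expand_dimension_shortcut(shortcut)
```

Going the other way (long path to short-cut) loses the measurement details, which is why the long path is the richer of the two.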
43 WEB ANNOTATION (OPEN ANNOTATION OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
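As a sketch of the "Image and region over it" case, here is roughly what such an annotation looks like in the JSON-LD serialization of the W3C Web Annotation model. The annotation id, image URL, region and comment are made-up values, and the helper function is just for illustration.

```python
import json

def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to a rectangle of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # pixel rectangle on the image
            },
        },
    }

# Hypothetical identifiers and content
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting1.jpg",
    (10, 20, 300, 400),
    "Signature of the artist",
)
serialized = json.dumps(anno, indent=2)
```

The same body/target pattern covers the other cases listed above; only the target selector changes (e.g. a text selector for a paragraph instead of a media fragment).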
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
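What "standard API" means in practice: an IIIF Image API request is just a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}, so any compliant viewer can ask any compliant server for a tile. A minimal sketch follows; the server base URL and identifier are made up, and the size default "full" follows Image API 2.x (3.0 uses "max").

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL."""
    # The identifier is percent-encoded so any slashes in it are not read as path separators
    return (f"{base.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")

# Hypothetical server and identifier: a 1024x1024 px region scaled to 512 px wide
tile = iiif_image_url("https://example.org/iiif", "ms 12/f1",
                      region="0,0,1024,1024", size="512,")
whole = iiif_image_url("https://example.org/iiif/", "page1")
```

Deep zoom viewers issue many such region/size requests as the user pans and zooms.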
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations, IIIF and JSON-LD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked: what to add to EDM to better fit FRBRoo?
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix  Ontology                                Used for
bibo    Bibliography Ontology                   Sources
dc      Dublin Core Elements                    common
dct     Dublin Core Terms                       common
foaf    Friend of a Friend ontology             Contributors
iso     ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                   Basic RDF representation
prov    Provenance Ontology                     Revision history
rdf     Resource Description Framework          Basic RDF representation
rdfs    RDF Schema                              Basic RDF representation
schema  Schema.org                              common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System    Basic vocabulary representation
skosxl  SKOS Extension for Labels               Rich labels
wgs     W3C World Geodetic Survey geo           Geo (TGN)
xsd     XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)property, or an ontology value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, which is self-inverse)
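What inverse and symmetric declarations buy you can be shown with a small sketch that materializes the implied triples. The function and the `ex:` property names are hypothetical placeholders, not actual GVP relation names, and a real triple store would do this through its rule engine rather than in application code.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the triples plus those implied by owl:inverseOf and owl:SymmetricProperty.

    inverse_of: dict mapping a property to its declared inverse (one direction is enough)
    symmetric:  iterable of properties that are their own inverse
    """
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})   # inverseOf works both ways
    inv.update({p: p for p in symmetric})                # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result


# Hypothetical relations for illustration only
data = {("ex:potter", "ex:produces", "ex:pottery"),
        ("ex:vase", "ex:distinguishedFrom", "ex:urn")}
inferred = add_inverses(data,
                        inverse_of={"ex:produces": "ex:producedBy"},
                        symmetric=["ex:distinguishedFrom"])
```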
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014)
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
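The first step (creating Person records for the people mentioned) could look roughly like the sketch below. The field names follow the sample record; the output structure, the `derive_persons` function and the spouse-gender inference are assumptions for illustration and not the EHRI implementation. Matching against other records is left out.

```python
def derive_persons(rec):
    """Create a Person dict for the record subject and for each relative named in it."""
    rec_id = rec["id"]
    persons = [{"name": f'{rec["firstName"]} {rec["lastName"]}',
                "gender": rec.get("gender"),
                "role": "subject",
                "source": rec_id}]
    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "role": "spouse", "source": rec_id}
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        # Likely inference (an assumption, flagged as such): opposite gender of the subject
        opposite = {"Male": "Female", "Female": "Male"}
        if rec.get("gender") in opposite:
            spouse["gender"] = opposite[rec["gender"]]
            spouse["genderInferred"] = True
        persons.append(spouse)
    for field, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if field in rec:
            persons.append({"name": rec[field], "role": role, "source": rec_id})
    return persons


record = {"id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
          "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
          "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
          "nameSibling": "Jack Jones"}
persons = derive_persons(record)
```

Keeping the `source` record id and an explicit `genderInferred` flag lets a later matching step weigh inferred facts lower than stated ones.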
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
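The vector arithmetic behind "semantic differencing" can be sketched with hand-made toy vectors. Everything below is illustrative: the three-dimensional vectors are invented so that the arithmetic works out, whereas real word2vec embeddings are learned from the interview corpus and have hundreds of dimensions. The scores in the table above appear to be cosine similarities (higher means closer), which is what `cosine` computes.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy dimensions (invented): [Soviet context, Nazi context, security organization]
vectors = {
    "KGB":     [1.0, 0.0, 1.0],
    "Stalin":  [1.0, 0.0, 0.1],
    "Hitler":  [0.0, 1.0, 0.1],
    "SS":      [0.0, 1.0, 1.0],
    "NKVD":    [1.0, 0.0, 0.9],
    "soldier": [0.5, 0.5, 0.3],
}
result = analogy(vectors, "KGB", "Stalin", "Hitler")
```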
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
351 MARC MUST DIE
A whole emotional subculture based on a slogan by Roy Tennant, 2002
MARC is dead (is it really?), in-depth discussion, wiki
marc-must-die.info, FutureLib, Facebook group
Presentation by Sally Chambers, ELAG 2011
4 GLAM ONTOLOGIES
Why do they call conversion to RDF "lifting" and back to some other format "lowering"?
RDF is a simple, abstracted data model
Doesn't have nesting biases like XML (whether a sub-element is nested or referenced by ID). Has fewer syntactic idiosyncrasies (RDF/XML is awful, but there is Turtle for readability or JSON-LD for programmer convenience)
The model is self-describing in a distributed way: if a class/property is looked up, it should return description and info
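The point about nesting bias can be illustrated with a small sketch: "lifting" a nested record and lifting the same data expressed with ID references both yield the same flat set of triples. The `lift` function, the dict layout with an `id` key, and the `ex:` identifiers are assumptions made for this illustration. For brevity the sketch does not distinguish IRIs from literals, which a real RDF library would.

```python
def lift(node, triples=None):
    """Flatten a nested dict (with an 'id' key per node) into a set of triples."""
    triples = set() if triples is None else triples
    subject = node["id"]
    for prop, value in node.items():
        if prop == "id":
            continue
        for v in (value if isinstance(value, list) else [value]):
            if isinstance(v, dict):
                triples.add((subject, prop, v["id"]))   # link to the nested node
                lift(v, triples)                        # then lift the nested node itself
            else:
                triples.add((subject, prop, v))
    return triples


# Same information, once nested and once referenced by ID
nested = {"id": "ex:painting1", "dc:title": "Portrait",
          "dc:creator": {"id": "ex:hals", "foaf:name": "Frans Hals"}}
referenced = [{"id": "ex:painting1", "dc:title": "Portrait", "dc:creator": "ex:hals"},
              {"id": "ex:hals", "foaf:name": "Frans Hals"}]

from_nested = lift(nested)
from_referenced = set().union(*(lift(n) for n in referenced))
```

Both inputs produce identical triples, which is why the choice between nesting and referencing stops mattering once the data is in RDF.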
41 EUROPEANA DATA MODEL
Model used by the Europeana aggregator (53M objects) and adopted by Digital Public Library of America (DPLA). Based on:
OAI ORE (Open Archives Initiative Object Reuse & Exchange): organizing object metadata and digital representations (WebResources)
Dublin Core: descriptive metadata
SKOS (Simple Knowledge Organization System): conceptual objects (concepts, agents, etc.)
CIDOC-CRM: inspired events, some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc.
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
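The short-cut vs long-path construct can be sketched as a rewrite over triples: each crm:P43_has_dimension statement is expanded into an intermediate E16_Measurement node, to which further details (who measured, when) could then be attached. The `expand_p43` function, the `ex:` node identifiers and the triple-as-tuple representation are assumptions made for this sketch.

```python
import itertools


def expand_p43(triples):
    """Replace each P43 short-cut with the long path through an E16_Measurement node."""
    counter = itertools.count(1)
    expanded = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            m = f"ex:measurement{next(counter)}"   # hypothetical minted node id
            expanded += [(s, "crm:P39i_was_measured_by", m),
                         (m, "rdf:type", "crm:E16_Measurement"),
                         (m, "crm:P40_observed_dimension", o)]
        else:
            expanded.append((s, p, o))
    return expanded


short_cut = [("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
             ("ex:painting1", "crm:P2_has_type", "ex:Painting")]
long_path = expand_p43(short_cut)
```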
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. E.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
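As an illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as JSON-LD, using the W3C annotation context and a media-fragment selector. The helper function and all example.org URLs are hypothetical; only the vocabulary terms (Annotation, TextualBody, FragmentSelector) come from the Web Annotation Data Model.

```python
import json


def make_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation (JSON-LD dict) commenting on a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = make_region_annotation("http://example.org/anno1",
                              "http://example.org/image1.jpg",
                              (100, 100, 300, 200),
                              "Signature of the artist")
serialized = json.dumps(anno, indent=2)
```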
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
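What makes the image API "standard" is a fixed URL pattern: identifier, region, size, rotation, then quality and format. A minimal sketch of a URL builder follows. The base URL and identifier are hypothetical, and the default `size="full"` follows version 2 of the IIIF Image API (version 3 uses `max` instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path components."""
    # The identifier must be URL-encoded, including any slashes it contains
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Crop a 300x200 region at (100,100) and scale it to 150 pixels wide
url = iiif_image_url("https://example.org/iiif", "abc/def",
                     region="100,100,300,200", size="150,")
```

Because every compliant server understands the same pattern, a viewer can request tiles or crops from any collection without server-specific code.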
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core and marc2rdf/rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_restnr_749919
  rdf:type bibo:Document, fabio:Manifestation
  dc:title About time
  dtitleURLized about_time
  fabio:hasSubtitle Einsteins unfinished revolution
  ctag:tagged d_keywordimaginary, d_keyworddilation, d_keywordtime, d_keywordtidsreiser, d_keywordtidsdilatasjon
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>
  dc:language lexvoeng
  dbibliofilID 931138
  dc:format <http://data.deichman.no/format/Book>
  dlocation_signature Dav
  dc:publisher d_orgpenguin
  bibo:numPages 316
  dphysicalDescription fig
  dbibsubject d_subjecteinstein_albert, d_subjecttid_metafysikk
  fabio:isManifestationOf d_workx24918900_about_time
  dsignatureNote 07x0619gq
  dbindingInfo <http://data.deichman.no/bindingInfo/h>
  dbsID 0181541
  dc:description Bibliografi s 293-294 (no)
  dpriceInfo Nkr 17000
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>
  dc:identifier 749919
  ddewey 115 53011
  dlocation_dewey 53011
  bibo:isbn 9780140174618, 0140174613
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments: documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy, etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). Prosopography is Greek for Facebook (SNAP:DRGN project, 2015)
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
4 GLAM ONTOLOGIESWhy do they call conversion to RDF lifting and back to some other format lowering
RDF is a simple abstracted data modelDoesnt have nesting biases like XML whether a sub-element is nested or referencedby ID Has less syntactic idiosyncrasies(RDFXML is awful but there is Turtle for readability or JSONLD for programmerconvenience)The model is self-describing in a distributed way if a classproperty is looked upshould return description and info
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving
About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing). Specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type
135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
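The short-cut vs long-path construct can be illustrated with a minimal sketch. Triples are plain Python tuples here, and the node names (ex:coin1, ex:measurement1) are made up for illustration. The function replaces each crm:P43_has_dimension triple with the equivalent path through an explicit crm:E16_Measurement node, to which further details (who measured, when) could then be attached.

```python
from itertools import count

P43 = "crm:P43_has_dimension"
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"


def expand_shortcut(triples):
    """Replace each P43 short-cut triple with the long path via a Measurement node."""
    ids = count(1)
    expanded = []
    for s, p, o in triples:
        if p == P43:
            # Hypothetical naming scheme for the intermediate node
            m = f"ex:measurement{next(ids)}"
            expanded.append((s, P39I, m))
            expanded.append((m, "rdf:type", "crm:E16_Measurement"))
            expanded.append((m, P40, o))
        else:
            expanded.append((s, p, o))
    return expanded


if __name__ == "__main__":
    for triple in expand_shortcut([("ex:coin1", P43, "ex:diameter1")]):
        print(triple)
```

The short-cut is kept when only the dimension matters. The long path is used when the measurement event itself needs to be described.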
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
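As a rough illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD dictionary, with a textual body and a target that selects a rectangular region of an image through a media-fragment selector. The annotation id, image URL and comment are invented for the example. The keys follow the W3C Web Annotation JSON-LD context.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation (JSON-LD) commenting on a rectangular image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


if __name__ == "__main__":
    anno = image_region_annotation(
        "http://example.org/anno/1",          # hypothetical annotation URL
        "http://example.org/images/hals.jpg",  # hypothetical image URL
        (10, 20, 300, 400),
        "Signature in the lower corner",
    )
    print(json.dumps(anno, indent=2))
```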
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
http://iiif.io: standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
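To give a feel for the IIIF Image API, here is a small sketch that composes an image request URL following the API's URI pattern {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The server base URL and identifier are placeholders. The default size "full" assumes Image API 2.x (version 3 uses "max" instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API request URL from its path segments."""
    # Identifiers may contain slashes, which must be percent-encoded
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


if __name__ == "__main__":
    # Hypothetical server and identifier: a 500x300 region scaled to 250 px wide
    print(iiif_image_url("https://example.org/iiif", "book1/page1",
                         region="100,100,500,300", size="250,"))
```

Because region and size are part of the URL, a viewer can request only the tiles it needs at the current zoom level.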
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BibFrame: sponsored by LoC but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA). Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses)
Numeric prop names, but lexical (natural language) names are also supported
Serves many semantic formats
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core and rdf2marc conversions (see also marc2rdf)
Pragmatic data model that reuses several ontologies and adds own props
Enables a number of agile apps, eg search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks etc. Eg Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). Eg Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true
Mix-n-Match is a collaborative tool to create coreferences
234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy etc etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015 SNAPDRGN project)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH and a few software projects, including
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM: search, data & image annotation, data basket etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, eg Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A, Tolosi L and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V, DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi, x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
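For readers unfamiliar with marker clustering, the idea can be sketched as simple grid bucketing: points falling in the same latitude/longitude cell are merged into one marker carrying a count. This is only an illustration of the general technique, not the algorithm used by the Cluster Mapper library. The cell size and sample coordinates are arbitrary assumptions.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into grid cells; return one centroid marker per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    # Biggest clusters first
    return sorted(clusters, key=lambda c: -c["count"])


if __name__ == "__main__":
    sample = [(42.1, 25.2), (42.3, 25.4), (48.8, 2.3)]  # made-up coordinates
    for cluster in grid_clusters(sample):
        print(cluster)
```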
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
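Jittering can be sketched as adding a small random displacement to each marker that shares the same coordinates. The sketch below picks a random distance and bearing within a disc of a given radius. The radius (50 km), the approximate centre used in the usage example, and the flat "111 km per degree" approximation are all assumptions for illustration, not the values used in the EFD application.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Return (lat, lon) displaced by a random offset of at most max_km."""
    # sqrt makes the points uniformly dense over the disc instead of bunched at the centre
    distance = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Roughly 111 km per degree of latitude; longitude degrees shrink with cos(latitude)
    dlat = distance * math.cos(bearing) / 111.0
    dlon = distance * math.sin(bearing) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


if __name__ == "__main__":
    rng = random.Random(0)
    centre = (42.7, 25.5)  # approximate centre of Bulgaria, for illustration only
    for _ in range(3):
        print(jitter(*centre, rng=rng))
```

Passing a seeded `random.Random` keeps the marker positions stable between page loads, which may be preferable for users.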
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V, Lindenthal J and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™
Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
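What the owl:inverseOf and owl:SymmetricProperty declarations buy can be shown with a minimal sketch: for every asserted relation, a reasoner also derives the relation in the opposite direction. The property and concept names below are hypothetical placeholders, not actual GVP identifiers, and a real triple store would do this through its rule engine rather than in application code.

```python
def add_inverses(triples, inverse_of, symmetric=()):
    """Return the triples plus those implied by inverse and symmetric declarations."""
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})  # owl:inverseOf works both ways
    inv.update({p: p for p in symmetric})               # symmetric props are self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result


if __name__ == "__main__":
    asserted = [
        ("ex:conceptA", "ex:produces", "ex:conceptB"),         # hypothetical relation
        ("ex:conceptA", "ex:distinguishedFrom", "ex:conceptC"),  # hypothetical relation
    ]
    inferred = add_inverses(asserted,
                            inverse_of={"ex:produces": "ex:producedBy"},
                            symmetric=["ex:distinguishedFrom"])
    for triple in sorted(inferred):
        print(triple)
```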
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V, Cobb J, Garcia G, Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). Eg
AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts etc)
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like this (diagram)
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after"
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05. Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
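The first step of this idea (creating Person records for the people mentioned) might look roughly like the sketch below. The field names follow the sample record. The output structure, the naive "last word is the surname" split, and the choice to copy the marriage date onto the spouse are simplifying assumptions, and the matching step against the database is not shown.

```python
def derive_persons(rec):
    """Create stub Person records for the relatives mentioned in a person record."""
    persons = []

    def add(full_name, relation, **extra):
        # Naive split: everything before the last space is treated as the first name
        first, _, last = full_name.rpartition(" ")
        persons.append({"firstName": first, "lastName": last,
                        "relation": relation, "relatedTo": rec["id"], **extra})

    if "nameSpouse" in rec:
        extra = {}
        if "nameSpouseMaiden" in rec:
            extra["maidenName"] = rec["nameSpouseMaiden"]
        if "dateMarriage" in rec:
            extra["dateMarriage"] = rec["dateMarriage"]  # likely inference: same marriage
        add(rec["nameSpouse"], "spouse", **extra)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            add(rec[key], relation)
    return persons


if __name__ == "__main__":
    record = {"id": 123456, "firstName": "John", "lastName": "Smith", "gender": "Male",
              "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
              "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
              "nameSibling": "Jack Jones"}
    for person in derive_persons(record):
        print(person)
```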
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
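The two operations behind these results, ranking neighbours by cosine and "semantic differencing" by vector arithmetic, can be sketched in a few lines. The 3-dimensional vectors below are invented toy values chosen so the analogy works. Real word2vec vectors have hundreds of dimensions and are learned from the interview text. The figures in the table appear to be cosine similarities (higher means closer), which is what the sketch computes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors (made up): axes loosely mean (security force, Soviet, Nazi)
TOY = {
    "KGB": (1.0, 1.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0),
    "Hitler": (0.0, 0.0, 1.0),
    "SS": (1.0, 0.0, 1.0),
    "police": (1.0, 0.2, 0.2),
}


def analogy(a, b, c, vectors=TOY):
    """Return the word closest to vector(a) - vector(b) + vector(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


if __name__ == "__main__":
    print(analogy("KGB", "Stalin", "Hitler"))
```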
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames, so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
41 EUROPEANA DATA MODELModel used by the Europeana aggregator (53M objects) and adopted by Digital PublicLibrary of America (DPLA) Based on
OAI ORE (Open Archives Initiative Object Reuse amp Exchange) organizing objectmetadata and digital representations (WebResources)Dublin Core descriptive metadataSKOS (Simple Knowledge Organization System) conceptual objects (conceptsagents etc)CIDOC-CRM inspired events some relations between objects
411 EDM SEMANTIC GRAPH
412 EDM ISSUESCONSIDERATIONS
Criticized that its not expressive enough Eg cant capture the speci c contribution ofan artist to artworkComplication splits info about an object
EDM External (form provider) edmProvidedCHO and oreAggregationEDM Internal (at Europeana) edmProvidedCHO and 2 ltoreAggregationoreProxygt pairs
Many providers use the minimal features and make mistakes Europeana didnt do alot of validation
Old objects retro-converted from ESE are poor (only text) though someenrichments added by Europeana
formed to push this strategic point (2015-2020)Europeana Data Quality Committee
Evolving speci cation (since 2009)
Currently considering actual implementation of EventsExtensions for manuscripts music fashion etc
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
- ConservationSpace: line-of-business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint, one of the biggest CH RDF collections (917M triples)
- As part of RS, we developed (together with BM) a mapping of BM data (2M objects) using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
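To make the "easy with SPARQL" point concrete, here is a minimal sketch of a per-year count. The query shape is an assumption, not the query behind the chart: the `dc:type "newspaper"` literal and the use of `dcterms:issued` on the proxy are hypothetical choices about how newspaper issues might be described. The helper only parses the standard SPARQL 1.1 JSON results format, so it runs offline on a hand-made sample.

```python
import json

# Hypothetical query: the dc:type literal and dcterms:issued are assumptions
# about the data layout, not the actual Europeana modelling.
NEWSPAPERS_BY_YEAR = """
PREFIX ore:     <http://www.openarchives.org/ore/terms/>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(DISTINCT ?cho) AS ?n) WHERE {
  ?proxy ore:proxyFor ?cho ;
         dc:type "newspaper" ;
         dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def counts_by_year(sparql_json):
    """Turn a SPARQL 1.1 JSON result (as text) into a {year: count} dict."""
    bindings = json.loads(sparql_json)["results"]["bindings"]
    return {int(b["year"]["value"]): int(b["n"]["value"]) for b in bindings}


# A tiny hand-made response in the SPARQL JSON results format (made-up numbers).
sample = json.dumps({
    "head": {"vars": ["year", "n"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "1914"},
         "n": {"type": "literal", "value": "120"}},
        {"year": {"type": "literal", "value": "1915"},
         "n": {"type": "literal", "value": "95"}},
    ]},
})
chart_data = counts_by_year(sample)
```

The resulting dict can be handed to any charting library; the aggregation itself is done by the endpoint through `GROUP BY`.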
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, and NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
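Jittering can be sketched in a few lines. This is an illustration under assumptions, not the EFD code: the 50 km radius, the `jitter` helper and the flat-earth approximation of about 111.32 km per degree are all choices made for the example.

```python
import math
import random

KM_PER_DEG = 111.32  # approximate length of one degree of latitude, in km


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most radius_km."""
    # sqrt() spreads the points evenly over the disc instead of piling
    # them up near the centre.
    r = radius_km * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dlat = r * math.cos(theta) / KM_PER_DEG
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = r * math.sin(theta) / (KM_PER_DEG * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Three objects that all carry the same country-level coordinates
# (roughly the centre of Bulgaria) get three different flags.
flags = [jitter(42.7, 25.5) for _ in range(3)]
```

Passing a seeded `random.Random` as `rng` makes the layout reproducible, so flags don't jump around between page loads.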
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
- Center of the LODLAM cloud (diagram by J. Cobb, 2014)
- GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc.
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:
Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
- Custom sub-class
- Custom (sub-)prop
- Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
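What the inverse/symmetric declarations buy you can be sketched without a reasoner. The snippet below is only an illustration: the `ex:` property names and the `add_inverses` helper are hypothetical, and in the real setup this inference would be left to the semantic repository rather than hand-written code.

```python
def add_inverses(triples, inverse_pairs, symmetric=()):
    """Return triples plus the statements implied by inverse/symmetric props."""
    inverse = {}
    for p, q in inverse_pairs:      # owl:inverseOf works in both directions
        inverse[p] = q
        inverse[q] = p
    for p in symmetric:             # a symmetric property is its own inverse
        inverse[p] = p
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical associative relations between two artists and two concepts.
asserted = {
    ("ex:artist1", "ex:teacher_of", "ex:artist2"),
    ("ex:concept1", "ex:related_to", "ex:concept2"),
}
closed = add_inverses(
    asserted,
    inverse_pairs=[("ex:teacher_of", "ex:student_of")],
    symmetric=["ex:related_to"],
)
```

Only one direction of each relation has to be recorded in the spreadsheet; the other direction follows mechanically.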
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
- Europeana uses AAT to enrich type/subject/material fields
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after":
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale, and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456
  firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
  additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
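The idea can be sketched as follows. This is a toy illustration, not the EHRI pipeline: the `expand_record` helper, the id scheme and the two "likely inferences" (the maiden name is the spouse's surname at birth; the marriage date applies to both partners) are assumptions made for the example. Matching against other Person records is left out.

```python
def expand_record(rec):
    """Split one source record into Person stubs plus relations between them."""
    subject_id = f'{rec["id"]}/subject'
    persons = {
        subject_id: {
            "name": f'{rec["firstName"]} {rec["lastName"]}',
            "gender": rec.get("gender"),
        }
    }
    relations = []
    # Every "additional name" becomes a Person stub linked to the subject.
    for field, relation in (("nameSpouse", "spouse"),
                            ("nameChild", "child"),
                            ("nameSibling", "sibling")):
        if field in rec:
            pid = f'{rec["id"]}/{relation}'
            persons[pid] = {"name": rec[field]}
            relations.append((subject_id, relation, pid))
    # Likely inferences (assumptions of this sketch):
    spouse = persons.get(f'{rec["id"]}/spouse')
    if spouse is not None:
        if "nameSpouseMaiden" in rec:
            spouse["birthSurname"] = rec["nameSpouseMaiden"]
        if "dateMarriage" in rec:
            spouse["dateMarriage"] = rec["dateMarriage"]
            persons[subject_id]["dateMarriage"] = rec["dateMarriage"]
    return persons, relations


record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons, relations = expand_record(record)
```

Each stub carries only what the source record supports, so a later matching step can compare it against fuller Person records without mistaking inferred values for evidence.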
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
- ONTO: place enrichment, person name recognition
- INRIA: word2vec experiments
guard     Cos dist  punishment   Cos dist
guarding  0.593507  punishments  0.668144
sentry    0.512083  punish       0.601212
hlinka    0.496201  punishing    0.543213
gate      0.490032  beatings     0.527033
watching  0.484647  penalty      0.497262
rifle     0.484379  deserved     0.490157
lookout   0.482025  beaten       0.473870
patrol    0.477233  straf        0.473338
soldier   0.475982  offense      0.461230
guarded   0.474689  executing    0.459965
police    0.474291  merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
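Both the neighbour lists and the "semantic differencing" trick reduce to cosine comparisons between word vectors. Here is a minimal sketch with hand-made 3-dimensional toy vectors: these are assumptions chosen so the arithmetic is easy to follow, not vectors from the INRIA model, which would have hundreds of dimensions learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (made up): axis 1 = "Soviet", axis 2 = "Nazi", axis 3 = "organization".
TOY = {
    "KGB":    (1.0, 0.0, 1.0),
    "Stalin": (1.0, 0.0, 0.0),
    "Hitler": (0.0, 1.0, 0.0),
    "SS":     (0.0, 1.0, 1.0),
    "guard":  (0.3, 0.3, 0.3),
}
answer = analogy(TOY, "KGB", "Stalin", "Hitler")
```

With these toy vectors the difference KGB - Stalin isolates the "organization" axis, and adding Hitler lands on the vector for SS. Note that the values in the table above behave like similarities: the higher the number, the closer the word.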
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
- Art History networks from Wikipedia, through VIAF id
- Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
- Numishare: Data platform for coins/medals, 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint:
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
4.1.1 EDM SEMANTIC GRAPH
4.1.2 EDM ISSUES/CONSIDERATIONS
- Criticized that it's not expressive enough. E.g. can't capture the specific contribution of an artist to an artwork
- Complication: splits info about an object
  - EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
  - EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
- Many providers use the minimal features and make mistakes. Europeana didn't do a lot of validation
- Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
- Europeana Data Quality Committee formed to push this strategic point (2015-2020)
- Evolving specification (since 2009)
- Currently considering actual implementation of Events
- Extensions for manuscripts, music, fashion, etc.
4.2 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc., by CIDOC (ICOM documentation committee). Standardized as ISO 21127:2014, still evolving. About 85 classes; fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual.
4.2.1 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses); prop hierarchy (see - - - at bottom).
4.2.2 CIDOC GRAPHICAL EXAMPLES
- Video Tutorial and Graphical Representation (HTML version, or continuous HTML version including Kindle): essential to understand how to apply CRM in various situations
- Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
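The short-cut vs long-path construct can be shown on plain triples. A minimal sketch, assuming triples are Python tuples of prefixed names: the `expand_p43` helper and the blank-node naming are hypothetical, while the CRM names (P43, P39i, P40, E16) are the standard ones.

```python
from itertools import count


def expand_p43(triples):
    """Rewrite each P43_has_dimension short-cut into the long path via E16_Measurement."""
    ids = count(1)
    out = []
    for s, p, o in triples:
        if p == "crm:P43_has_dimension":
            m = f"_:measurement{next(ids)}"  # hypothetical node for the measurement event
            out.append((s, "crm:P39i_was_measured_by", m))
            out.append((m, "rdf:type", "crm:E16_Measurement"))
            out.append((m, "crm:P40_observed_dimension", o))
        else:
            out.append((s, p, o))
    return out


short = [
    ("ex:painting1", "crm:P43_has_dimension", "ex:height1"),
    ("ex:painting1", "crm:P2_has_type", "ex:Painting"),
]
long_path = expand_p43(short)
```

The extra measurement node is where the "more details" go: who measured the object, when, and by which method.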
4.3 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
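As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as JSON-LD. The `region_annotation` helper, the example.org URLs and the comment text are assumptions; the keys (`body`, `target`, `FragmentSelector`, `xywh=`) follow the W3C Web Annotation Data Model and Media Fragments.

```python
import json


def region_annotation(anno_id, image_url, xywh, comment):
    """Build a Web Annotation that attaches a text comment to an image region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",  # region in pixels
            },
        },
    }


# Hypothetical identifiers, for illustration only.
anno = region_annotation(
    "http://example.org/anno/1",
    "http://example.org/images/painting.jpg",
    (10, 20, 300, 150),
    "Signature of the artist",
)
anno_json = json.dumps(anno, indent=2)
```

The other pairs listed above (document and translation, paragraph and commentary) use the same body/target shape with different selectors.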
4.4 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers: http://iiif.io
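The core of the IIIF Image API is a URL pattern: identifier, region, size, rotation, quality and format. A minimal sketch of building such a URL follows; the `iiif_image_url` helper and the example.org server are assumptions, and the defaults request the full image at the maximum available size.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier is percent-encoded so that a "/" inside it is not
    # mistaken for a path separator.
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier.
whole = iiif_image_url("https://example.org/iiif", "book1/page 2")
# A 300x400 px detail starting at (100,200), scaled to 600 px wide.
detail = iiif_image_url("https://example.org/iiif", "book1/page 2",
                        region="100,200,300,400", size="600,")
```

A deep-zoom viewer issues many such region requests as the user pans and zooms, which is why any compliant server can feed any compliant viewer.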
4.4.1 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD.
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDA (rdaregistry.info): basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized.
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
4.5.3 A TASTE OF FRBROO
The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
- FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K. Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf / rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Contexts (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
- Progress report (2015); Conceptual Model 1.0 (Sep 2016); Mlist for comments
- Conceptual Model: document key components of archival description, properties of each, relations between them
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
- Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this:
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
- 2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
- 2015-01: Wikidata Project Authority Control (initiated by Ontotext)
- 2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names, with conclusions on the best sources for name enrichment
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation (Medallic Art of the Great War)
6134 NOMISMA
Shared authorities for numismatics. Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
412 EDM ISSUES/CONSIDERATIONS
Criticized that it's not expressive enough. Eg can't capture the specific contribution of an artist to an artwork
Complication: splits info about an object:
EDM External (from provider): edm:ProvidedCHO and ore:Aggregation
EDM Internal (at Europeana): edm:ProvidedCHO and 2 <ore:Aggregation, ore:Proxy> pairs
Many providers use only the minimal features and make mistakes. Europeana didn't do a lot of validation
Old objects retro-converted from ESE are poor (only text), though some enrichments were added by Europeana
Europeana Data Quality Committee formed to push this strategic point (2015-2020)
Evolving specification (since 2009)
Currently considering actual implementation of Events
Extensions for manuscripts, music, fashion, etc
42 CIDOC CRM
CIDOC CRM: comprehensive reference model used for history, historic events, archaeology, museum data, etc. By CIDOC (ICOM documentation committee)
Standardized as ISO 21127:2014, still evolving. About 85 classes. Fundamental branches: Persistent (endurant) vs Temporal (perdurant), Physical vs Conceptual
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crm:E24_Physical_Man-Made_Thing); specific things (eg Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (eg crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
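The short-cut vs long-path construct can be sketched over plain triples: whenever the long path (thing measured by a measurement that observed a dimension) is present, the short-cut triple follows. This is an illustrative sketch only; the `ex:` resources are made up, and a real system would do this with a rule engine or SPARQL rather than Python.

```python
P39I = "crm:P39i_was_measured_by"    # thing -> measurement event
P40 = "crm:P40_observed_dimension"   # measurement event -> dimension
P43 = "crm:P43_has_dimension"        # thing -> dimension (the short-cut)


def infer_shortcuts(triples):
    """Derive P43 short-cut triples from P39i / P40 long paths."""
    observed = {}
    for s, p, o in triples:
        if p == P40:
            observed.setdefault(s, []).append(o)
    return {
        (thing, P43, dim)
        for thing, p, measurement in triples
        if p == P39I
        for dim in observed.get(measurement, [])
    }


# Hypothetical example data: the long path keeps the measurement event,
# which is where extra details (who measured, when, how) can be attached
TRIPLES = [
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
]
SHORTCUTS = infer_shortcuts(TRIPLES)
```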
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources. Eg Webpage and bookmark, Image and region over it, Document and translation, Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
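As a small illustration of the "Image and region over it" case, the sketch below builds a Web Annotation as a JSON-LD dictionary, with a textual body and a media-fragment selector on the target image. The context URL, `FragmentSelector` and `TextualBody` come from the W3C Web Annotation Data Model; the `example.org` URLs, the comment text and the helper function name are assumptions.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Annotation commenting on a rectangular region (x, y, width, height) of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


ANNO = image_region_annotation(
    "http://example.org/anno/1",            # hypothetical annotation id
    "http://example.org/images/coin.jpg",   # hypothetical image
    (100, 50, 200, 150),
    "Mint mark visible here",
)
ANNO_JSON = json.dumps(ANNO, indent=2)
```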
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
http://iiif.io: standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
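For a feel of what the image API looks like: an IIIF Image API request is just a URL of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, which is what lets viewers fetch tiles for deep zoom. A minimal sketch of a URL builder follows; the server base URL and identifier are made up, and the `size="full"` default follows Image API 2.x (later versions use `max`).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL; the identifier is percent-encoded."""
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier: a 512x512 tile from the top-left corner,
# scaled to 256 pixels wide
TILE_URL = iiif_image_url("https://example.org/iiif", "ms/f1",
                          region="0,0,512,512", size="256,")
```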
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See eg Rob Sanderson's presentations on IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (eg citation intent)
BIBFRAME: sponsored by LoC but soundly criticized for modeling mistakes
RDARegistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
Schema BibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (eg appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats
453 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, eg on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf, rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, eg search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig.";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi: s. 293-294"@no;
  d:priceInfo "Nkr 170.00";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Conceptual Model 1.0 (Sep 2016), Mlist for comments. Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. Eg Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). Eg Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2)
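A hunt like this boils down to a query against the Wikidata Query Service. The sketch below only builds the SPARQL text (no network call): paintings (`wdt:P31 wd:Q3305213`) in a given collection (`wdt:P195`) that lack an inventory number (`wdt:P217`). The function name is made up, and the example item id Q731126 is assumed to be the J. Paul Getty Museum, so check it before use. The actual project pages may use different queries.

```python
def missing_inventory_query(collection_qid: str, limit: int = 100) -> str:
    """SPARQL for paintings in a collection that have no inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return (
        "SELECT ?painting WHERE {\n"
        "  ?painting wdt:P31 wd:Q3305213 .   # instance of: painting\n"
        f"  ?painting wdt:P195 wd:{collection_qid} .   # collection\n"
        "  FILTER NOT EXISTS { ?painting wdt:P217 ?inv }   # no inventory number\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


# Assumed item id for the Getty Museum (verify on Wikidata before relying on it)
QUERY = missing_inventory_query("Q731126", limit=10)
```

The `wd:` and `wdt:` prefixes are predefined on the Wikidata Query Service, so the text can be pasted there as-is.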
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, eg Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
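To give a flavour of what such rules compute, here is a toy forward-chaining sketch for a relation like "Thing from Place": follow a property chain from the object to a place, then walk up the place hierarchy. The chain used here (P108i_was_produced_by, P7_took_place_at, P89_falls_within) is only an assumed example of one path; the actual ResearchSpace rule set covers many more paths and is written as GraphDB rules, not Python.

```python
def thing_from_place(triples):
    """Infer (thing, place) pairs via production event, its place, and enclosing places."""
    produced_by = {(s, o) for s, p, o in triples if p == "crm:P108i_was_produced_by"}
    took_place = {(s, o) for s, p, o in triples if p == "crm:P7_took_place_at"}
    within = {(s, o) for s, p, o in triples if p == "crm:P89_falls_within"}

    # Step 1: chain thing -> production event -> place
    result = {(thing, place)
              for thing, event in produced_by
              for event2, place in took_place if event == event2}
    # Step 2: propagate up the place hierarchy until nothing new is inferred
    while True:
        new = {(thing, parent)
               for thing, place in result
               for child, parent in within if place == child} - result
        if not new:
            return result
        result |= new


# Hypothetical data
TRIPLES = [
    ("ex:drawing1", "crm:P108i_was_produced_by", "ex:production1"),
    ("ex:production1", "crm:P7_took_place_at", "ex:Amsterdam"),
    ("ex:Amsterdam", "crm:P89_falls_within", "ex:Netherlands"),
    ("ex:Netherlands", "crm:P89_falls_within", "ex:Europe"),
]
FROM_PLACE = thing_from_place(TRIPLES)
```

The intermediate sets play the role of the "gray" intermediate props in the dependency diagram: inputs feed them, and the final relation is what search uses.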
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several million) by year: can't do this using the Europeana API, but it's easy with SPARQL
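A sketch of the kind of aggregation meant here: a SPARQL 1.1 `GROUP BY` over years. The query text is an assumption about how the data could be selected (the use of `dc:type` containing "newspaper" and of `dc:date` is a guess, not the query behind the chart); the small Python function does the same count client-side, which is handy for checking a sample of results.

```python
from collections import Counter

# Assumed query shape; property choices are illustrative, not taken from the Europeana endpoint
NEWSPAPERS_BY_YEAR_QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:type ?type ; dc:date ?date .
  FILTER(CONTAINS(LCASE(STR(?type)), "newspaper"))
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
""".strip()


def count_by_year(dates):
    """Count date strings by their leading 4-digit year, ignoring unparseable values."""
    years = (d[:4] for d in dates if len(d) >= 4 and d[:4].isdigit())
    return sorted(Counter(years).items())


SAMPLE_COUNTS = count_by_year(["1914-08-01", "1914-08-02", "1915", "n.d."])
```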
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev, A., Tolosi, L. and Alexiev, V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev, V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
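Jittering is simple enough to show in a few lines: add a small random offset to each object's coordinates so that markers sharing one point spread out. This is a generic sketch, not the EFD code; the jitter radius and the coordinates used for the center of Bulgaria are rough assumptions.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a uniform random offset of at most radius_deg per axis."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


# Approximate center of Bulgaria (assumption); seeded generator for repeatable output
BG_CENTER = (42.7, 25.5)
_rng = random.Random(0)
FLAGS = [jitter(*BG_CENTER, radius_deg=0.5, rng=_rng) for _ in range(100)]
```

A uniform square offset is the crudest choice; scaling the radius to the size of the place (country vs city) keeps jittered markers inside the area they belong to.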
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J. and Isaac, A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies:

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (eg <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
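What the inverse/symmetric declarations buy you can be sketched in a few lines: given one direction of a relation, the other direction follows. The property names below are hypothetical stand-ins, not actual GVP relation names, and in the real setup a reasoner (eg GraphDB) does this from the owl:inverseOf and owl:SymmetricProperty axioms.

```python
def add_inverses(triples, inverse_of, symmetric=()):
    """Return triples plus the statements implied by inverse and symmetric properties."""
    inverse = dict(inverse_of)
    inverse.update({q: p for p, q in inverse_of.items()})   # inverseOf works both ways
    inverse.update({p: p for p in symmetric})               # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical relation names and resources
INVERSE_OF = {"ex:materialFor": "ex:madeOf"}
SYMMETRIC = ["ex:relatedTo"]
CLOSED = add_inverses(
    [("ex:bronze", "ex:materialFor", "ex:statue"), ("ex:a", "ex:relatedTo", "ex:b")],
    INVERSE_OF, SYMMETRIC,
)
```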
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? J.Cobb, 2014. Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
42 CIDOC CRM comprehensive reference model used for history historic events
archaeology museum data etc by CIDOC (ICOM documentation committee)Standardized as ISO 211272014 still evolving About 85 classes fundamentalbranches Persistent (endurant) vs Temporal (perdurant) Physical vs Conceptual
CIDOC CRM
421 CIDOC CRM PROPERTIES
Classes represent abstract things (eg crmE24_Physical_Man-Made_Thing) speci cthings (eg Paintings Coins) are accommodated with crmP2_has_type 135 props (plustheir inverses) prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
(or including Kindle) (or including Kindle) essential to
understand how to apply CRM in various situationsTypical modeling construct short-cut (crmP43_has_dimension) vs long-path (egcrmP39i_was_measured_bycrmP40_observed_dimension) which allows moredetails
Video Tutorial HTML versionGraphical Representation continuous HTML version
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
- Publishing Vocabularies/Thesauri as LOD
- Publishing Museum collections and National Bibliographies as LOD
- Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
- Study of artistic influence over time and space
- Literary traditions, parallel editions
- Poetic repertories
- Studying manuscripts, stemmatology (manuscript derivation)
- Historiography
- Studying charters, prosopography (micro biographies). See "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
- CollectionSpace: museum collection management
- ArchivesSpace: archive management
- ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
- ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals"
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
- GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
- As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
- This mapping was followed by the Yale Center for British Art (YCBA)
- Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
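A minimal sketch of how such a per-year chart could be fed from SPARQL. The query is illustrative only: the choice of `edm:ProvidedCHO` and `dcterms:issued` is an assumption, not the actual Europeana modeling of newspapers. The Python helper relies only on the standard SPARQL 1.1 JSON results layout, and the sample data is hand-made.

```python
from typing import Dict, List, Tuple

# Hypothetical query: the class and date property are assumptions for illustration.
QUERY = """
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item a edm:ProvidedCHO ;
        dcterms:issued ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
} GROUP BY ?year ORDER BY ?year
"""

def counts_by_year(results: Dict) -> List[Tuple[int, int]]:
    """Turn SPARQL JSON results into sorted (year, count) pairs, ready for charting."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append((int(binding["year"]["value"]), int(binding["n"]["value"])))
    return sorted(rows)

# Hand-made sample in the SPARQL 1.1 JSON results format (not real Europeana data).
sample = {"results": {"bindings": [
    {"year": {"type": "literal", "value": "1915"}, "n": {"type": "literal", "value": "7"}},
    {"year": {"type": "literal", "value": "1914"}, "n": {"type": "literal", "value": "3"}},
]}}

print(counts_by_year(sample))
```

The aggregation happens on the endpoint, so the client only receives one row per year.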
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev, A., Tolosi, L. and Alexiev, V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev, V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
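A hierarchical facet can be sketched as a roll-up of object counts along the place hierarchy. The sketch below is an illustration under stated assumptions: the `parent` mapping stands in for the Geonames parent relation, and the place names and counts are made up.

```python
from collections import Counter
from typing import Dict, Iterable

def facet_counts(object_places: Iterable[str], parent: Dict[str, str]) -> Counter:
    """Count each object at its own place and at every ancestor of that place."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = parent.get(node)  # None once we pass the top of the hierarchy
    return counts

# Hypothetical slice of a place hierarchy (child -> parent).
parent = {"Sofia": "Bulgaria", "Plovdiv": "Bulgaria", "Bulgaria": "Europe"}
# One entry per object: the place it was enriched with.
objects = ["Sofia", "Sofia", "Plovdiv", "Bulgaria"]

counts = facet_counts(objects, parent)
print(counts["Bulgaria"], counts["Sofia"])
```

Selecting "Bulgaria" in the facet then also finds objects tagged only with a city inside it.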
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them
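Jittering can be sketched as adding a small random offset to each marker. This is a minimal illustration, not the EFD implementation: the uniform offset, the 0.5 degree limit, the fixed seed and the rough centre coordinates are all assumptions.

```python
import random
from typing import List, Tuple

def jitter_points(lat: float, lon: float, n: int,
                  max_offset_deg: float = 0.5, seed: int = 42) -> List[Tuple[float, float]]:
    """Return n points scattered uniformly in a square around (lat, lon)."""
    rng = random.Random(seed)  # fixed seed so the map looks the same on every load
    return [(lat + rng.uniform(-max_offset_deg, max_offset_deg),
             lon + rng.uniform(-max_offset_deg, max_offset_deg))
            for _ in range(n)]

# Rough centre of Bulgaria, used only for illustration.
flags = jitter_points(42.7, 25.5, n=9000)
print(flags[:3])
```

A fixed seed keeps the scatter stable between page loads; a real map might instead scale the offset to the size of the country or the zoom level.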
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
- Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
- Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
- Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
- GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
- Center of LODLAM cloud (Diagram by J.Cobb, 2014)
- GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
- Semantic/ontology development: http://vocab.getty.edu/ontology
- Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
- Complete mapping specification
- Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
- Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
- GraphDB semantic repo, clustered for high-availability
- Semantic application development (customized Forest user interface) and tech consulting
- SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
- Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
- Sample queries (100), including charts, geographic queries, etc
- Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
- Help desk / support on twitter and google group (see home page)
- Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J. and Isaac, A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation (V.Alexiev, CIDOC 2014)
External Ontologies:
Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
- Custom sub-class
- Custom (sub-)prop
- Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
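What an inverse declaration buys can be sketched in a few lines: for every stated relation, the inverse direction is added too. This is an illustration only; the `ex:` property names are hypothetical and are not actual GVP relations. A symmetric relation is modeled as a property that is its own inverse.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

def add_inverses(triples: Set[Triple], inverse_of: Dict[str, str]) -> Set[Triple]:
    """Return the triples plus the inverse of every triple whose property has one."""
    result = set(triples)
    for subj, prop, obj in triples:
        if prop in inverse_of:
            result.add((obj, inverse_of[prop], subj))
    return result

# Hypothetical relation pairs; a symmetric property maps to itself.
inverse_of = {
    "ex:teacherOf": "ex:studentOf",
    "ex:studentOf": "ex:teacherOf",
    "ex:relatedTo": "ex:relatedTo",
}
stated = {("ex:a", "ex:teacherOf", "ex:b"), ("ex:a", "ex:relatedTo", "ex:c")}
closed = add_inverses(stated, inverse_of)
print(sorted(closed))
```

In a triple store the same effect comes from inference over the owl:inverseOf and owl:SymmetricProperty declarations, so only one direction needs to be stated.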
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:
AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
- Europeana uses AAT to enrich type/subject/material fields
- PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
- In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
- In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
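The idea above can be sketched as follows. This is an illustration under stated assumptions, not EHRI code: the field names follow the sample record, while the output layout, the name splitting and the "opposite gender" inference are assumptions chosen to show what a likely inference could look like.

```python
from typing import Dict, List, Tuple

def split_name(full_name: str) -> Tuple[str, str]:
    """Split 'Maria Smith' into ('Maria', 'Smith'); the last token is taken as last name."""
    first, _, last = full_name.rpartition(" ")
    return first, last

def persons_from_record(rec: Dict[str, str]) -> List[Dict[str, str]]:
    """Create stub Person records for the relatives mentioned in one record."""
    persons = []
    if "nameSpouse" in rec:
        first, last = split_name(rec["nameSpouse"])
        spouse = {"relation": "spouse", "firstName": first, "lastName": last}
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        # Likely inference (an assumption): spouse has the other recorded gender.
        if rec.get("gender") == "Male":
            spouse["gender"] = "Female"
        elif rec.get("gender") == "Female":
            spouse["gender"] = "Male"
        if "dateMarriage" in rec:
            spouse["dateMarriage"] = rec["dateMarriage"]
        persons.append(spouse)
    for key, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            first, last = split_name(rec[key])
            persons.append({"relation": relation, "firstName": first, "lastName": last})
    return persons

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
relatives = persons_from_record(record)
print(relatives)
```

Each stub can then be compared against existing Person records, e.g. on name, maiden name and marriage date.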
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
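Why matching also deduplicates can be shown with a small sketch: spelling variants that normalize to the same gazetteer entry end up with the same identifier. This is an illustration only, not the EHRI pipeline; the gazetteer is a stand-in for Geonames and the numeric ids are hypothetical.

```python
import unicodedata
from typing import Dict, Iterable, Optional

def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())

def match_places(names: Iterable[str], gazetteer: Dict[str, int]) -> Dict[str, Optional[int]]:
    """Map each source name to a gazetteer id, or None when nothing matches."""
    index = {normalize(label): place_id for label, place_id in gazetteer.items()}
    return {name: index.get(normalize(name)) for name in names}

# Hypothetical ids, not real Geonames identifiers.
gazetteer = {"Kraków": 1001, "Terezín": 1002}
source_names = ["Krakow", "KRAKÓW ", "Terezin", "Unknownville"]

matches = match_places(source_names, gazetteer)
distinct_places = {pid for pid in matches.values() if pid is not None}
print(matches, distinct_places)
```

Real place matching would also need alternate names, historical names and disambiguation by country or place hierarchy.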
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
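Both ideas above (nearest words by cosine, and semantic differencing) can be sketched with plain vector arithmetic. The two-dimensional vectors below are made up for illustration and have nothing to do with the trained EHRI model; real word2vec vectors have hundreds of dimensions. The scores in the table read like cosine similarities (higher means closer), which is what the sketch computes.

```python
import math
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 for same direction, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(vectors: Dict[str, List[float]], a: str, b: str, c: str) -> str:
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors (assumed): first axis ~ regime, second axis ~ "is an organization".
toy = {
    "KGB": [1.0, 1.0],
    "Stalin": [1.0, 0.0],
    "Hitler": [-1.0, 0.0],
    "SS": [-1.0, 1.0],
    "gate": [0.2, -1.0],
}
print(analogy(toy, "KGB", "Stalin", "Hitler"))
```

With a trained model the same arithmetic runs over the learned vectors and the whole vocabulary.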
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
- Art History networks from Wikipedia, through VIAF id
- Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
- Numishare: Data platform for coins/medals, 100k coin types
- Nomisma: Shared authorities for numismatics
- Kerameikos: Pottery LOD
- EADitor: EAD Editor based on XML & XForms, uses/produces LOD
- xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
421 CIDOC CRM PROPERTIES
Classes represent abstract things (e.g. crm:E24_Physical_Man-Made_Thing); specific things (e.g. Paintings, Coins) are accommodated with crm:P2_has_type. 135 props (plus their inverses), prop hierarchy (see - - - at bottom)
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details
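The short-cut vs long-path construct can be sketched as a rewrite of one triple into three. In CIDOC CRM the intermediate node of this long path is an E16 Measurement; the `ex:` node names below are hypothetical, and triples are written as plain tuples rather than with an RDF library.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

def shortcut(thing: str, dimension: str) -> List[Triple]:
    """Short-cut: the thing points directly at its dimension."""
    return [(thing, "crm:P43_has_dimension", dimension)]

def long_path(thing: str, dimension: str, measurement: str) -> List[Triple]:
    """Long path: an explicit measurement event sits between thing and dimension."""
    return [
        (thing, "crm:P39i_was_measured_by", measurement),
        (measurement, "rdf:type", "crm:E16_Measurement"),
        (measurement, "crm:P40_observed_dimension", dimension),
    ]

# Hypothetical instance names.
short = shortcut("ex:painting1", "ex:painting1/height")
long_ = long_path("ex:painting1", "ex:painting1/height", "ex:painting1/measurement1")
print(short)
print(long_)
```

The extra measurement node is where details such as who measured and when can be attached, which the short-cut cannot carry.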
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g.:
- Webpage and bookmark
- Image and region over it
- Document and translation
- Paragraph and commentary
Diagram of Complete Example from spec (using my rdfpuml)
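The "image and region over it" case can be sketched as a small JSON-LD document following the W3C Web Annotation Data Model. This is a minimal illustration, not the spec's Complete Example: the annotation id, image URL, comment and region are hypothetical values.

```python
import json
from typing import Dict, Tuple

def image_region_annotation(anno_id: str, image_url: str,
                            xywh: Tuple[int, int, int, int], comment: str) -> Dict:
    """Build a Web Annotation that attaches a text comment to a rectangle of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }

# Hypothetical identifiers and values.
anno = image_region_annotation(
    "http://example.org/anno/1",
    "http://example.org/image/1.jpg",
    (10, 20, 300, 400),
    "Signature of the artist",
)
print(json.dumps(anno, indent=2))
```

The body and the target are both resources, so the same shape covers translations of documents or commentaries on paragraphs by swapping the selector type.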
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
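A sketch of how a viewer asks a IIIF image server for a region of a hi-res image. It follows the IIIF Image API URL pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`; the server base and identifier are hypothetical, and the default size `full` matches Image API 2.x (version 3 uses `max` instead).

```python
from urllib.parse import quote

def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "full", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build a IIIF Image API request URL; the identifier is percent-encoded."""
    encoded_id = quote(identifier, safe="")
    return f"{base}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Hypothetical server and identifier.
whole = iiif_image_url("https://example.org/iiif", "ms 1/f1")
# A 512x512 pixel tile starting at the top-left corner, scaled to 256 pixels wide.
tile = iiif_image_url("https://example.org/iiif", "ms 1/f1",
                      region="0,0,512,512", size="256,")
print(whole)
print(tile)
```

Deep zoom viewers issue many such tile requests as the user pans and zooms, so the server never has to send the full image at once.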
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
- BIBO: used for a long time, pragmatic
- FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
- FRBRoo: based on CIDOC CRM, perhaps too complex
- FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
- BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
- RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
- SchemaBibEx (http://bib.schema.org): steps on a clean model, sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
- FRBR, Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
- Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
- Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation;
  dc:title "About time";
  d:titleURLized "about_time";
  fabio:hasSubtitle "Einstein's unfinished revolution";
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>, <http://covers.openlibrary.org/b/id/96715-M.jpg>, <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081>;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>, <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613>;
  dc:language lexvo:eng;
  d:bibliofilID "931138";
  dc:format <http://data.deichman.no/format/Book>;
  d:location_signature "Dav";
  dc:publisher d_org:penguin;
  bibo:numPages "316";
  d:physicalDescription "fig";
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk;
  fabio:isManifestationOf d_work:x24918900_about_time;
  d:signatureNote "07x0619gq";
  d:bindingInfo <http://data.deichman.no/bindingInfo/h>;
  d:bsID "0181541";
  dc:description "Bibliografi s 293-294"@no;
  d:priceInfo "Nkr 17000";
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>, <http://www.librarything.com/work/23493>;
  dc:identifier "749919";
  d:dewey "115", "530.11";
  d:location_dewey "530.11";
  bibo:isbn "9780140174618", "0140174613".
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
- Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
- Progress report (2015)
- Conceptual Model 1.0 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them
- Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
- Some established thesauri and gazetteers as LOD, some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
- Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for works by painter across collections (catalogue raisonné), e.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info, like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
422 CIDOC GRAPHICAL EXAMPLES
Video Tutorial (or HTML version, including Kindle) and Graphical Representation (or continuous HTML version, including Kindle): essential to understand how to apply CRM in various situations.
Typical modeling construct: short-cut (crm:P43_has_dimension) vs long-path (e.g. crm:P39i_was_measured_by / crm:P40_observed_dimension), which allows more details.
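A minimal sketch of the short-cut vs long-path construct, using plain Python tuples as triples. The `ex:` resources are hypothetical; the three CRM property names are the ones from the slide plus crm:P14_carried_out_by, which illustrates the extra detail that only the long path can carry.

```python
P39I = "crm:P39i_was_measured_by"
P40 = "crm:P40_observed_dimension"
P43 = "crm:P43_has_dimension"

# Long path: the Measurement node can carry who/when/how details
triples = {
    ("ex:painting1", P39I, "ex:measurement1"),
    ("ex:measurement1", P40, "ex:height1"),
    ("ex:measurement1", "crm:P14_carried_out_by", "ex:curator1"),
}


def infer_shortcuts(triples):
    """Derive thing -P43-> dimension from thing -P39i-> measurement -P40-> dimension."""
    derived = set()
    for s, p, o in triples:
        if p != P39I:
            continue
        for s2, p2, o2 in triples:
            if s2 == o and p2 == P40:
                derived.add((s, P43, o2))
    return derived


print(infer_shortcuts(triples))
```

In a triplestore the same derivation would typically be written as an inference rule or a SPARQL property path rather than a nested loop.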
43 WEB ANNOTATION (OPEN ANNOTATION, OA)
W3C TR: mark, annotate, relate any web resources, e.g. Webpage and bookmark; Image and region over it; Document and translation; Paragraph and commentary.
Diagram of Complete Example from spec (using my rdfpuml).
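To make the "Image and region over it" case concrete, here is a small sketch that builds a Web Annotation as JSON-LD. The property names follow the W3C Web Annotation Data Model as I understand it; the function name, the example.org URLs and the comment text are made up for illustration.

```python
import json


def image_region_annotation(anno_id, image_url, xywh, comment):
    """Annotation whose target is a rectangular region of an image."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": image_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


anno = image_region_annotation(
    "http://example.org/anno/1",           # hypothetical annotation URL
    "http://example.org/images/page1.jpg",  # hypothetical image URL
    (10, 20, 300, 200),
    "Marginal note in a later hand",
)
print(json.dumps(anno, indent=2))
```

The other pairings on the slide (document and translation, paragraph and commentary) differ only in the kind of body and selector used.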
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers.
http://iiif.io
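As a rough illustration of what "standard API" means here: an IIIF Image API request is a URL of the form base/identifier/region/size/rotation/quality.format. The sketch below assembles such a URL; the server address and identifier are hypothetical, and the default `size="full"` assumes Image API 2.x (version 3 uses `max` instead).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL; the identifier is percent-encoded."""
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# A 512x512 pixel region at the top-left corner, scaled to 256 pixels wide
url = iiif_image_url("https://example.org/iiif", "ms/folio 1",
                     region="0,0,512,512", size="256,")
print(url)
```

A deep-zoom viewer issues many such requests, one per visible tile, which is why any compliant viewer can work with any compliant server.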
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows to assemble manuscripts from pieces, present folios, etc. See e.g. Rob Sanderson's presentations on IIIF and JSON-LD.
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA). RDAregistry.info is well organized.
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force: asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K.Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
marc2rdf
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation ;
    dc:title "About time" ;
    d:titleURLized "about_time" ;
    fabio:hasSubtitle "Einstein's unfinished revolution" ;
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
    dc:language lexvo:eng ;
    d:bibliofilID "931138" ;
    dc:format <http://data.deichman.no/format/Book> ;
    d:location_signature "Dav" ;
    dc:publisher d_org:penguin ;
    bibo:numPages "316" ;
    d:physicalDescription "fig" ;
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
    fabio:isManifestationOf d_work:x24918900_about_time ;
    d:signatureNote "07x0619gq" ;
    d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
    d:bsID "0181541" ;
    dc:description "Bibliografi: s. 293-294"@no ;
    d:priceInfo "Nkr 170.00" ;
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
    dc:identifier "749919" ;
    d:dewey "115", "530.11" ;
    d:location_dewey "530.11" ;
    bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): document key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL; will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for: works by painter across collections (catalogue raisonné), e.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
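Why coreferencing across datasets pays off can be shown with a small sketch: pairwise "same entity" links, possibly coming from different sources, are merged into clusters with a union-find structure. The identifiers below are placeholders, not real VIAF, GND, RKD or Wikidata IDs, and the function is an illustration rather than the method used in the study.

```python
def coreference_clusters(pairs):
    """Merge pairwise coreference links into clusters (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    # Deterministic order: sort clusters by their sorted members
    return sorted(clusters.values(), key=lambda c: sorted(c))


links = [
    ("wikidata:Q_A", "viaf:111"),   # link recorded on Wikidata
    ("viaf:111", "gnd:X1"),         # link recorded inside the VIAF cluster
    ("wikidata:Q_B", "rkd:42"),     # entity not (yet) linked to VIAF
]
for cluster in coreference_clusters(links):
    print(sorted(cluster))
```

The first cluster shows the gain: the Wikidata item inherits the GND link through VIAF, although no source states that link directly.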
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences: 234 authorities including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). “Prosopography is Greek for Facebook”, 2015 (SNAP:DRGN project)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
Watch the video (D.Oldman). (Not Ontotext work)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
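A hierarchical facet means that an object tagged with a place is also counted under every ancestor of that place. A minimal sketch, assuming a simplified child-to-parent table (real Geonames has more levels, such as administrative divisions, and the names below are only an illustrative excerpt):

```python
from collections import Counter

# Simplified child -> parent excerpt of a place hierarchy (illustrative)
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Lyon": "France",
    "France": "Europe",
}


def facet_counts(object_places):
    """Count each object under its place and all ancestors of that place."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = PARENT.get(node)
    return counts


counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Lyon"])
for place, n in sorted(counts.items()):
    print(place, n)
```

Selecting "Bulgaria" in the facet then returns the objects tagged Sofia and Plovdiv as well, which is what the plain DBpedia/Wikidata enrichment could not provide without the place hierarchy.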
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
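Jittering can be as simple as adding a small random offset to each coordinate. The sketch below is an assumption about how one might do it, not the code used in the EFD app; the offset size and the approximate centre coordinates are chosen for illustration only.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Shift a point by a uniform random offset of up to max_offset_deg."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


rng = random.Random(42)      # fixed seed so the map looks the same on reload
center = (42.75, 25.5)       # roughly the middle of Bulgaria (approximate)
points = [jitter(*center, rng=rng) for _ in range(5)]
for lat, lon in points:
    print(f"{lat:.4f}, {lon:.4f}")
```

A fixed seed (or a seed derived from the object id) keeps each flag in the same spot between page loads, which avoids markers jumping around.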
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of LODLAM cloud (Diagram by J.Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust. AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014.
External Ontologies:

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
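For readers new to RDF prefixes: a prefixed name such as skos:prefLabel is just shorthand for a full IRI. The sketch below expands such names using the commonly published namespace IRIs for most of the prefixes in the table; the `iso` prefix is left out because I am not certain of its exact namespace IRI, and the `expand` helper is a name made up for this example.

```python
NAMESPACES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "prov": "http://www.w3.org/ns/prov#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "wgs": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie):
    """Expand a prefixed name like 'skos:prefLabel' to a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return NAMESPACES[prefix] + local


print(expand("skos:prefLabel"))
print(expand("skosxl:literalForm"))
```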
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class; Custom (sub-)prop; Ontology Value (e.g. <term/kind/Abbreviation>).
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
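What inverse and symmetric declarations buy you can be sketched as follows: from each stated relation, the reverse statement is derived automatically. The property and resource names are hypothetical, not actual GVP ontology terms, and a real triplestore would do this through OWL inference rather than Python.

```python
# Hypothetical declarations, as they might be generated from a spreadsheet
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}   # owl:inverseOf pair
SYMMETRIC = {"ex:siblingOf"}                    # owl:SymmetricProperty


def materialize_inverses(triples):
    """Add the reverse statement for every inverse or symmetric relation."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})  # both directions
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
        elif p in SYMMETRIC:
            result.add((o, p, s))
    return result


stated = {
    ("ex:artistA", "ex:teacherOf", "ex:artistB"),
    ("ex:artistC", "ex:siblingOf", "ex:artistD"),
}
for triple in sorted(materialize_inverses(stated)):
    print(triple)
```

So data entry only needs one direction of each relation; queries can follow either direction.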
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014).
E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.)
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016
43 WEB ANNOTATION (OPEN ANNOTATION OA) mark annotate relate any web resources eg Webpage and bookmark Image
and region over it Document and translation Paragraph and commentary Diagram of from spec (using my rdfpuml)
W3C TR
Complete Example
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)Standard API for DeepZoom (hi-res) images Supported by many servers and viewershttpiiifio
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, and NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
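A minimal sketch of how a hierarchical facet can be computed once every place has a parent link: each object is counted once under its own place and under every ancestor. The tiny parent table and object list are made-up stand-ins for the Geonames hierarchy and the enriched objects.

```python
from collections import Counter

# Toy parent links (hypothetical stand-in for the Geonames place hierarchy)
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Europe": None,
}


def ancestors(place, parent):
    """Return the place followed by all its ancestors, guarding against cycles."""
    path, seen = [], set()
    while place is not None and place not in seen:
        seen.add(place)
        path.append(place)
        place = parent.get(place)
    return path


def facet_counts(object_places, parent):
    """Count each object once per facet node (its places plus their ancestors)."""
    counts = Counter()
    for places in object_places.values():
        expanded = set()
        for place in places:
            expanded.update(ancestors(place, parent))
        counts.update(expanded)
    return counts


objects = {"o1": ["Sofia"], "o2": ["Plovdiv"], "o3": ["Bulgaria"]}
counts = facet_counts(objects, PARENT)
```

Using a set per object avoids double counting when an object is tagged with both a city and its country.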
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
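Jittering can be sketched as adding a small random offset to each marker so that objects sharing one coordinate spread out. This is an illustrative version, not the project's actual code: the radius, the centre coordinate, and the 111 km per degree approximation are all assumptions.

```python
import math
import random

KM_PER_DEGREE = 111.0  # rough approximation, good enough for display purposes


def jitter(lat, lon, radius_km, rng):
    """Move a point to a random position within radius_km of the original."""
    # sqrt() makes the points uniform over the disc instead of bunched at the centre
    distance = radius_km * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance * math.cos(angle) / KM_PER_DEGREE
    dlon = distance * math.sin(angle) / (KM_PER_DEGREE * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


rng = random.Random(0)  # seeded so the layout is reproducible
centre = (42.7, 25.5)   # assumed approximate centre of Bulgaria
points = [jitter(centre[0], centre[1], 50.0, rng) for _ in range(100)]
```

Seeding the generator keeps markers in the same place between page loads, which avoids flags jumping around when the map is redrawn.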
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats: when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J.Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
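What inverse and symmetric declarations buy you can be sketched in a few lines: for every asserted relation, the reverse statement follows. The property names below (ex:partOf, ex:hasPart, ex:relatedTo) are hypothetical placeholders, not actual GVP relation names, and a real repository would do this through its rule engine rather than in application code.

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the reverse statement for inverse pairs and symmetric properties."""
    # owl:inverseOf works in both directions, so index the pairs both ways
    inverse = dict(inverse_of)
    inverse.update({b: a for a, b in inverse_of.items()})
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in inverse:
            result.add((obj, inverse[prop], subject))
        if prop in symmetric:
            result.add((obj, prop, subject))
    return result


# Hypothetical relation names, for illustration only
INVERSE_OF = {"ex:partOf": "ex:hasPart"}
SYMMETRIC = {"ex:relatedTo"}

asserted = {
    ("ex:a", "ex:partOf", "ex:b"),
    ("ex:c", "ex:relatedTo", "ex:d"),
}
closed = materialize_inverses(asserted, INVERSE_OF, SYMMETRIC)
```

Only one direction of each relation needs to be stored in the source spreadsheet; the other is derived.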
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 JPGETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
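The idea can be sketched as follows: one source record is expanded into several Person records linked by family relations. The field names follow the sample record; the Person class, the relation labels, and the single inference shown (attaching the maiden name to the spouse) are illustrative assumptions, not the EHRI implementation.

```python
from dataclasses import dataclass, field

# Relation as seen from the relative's side (assumed labels)
REVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}


@dataclass
class Person:
    name: str
    last_name: str = ""
    gender: str = ""
    maiden_name: str = ""
    relations: list = field(default_factory=list)  # (relation, other person's name)


def expand_record(rec):
    """Create Person records for the main person and each relative mentioned."""
    full_name = f'{rec["firstName"]} {rec["lastName"]}'
    main = Person(full_name, rec["lastName"], rec.get("gender", ""))
    people = [main]
    for key, relation in (("nameSpouse", "spouse"),
                          ("nameChild", "child"),
                          ("nameSibling", "sibling")):
        name = rec.get(key)
        if not name:
            continue
        relative = Person(name, name.split()[-1])
        if relation == "spouse":
            # Likely inference: the maiden name in the record belongs to the spouse
            relative.maiden_name = rec.get("nameSpouseMaiden", "")
        main.relations.append((relation, name))
        relative.relations.append((REVERSE[relation], full_name))
        people.append(relative)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = expand_record(record)
```

Each derived record carries less evidence than the original, so any match made on it should be treated as a candidate for review rather than a confirmed link.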
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
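Why matching to a gazetteer also deduplicates can be shown with a small sketch: spelling variants that resolve to the same gazetteer entry end up in one group. The normalization rule, the place names, and the identifiers g1 and g2 are made up for illustration; real matching would also need to use place type, hierarchy and coordinates.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Lower-case, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_places(source_names, gazetteer):
    """Group source names by the gazetteer id their normalized form resolves to."""
    index = {normalize(label): gid for label, gid in gazetteer.items()}
    matched, unmatched = defaultdict(list), []
    for name in source_names:
        gid = index.get(normalize(name))
        if gid is None:
            unmatched.append(name)
        else:
            matched[gid].append(name)
    return dict(matched), unmatched


# Made-up gazetteer: labels (including one alternate name) mapped to invented ids
GAZETTEER = {"Krakow": "g1", "Lviv": "g2", "Lwow": "g2"}

matched, unmatched = match_places(
    ["Kraków", "KRAKOW", "Krakow ", "Lwow", "Atlantis"], GAZETTEER
)
```

All three spellings of the first place collapse to one identifier, which is the deduplication effect.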
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
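The two operations behind such results are cosine similarity between word vectors and vector arithmetic followed by a nearest-neighbour lookup. The sketch below uses hand-made 4-dimensional toy vectors chosen so that the arithmetic works out; real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (made up); dimensions loosely read as soviet, german, police, leader
VECTORS = {
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 1.0, 0.0],
    "guard":  [0.1, 0.1, 1.0, 0.0],
}

answer = analogy(VECTORS, "KGB", "Stalin", "Hitler")
```

Note that the scores in such neighbour lists behave like similarities (higher means closer), even when the column is labelled as a distance.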
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
44 INTERNATIONAL IMAGE INTEROP FRAMEWORK (IIIF)
Standard API for DeepZoom (hi-res) images. Supported by many servers and viewers
http://iiif.io
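The IIIF Image API addresses an image (or a region of it) through a URL of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The helper below is a small sketch of building such URLs; the server address and identifier are made up, and the default size keyword is "max", whereas servers that only implement version 2.0 of the API expect "full" instead.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path parameters."""
    # Identifiers must be percent-encoded, including any slashes they contain
    encoded = quote(identifier, safe="")
    return f"{base}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """URL of the image information document (info.json)."""
    return f"{base}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and identifier
BASE = "https://example.org/iiif"
full = iiif_image_url(BASE, "book1/page 1")
detail = iiif_image_url(BASE, "book1/page 1", region="0,0,512,512", size="256,")
info = iiif_info_url(BASE, "book1/page 1")
```

A deep-zoom viewer requests many such regions at different sizes as the user pans and zooms.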
441 IIIF PRESENTATION API
Based on OA and SharedCanvas. Strong attention to JSON-LD representation (convenient for developers). Allows assembling manuscripts from pieces, presenting folios, etc. See e.g. Rob Sanderson's presentations: IIIF and JSON-LD
45 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
FaBiO, CiTO, DoCO and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats
453 A TASTE OF FRBROO
The EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses marc2rdf, Koha open source software, RDF in the core and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
    rdf:type bibo:Document, fabio:Manifestation ;
    dc:title "About time" ;
    d:titleURLized "about_time" ;
    fabio:hasSubtitle "Einstein's unfinished revolution" ;
    ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
    foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
        <http://covers.openlibrary.org/b/id/96715-M.jpg>,
        <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
    owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
        <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
    dc:language lexvo:eng ;
    d:bibliofilID "931138" ;
    dc:format <http://data.deichman.no/format/Book> ;
    d:location_signature "Dav" ;
    dc:publisher d_org:penguin ;
    bibo:numPages "316" ;
    d:physicalDescription "fig" ;
    d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
    fabio:isManifestationOf d_work:x24918900_about_time ;
    d:signatureNote "07x0619gq" ;
    d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
    d:bsID "0181541" ;
    dc:description "Bibliografi s 293-294"@no ;
    d:priceInfo "Nkr 170.00" ;
    foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
        <http://www.librarything.com/work/23493> ;
    dc:identifier "749919" ;
    d:dewey "115", "530.11" ;
    d:location_dewey "530.11" ;
    bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Mlist for comments
Conceptual Model 10 (Sep 2016): Document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
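What leveraging coreferencing across datasets can mean in practice is sketched below: coreference links are merged into clusters with a union-find structure, and the name forms of all members of a cluster are pooled. The identifiers, links and names are invented for illustration and do not come from VIAF or Wikidata.

```python
def cluster(pairs):
    """Group identifiers connected by coreference links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


def pooled_names(group, names):
    """Union of the name forms recorded for every identifier in a cluster."""
    return set().union(*(names.get(x, set()) for x in group))


# Invented identifiers and names
LINKS = [("wd:Q1", "viaf:10"), ("viaf:10", "gnd:7"), ("wd:Q2", "ulan:5")]
NAMES = {
    "wd:Q1": {"Lucas Cranach", "Лукас Кранах"},
    "viaf:10": {"Cranach, Lucas", "Lucas Cranach"},
    "gnd:7": {"Cranach, Lucas, der Ältere"},
}

clusters = cluster(LINKS)
big = max(clusters, key=len)
names = pooled_names(big, NAMES)
```

Links are transitive here, so a Wikidata item linked to one VIAF constituent inherits the name forms of the whole cluster; that is also why a single wrong link can pollute a cluster and is worth human review.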
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
441 IIIF PRESENTATION API
Based on OA and SharedCanvas Strong attention to JSONLD representation(convenient for developers) Allows to assemble manuscripts from pieces present foliosetc etc See eg Rob Sanderson presentations IIIF and JSONLD
45 LIBRARY ONTOLOGIESWar of the Bibliographic Ontologies
BIBO used for a long time pragmaicFRBRer pragmatic realization of FRBR but little uptake (not rich enough)FRBRoo based on CIDOC CRM perhaps too complexFabio Cito Doco and friends modern includes new features (eg citation intent)BibFrame sponsored by LoC but for modeling mistakesRDAregistryinfo basic FRBR classes numerous properties for all kinds of thingsUsed for 100M records at TELSchemaBibEx ( ) steps on a clean model sponsored by the big 4search engines (Google MS Bing Yahoo Yandexru) Developed by OCLC May end upbeing used for 300M records at WorldCat
soundly criticized
httpbibschemaorg
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Contexts (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015), Conceptual Model 1.0 (Sep 2016), Mlist for comments
Conceptual Model: document key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings
Data used for: works by painter across collections (catalogue raisonné), e.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
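A minimal sketch of what "leveraging coreferencing" can mean in practice: once a Wikidata item carries a VIAF id, the name forms of both records can be pooled. The records, identifiers and name forms below are made-up placeholders, and the dictionary layout is an assumption for illustration, not the format of either dataset.

```python
# Hypothetical mini-records; ids and name forms are placeholders for illustration.
WIKIDATA = {
    "Q_example": {
        "viaf": "V_example",  # coreference link from the Wikidata item to VIAF
        "names": {"en": "Lucas Cranach the Elder", "de": "Lucas Cranach der Ältere"},
    },
    "Q_unlinked": {"viaf": None, "names": {"en": "Some Other Person"}},
}
VIAF = {
    "V_example": {"names": ["Cranach, Lucas", "Lucas Cranach the Elder"]},
}


def merged_names(wikidata, viaf):
    """Pool the name forms of each Wikidata item with those of its VIAF cluster."""
    merged = {}
    for qid, record in wikidata.items():
        names = set(record["names"].values())   # multilingual labels
        cluster = viaf.get(record.get("viaf"))  # None when there is no coreference
        if cluster is not None:
            names |= set(cluster["names"])      # library-style variations
        merged[qid] = names
    return merged


names_by_item = merged_names(WIKIDATA, VIAF)
```

The linked item ends up with three distinct name forms (one is shared by both sources); the unlinked item keeps only its own label, which is the gain the conclusions above point at.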
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions
Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract an FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
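A rough sketch of jittering, assuming each object only knows the country centroid: every marker is moved by a random offset inside a disc around that point. The centroid coordinates, the 30 km radius and the function name are illustrative assumptions, not the values used in the EFD app.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def jitter(lat, lon, max_km=30.0, rng=random):
    """Return (lat, lon) moved by a random offset of at most max_km kilometres."""
    # The square root spreads points evenly over the disc instead of
    # crowding them near the center.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / KM_PER_DEGREE_LAT
    # A degree of longitude gets shorter away from the equator.
    dlon = distance_km * math.sin(bearing) / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Assumed centroid for every object marked "Bulgaria" (illustrative value).
BULGARIA = (42.75, 25.5)
rng = random.Random(42)  # fixed seed, so the map looks the same on every reload
points = [jitter(BULGARIA[0], BULGARIA[1], max_km=30.0, rng=rng) for _ in range(5)]
```

A fixed seed (or a seed derived from the object id) keeps each flag in the same place between page loads, so users can find an object again.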
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology. Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc. Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class; Custom (sub-)prop; Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse)
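To illustrate what the inverse/symmetric declarations buy, here is a small sketch that derives the reverse statement for every asserted relation. The relation names and subject ids are invented for the example; in the real setup the pairs come from the spreadsheet and a reasoner (not hand-written Python) does the inference.

```python
# Hypothetical relation names; the real list is generated from the Excel sheet.
INVERSE_PAIRS = [("teacherOf", "studentOf"), ("parentOf", "childOf")]
SYMMETRIC = ["siblingOf"]  # self-inverse, like owl:SymmetricProperty


def build_inverse_map(pairs, symmetric):
    """Map every property to its inverse, in both directions."""
    inverse = {}
    for forward, backward in pairs:
        inverse[forward] = backward
        inverse[backward] = forward
    for prop in symmetric:
        inverse[prop] = prop
    return inverse


def add_inverses(triples, inverse):
    """Return the triples plus the reverse statement implied by each one."""
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in inverse:
            result.add((obj, inverse[prop], subject))
    return result


asserted = {
    ("ulan:A", "teacherOf", "ulan:B"),  # placeholder ids
    ("ulan:B", "siblingOf", "ulan:C"),
}
closed = add_inverses(asserted, build_inverse_map(INVERSE_PAIRS, SYMMETRIC))
```

Only one direction needs to be stored in the source database; a query can then start from either end of the relation.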
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like this
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress, 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
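The idea can be sketched as follows, using the sample record above. The `Person` class, the relation names and the "spouse has the opposite gender" guess are assumptions made for this illustration, not the EHRI data model; matching against the database is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # (relation, name of the other person), e.g. ("childOf", "John Smith")
    relations: List[Tuple[str, str]] = field(default_factory=list)


def expand(record):
    """Create a Person for the record's subject and for each relative it names."""
    main = Person(f"{record['firstName']} {record['lastName']}", gender=record.get("gender"))
    people = [main]
    if "nameSpouse" in record:
        # Likely inference only: assume the spouse has the opposite gender.
        guessed = {"Male": "Female", "Female": "Male"}.get(record.get("gender"))
        spouse = Person(record["nameSpouse"], gender=guessed,
                        maiden_name=record.get("nameSpouseMaiden"))
        spouse.relations.append(("spouseOf", main.name))
        main.relations.append(("spouseOf", spouse.name))
        people.append(spouse)
    for key, relation, inverse in [("nameChild", "childOf", "parentOf"),
                                   ("nameSibling", "siblingOf", "siblingOf")]:
        if key in record:
            other = Person(record[key])
            other.relations.append((relation, main.name))
            main.relations.append((inverse, other.name))
            people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand(record)
```

One record thus yields four Person records, each carrying a link back to John Smith, which gives the matching step more to compare than a bare name.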
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
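A toy sketch of how such neighbor lists and the "differencing" trick work: words are vectors, closeness is the cosine of the angle between them, and an analogy is vector arithmetic followed by a nearest-neighbor lookup. The 3-dimensional vectors below are invented so that the example works out; real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, vectors, exclude=()):
    """Rank words by cosine similarity to the target vector, best first."""
    scored = [(word, cosine(target, vec))
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration (not learned from any corpus).
VECTORS = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 1.0],
    "Hitler": [0.0, -1.0, 1.0],
    "SS":     [1.0, -1.0, 0.0],
    "guard":  [0.9, 0.0, 0.1],
}

# KGB - Stalin + Hitler, computed component by component
query = [k - s + h for k, s, h in
         zip(VECTORS["KGB"], VECTORS["Stalin"], VECTORS["Hitler"])]
ranking = nearest(query, VECTORS, exclude=("KGB", "Stalin", "Hitler"))
```

The input words are excluded from the ranking, as is usual for analogy queries; with these toy vectors "SS" comes out on top, ahead of "guard".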
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
4.5 LIBRARY ONTOLOGIES
War of the Bibliographic Ontologies:
BIBO: used for a long time, pragmatic
FRBRer: pragmatic realization of FRBR, but little uptake (not rich enough)
FRBRoo: based on CIDOC CRM, perhaps too complex
Fabio, Cito, Doco and friends: modern, includes new features (e.g. citation intent)
BibFrame: sponsored by LoC, but soundly criticized for modeling mistakes
RDAregistry.info: basic FRBR classes, numerous properties for all kinds of things. Used for 100M records at TEL
SchemaBibEx (http://bib.schema.org): steps on a clean model sponsored by the big 4 search engines (Google, MS Bing, Yahoo, Yandex.ru). Developed by OCLC. May end up being used for 300M records at WorldCat
4.5.1 RDAREGISTRY
Resource Description and Access (RDA) Registry. Info is well organized
4.5.2 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) also supported. Serves many semantic formats
4.5.3 A TASTE OF FRBROO
EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
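A minimal sketch of that idea, using the sample record above. The `Person` structure, the relation labels and the two inferences (spouse's gender, spouse's maiden-name variant) are assumptions made for illustration; they are not the EHRI data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: List[str] = field(default_factory=list)
    # (relation, name of the record's subject), e.g. ("child of", "John Smith")
    relations: List[Tuple[str, str]] = field(default_factory=list)


# Record fields that mention relatives, and the relation each one implies.
RELATIVE_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
OPPOSITE = {"Male": "Female", "Female": "Male"}


def expand_record(rec: dict) -> List[Person]:
    """Create a Person for the record's subject and for each relative mentioned."""
    subject = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    people = [subject]
    for field_name, relation in RELATIVE_FIELDS.items():
        name = rec.get(field_name)
        if not name:
            continue
        relative = Person(name, relations=[(relation + " of", subject.name)])
        if relation == "spouse":
            # Likely inferences (heuristics, not facts): spouse's gender and maiden-name form.
            relative.gender = OPPOSITE.get(subject.gender)
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                relative.alt_names.append(f"{name.split()[0]} {maiden}")
        people.append(relative)
    return people


REC = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in expand_record(REC):
    print(person)
```

The extra name form "Maria Matienzo" gives the matching step one more chance to link the spouse to records created before the marriage.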
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
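Why matching to a gazetteer also deduplicates: spelling variants that resolve to the same gazetteer entry end up in one cluster. The sketch below is only an illustration of that effect; the normalization, the tiny gazetteer and its `geo:` ids are hypothetical and much simpler than a real matching pipeline.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


# Hypothetical gazetteer: normalized name -> made-up place id.
GAZETTEER = {"krakow": "geo:1", "terezin": "geo:2"}


def match_places(names, gazetteer):
    """Group source names by matched gazetteer id; return (clusters, unmatched)."""
    clusters = defaultdict(list)
    unmatched = []
    for name in names:
        place_id = gazetteer.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            clusters[place_id].append(name)
    return dict(clusters), unmatched


clusters, unmatched = match_places(["Kraków", "KRAKOW ", "Terezín", "Nowhere"], GAZETTEER)
print(clusters, unmatched)
```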
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
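The two operations behind these results are cosine similarity between word vectors and vector arithmetic for the "differencing". A minimal sketch with made-up 2-dimensional vectors (real word2vec vectors have hundreds of dimensions and come from training on the interview corpus; the numbers below are purely illustrative):

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, invented for this example: axis 0 ~ "security organization", axis 1 ~ "regime".
TOY = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "guard": [1.0, 0.1],
}

print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```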
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia
Through VIAF id: time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
451 RDAREGISTRY
Resource Description and Access (RDA) Registry info is well organized
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for specific purposes (e.g. appellee for court decisions, granting institution for academic theses). Numeric prop names, but lexical (natural language) names are also supported. Serves many semantic formats.
453 A TASTE OF FRBROO
The EDM–FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T.Aalberg, V.Alexiev, J.Walkowska).
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR, Before and After by K.Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes Have Been Made. K.Coyle, SWIB 2015.
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search, related books on Kiosk.
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
There have been 3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected.
DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M.Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment: study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
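What "leveraging coreferencing" buys in practice: once a Wikidata item carries a VIAF id, the name forms of both sources can be pooled for enrichment. A minimal sketch with toy data; the ids and the data layout are hypothetical, and real data would come from the Wikidata and VIAF dumps or APIs.

```python
# Toy data with made-up ids. The names are typical variants for one artist.
WIKIDATA = {
    "Q_example": {
        "viaf": "viaf_example",
        "names": {"Lucas Cranach the Elder", "Lucas Cranach der Ältere"},
    },
    "Q_unlinked": {"viaf": None, "names": {"Some Person"}},
}
VIAF = {
    "viaf_example": {"Cranach, Lucas", "Lucas Cranach the Elder"},
}


def merge_names(wikidata, viaf):
    """Pool name forms per Wikidata item, following its VIAF coreference if present."""
    merged = {}
    for qid, item in wikidata.items():
        names = set(item["names"])
        names |= viaf.get(item.get("viaf"), set())
        merged[qid] = names
    return merged


for qid, names in merge_names(WIKIDATA, VIAF).items():
    print(qid, sorted(names))
```

The item without a VIAF link keeps only its own names, which is why adding coreferences is worth the crowd-sourcing effort.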
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including: Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M, but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
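How a hierarchical facet gets its counts: each object counts for the place it is tagged with and for every ancestor of that place. A minimal sketch; the small parent map is a hand-made stand-in for the Geonames hierarchy, not actual Geonames data or the EFD implementation.

```python
from collections import Counter

# Hand-made stand-in for a place hierarchy: place -> parent (None at the root).
PARENT = {"Sofia": "Bulgaria", "Plovdiv": "Bulgaria", "Bulgaria": "Europe", "Europe": None}


def facet_counts(object_places, parent):
    """Count objects per place, rolling each object up to all ancestor places."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = parent.get(node)
    return counts


# One entry per object: the place it is enriched with.
counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Bulgaria"], PARENT)
print(counts)
```

Selecting "Bulgaria" in such a facet then returns objects tagged with the country itself as well as those tagged with any place inside it.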
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
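Jittering can be as simple as adding a small random offset to each marker. A minimal sketch, assuming a uniform offset within a fixed number of degrees; the radius and the approximate centre coordinates used below are illustrative choices, not the values used in the EFD app.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Offset a point uniformly within +/- radius_deg degrees on each axis."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


# Approximate centre of Bulgaria (illustrative values).
CENTER = (42.7, 25.5)
rng = random.Random(42)  # seeded so the map looks the same on every reload
markers = [jitter(*CENTER, rng=rng) for _ in range(5)]
print(markers)
```

A fixed seed (or an offset derived from the object id) keeps each flag in the same spot between page loads.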
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP Training Materials. GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J.Cobb, 2014)
671 GVP LOD RELEASES
AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J.Cuno, head of the Getty Trust.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation (V.Alexiev, CIDOC 2014).
External Ontologies
452 RDAREGISTRY PROPERTIES
Many props (306 for Work alone) for speci c purposes (eg apellee for court decisionsgranting institution for academic theses) Numeric prop names but lexical (naturallanguage) also supported Serves many semantic formats
453 A TASTE OF FRBROO
Task Force asked what to add to EDM to better tFRBRooEDMndashFRBRoo Application Pro le
TF members developed a number of examples eg on publications of Don Quixote(TAalberg VAlexiev JWalkowska)
EDM variant
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
EFD Semantic App
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016.
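A rough sketch of what pruning a category tree can look like: a breadth-first walk from a root category that skips categories marked as irrelevant and stops at a depth limit. The function name, the depth limit and the toy category names are assumptions for illustration; the papers above describe the approach actually used.

```python
from collections import deque


def collect_categories(subcats, root, pruned=frozenset(), max_depth=3):
    """Breadth-first walk of a category graph from `root`.

    `subcats` maps a category to its direct subcategories. Categories in
    `pruned` are skipped, together with everything reachable only through
    them. Returns {category: depth}.
    """
    seen = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        depth = seen[cat]
        if depth == max_depth:
            continue
        for child in subcats.get(cat, ()):
            if child in pruned or child in seen:
                continue
            seen[child] = depth + 1
            queue.append(child)
    return seen


# Toy data (hypothetical category names, not the real Wikipedia tree).
SUBCATS = {
    "Food and drink": ["Beverages", "Cuisine", "Food and drink companies"],
    "Beverages": ["Beer", "Wine"],
    "Cuisine": ["Bulgarian cuisine"],
    "Food and drink companies": ["Company founders"],
    "Beer": ["Food and drink"],  # category graphs may contain cycles
}
```

Calling `collect_categories(SUBCATS, "Food and drink", pruned={"Food and drink companies"})` keeps the beverage and cuisine branches and drops the company branch.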
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
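A small sketch of how a hierarchical facet can be computed once each object has a place and each place has a parent: every object is counted at its own place and at all ancestors. The parent table is a made-up excerpt and the function names are hypothetical; real data would come from the Geonames hierarchy.

```python
from collections import Counter

# Toy parent links (hypothetical excerpt, not loaded from Geonames).
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Lyon": "France",
    "France": "Europe",
}


def ancestors(place, parent=PARENT):
    """Return the place followed by all its ancestors, nearest first."""
    chain = []
    while place is not None and place not in chain:  # guard against cycles
        chain.append(place)
        place = parent.get(place)
    return chain


def facet_counts(object_places, parent=PARENT):
    """Count each object once at its place and once at every ancestor."""
    counts = Counter()
    for place in object_places.values():
        counts.update(ancestors(place, parent))
    return counts
```

With objects in Sofia, Plovdiv and Lyon, the facet shows Europe (3), Bulgaria (2), France (1), and the user can drill down level by level.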
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
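One possible way to jitter, sketched under assumptions: displace each marker by a random offset inside a disc around the country centroid. The radius, the flat-earth conversion and the function name are illustrative choices, not the code used in the EFD app.

```python
import math
import random


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return a point displaced at random within radius_km of (lat, lon).

    Uses a flat-earth approximation (about 111 km per degree of latitude),
    which is adequate for display purposes at this scale.
    """
    # sqrt spreads points evenly over the disc instead of bunching at the centre
    r = radius_km * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    dlat = (r * math.cos(theta)) / 111.0
    dlon = (r * math.sin(theta)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon
```

Passing a seeded `random.Random` as `rng` makes the layout reproducible, so markers do not jump around between page loads.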
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repo, clustered for high-availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk / support on twitter and google group (see home page).
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies
Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
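To make the "Excel-driven" idea concrete, here is a hedged sketch that turns rows of (relation, inverse) into Turtle declarations. The `ex:` prefix, the row values and the function name are hypothetical; the real generator and the actual GVP relation names are not shown in these slides.

```python
def relations_to_turtle(rows, prefix="ex"):
    """Turn (relation, inverse) rows into Turtle property declarations.

    A row whose inverse equals the relation itself is declared symmetric;
    otherwise both directions are declared and linked with owl:inverseOf.
    """
    lines = []
    for rel, inv in rows:
        if rel == inv:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inv} .")
            lines.append(
                f"{prefix}:{inv} a owl:ObjectProperty ; owl:inverseOf {prefix}:{rel} .")
    return "\n".join(lines)


# Hypothetical rows, as they might be read from a spreadsheet export.
ROWS = [
    ("distinguished_from", "distinguished_from"),  # self-inverse
    ("produced_by", "produces"),
]
```

Keeping the relation table in a spreadsheet lets vocabulary editors review it, while the ontology file is regenerated rather than edited by hand.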
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
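The idea above can be sketched roughly as follows. The field names follow the sample record; the `Person` class, the function name and the particular inferences are assumptions made for illustration, and each inference is only a likely guess to be confirmed or rejected by later matching.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str                    # relation to the person of the source record
    inferred: dict = field(default_factory=dict)


def related_persons(rec):
    """Derive Person stubs for the people mentioned in one record."""
    people = []
    if "nameSpouse" in rec:
        inferred = {"dateMarriage": rec.get("dateMarriage")}
        if rec.get("gender") == "Male":
            inferred["gender"] = "Female"   # likely, not certain
        if "nameSpouseMaiden" in rec:
            inferred["maidenName"] = rec["nameSpouseMaiden"]
        people.append(Person(rec["nameSpouse"], "spouse", inferred))
    if "nameChild" in rec:
        parent = f'{rec["firstName"]} {rec["lastName"]}'
        people.append(Person(rec["nameChild"], "child", {"parent": parent}))
    if "nameSibling" in rec:
        people.append(Person(rec["nameSibling"], "sibling"))
    return people


# The sample record from the slide.
REC = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
```

Keeping inferred values in a separate `inferred` dict makes it easy to tell them apart from values stated in the source when matching against other records.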
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
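A simplified sketch of name-based matching with deduplication as a side effect: spelling variants that normalize to the same key land on the same gazetteer entry. The gazetteer ids and the normalization rules are assumptions for illustration; the actual EHRI pipeline is not described on this slide and surely uses more evidence than the name alone.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Lowercase, strip diacritics and punctuation so variants compare equal.

    Letters without a Unicode decomposition (e.g. the Polish L with stroke)
    are left as they are.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_marks.lower() if c.isalnum() or c.isspace()).strip()


def match_places(source_names, gazetteer):
    """Group source names by the gazetteer id they match.

    `gazetteer` maps an id to a list of known names. Returns
    (matches, unmatched), where matches maps id -> list of source names.
    """
    index = {}
    for gid, names in gazetteer.items():
        for n in names:
            index.setdefault(normalize(n), gid)
    matches, unmatched = defaultdict(list), []
    for name in source_names:
        gid = index.get(normalize(name))
        if gid is None:
            unmatched.append(name)
        else:
            matches[gid].append(name)
    return dict(matches), unmatched


# Toy gazetteer with made-up ids (not real Geonames ids).
GAZETTEER = {
    "g1": ["Kraków", "Cracow"],
    "g2": ["Terezín", "Theresienstadt"],
}
```

Here "Krakow" and "KRAKÓW" both resolve to `g1`, so two source records collapse onto one place.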
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews.
ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
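For readers unfamiliar with this kind of vector arithmetic, a toy sketch: compute vec(KGB) - vec(Stalin) + vec(Hitler) and pick the nearest remaining word by cosine. The two-dimensional vectors are made up so that the arithmetic is easy to follow; they are not from the INRIA model. The sketch treats the score as a similarity (higher means closer), which is how the values in the table above read.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy 2-d vectors (made up): axis 0 = organization vs person, axis 1 = regime.
TOY = {
    "KGB": [1.0, 1.0],
    "Stalin": [0.0, 1.0],
    "Hitler": [0.0, -1.0],
    "SS": [1.0, -1.0],
    "gate": [0.2, 0.1],
}
```

With these vectors the target is [1, -1], so `analogy(TOY, "KGB", "Stalin", "Hitler")` selects "SS".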
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id.
Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
4.5.3 A TASTE OF FRBROO
EDM-FRBRoo Application Profile Task Force asked what to add to EDM to better fit FRBRoo.
TF members developed a number of examples, e.g. on publications of Don Quixote (T. Aalberg, V. Alexiev, J. Walkowska).
EDM variant
4.5.3.1 A TASTE OF FRBROO
Simpler FRBRoo variant
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations.
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo.
Mistakes have been made, K. Coyle, SWIB 2015.
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props.
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, marc2rdf, RDF in the core and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk.
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919 rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization, 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015).
Conceptual Model 10 (Sep 2016), Mlist for comments. Documents key components of archival description, properties of each, relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
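A sketch of how such a hunt could be expressed against the Wikidata Query Service. The Wikidata identifiers used are P31 (instance of), Q3305213 (painting), P195 (collection) and P217 (inventory number); the helper function, its name and its parameters are hypothetical, and the collection item id has to be supplied by the caller.

```python
def missing_inventory_query(collection_qid, limit=100):
    """Build a SPARQL query for paintings in a collection lacking an inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""
SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;          # instance of: painting
            wdt:P195 wd:{collection_qid} . # collection
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inv }}  # no inventory number
}}
LIMIT {int(limit)}
""".strip()
```

The `wd:` and `wdt:` prefixes are predefined on the Wikidata Query Service, so the generated text can be pasted there as is.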
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata). VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
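What "leveraging coreferencing" means in practice can be sketched as: follow the coreference links between datasets, cluster the identifiers that denote the same entity, and pool their name forms. The identifiers and names below are toy values, and the union-find approach is one reasonable choice, not necessarily the one used in the Europeana study.

```python
def merge_coreferences(pairs, names):
    """Cluster identifiers linked by coreference pairs and pool their names.

    pairs: iterable of (id_a, id_b) links, e.g. a Wikidata item and its VIAF id.
    names: dict id -> set of name forms.
    Returns a sorted list of (sorted ids, sorted name forms) clusters.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)
    for x in names:
        find(x)  # register ids that have names but no links

    clusters = {}
    for x in parent:
        ids, forms = clusters.setdefault(find(x), (set(), set()))
        ids.add(x)
        forms.update(names.get(x, ()))
    return sorted((sorted(i), sorted(f)) for i, f in clusters.values())


# Toy identifiers (made up) and name forms.
PAIRS = [("wikidata:X1", "viaf:V1"), ("wikidata:X1", "ulan:U1")]
NAMES = {
    "wikidata:X1": {"Lucas Cranach der Ältere"},
    "viaf:V1": {"Cranach, Lucas"},
    "ulan:U1": {"Cranach, Lucas", "Lucas Cranach the Elder"},
    "viaf:V2": {"Spinoza, Benedictus de"},
}
```

The merged cluster carries both the permuted library-style forms and the multilingual forms, which is the gain described in the conclusions above.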
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments.
6.1 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
4531 A TASTE OF FRBROO
Simpler FRBRoo variant
4532 A TASTE OF FRBROO
More complex FRBRoo variant
454 FRBR-INSPIRED
FRBR Before and After by KCoyle (ALA 2016) is an in-depth look at FRBR-inspiredmodelsrealizationsChapter 10 describes the following ontologies FRBRer FRBRcore FaBiO ltindecsgtBIBFRAME RDA in RDF webFRBRer FRBRooMistakes have been made KCoyle SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
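A hunt for missing inventory numbers can be sketched as a SPARQL query against the Wikidata Query Service. The sketch below only builds the query and its URL; it does not send anything. It relies on the Wikidata properties P31 (instance of), P195 (collection) and P217 (inventory number) and on Q3305213 (painting). The function names are hypothetical, and the QID in the usage note is an assumption that should be checked on Wikidata before use.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata Query Service


def missing_inventory_query(collection_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query for paintings in a collection that lack an inventory number."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {collection_qid!r}")
    return f"""
SELECT ?painting ?paintingLabel WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} .
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
LIMIT {limit}
""".strip()


def query_url(sparql: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
```

Usage: `query_url(missing_inventory_query("Q731126", limit=10))`, where Q731126 is assumed (not verified here) to be the J. Paul Getty Museum.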
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the "library-tradition" datasets (dominated by VIAF) and the "LOD-tradition" datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
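To illustrate how coreferencing across datasets adds up, here is a minimal sketch that merges pairwise "same entity" links into clusters with a union-find structure. It is not the method used in the Europeana study; the function name and all identifiers in the usage note are hypothetical.

```python
from collections import defaultdict


def coreference_clusters(pairs):
    """Merge pairwise coreference links into clusters (union-find with path halving)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    # Sort members and clusters so the result is deterministic
    return sorted((sorted(members) for members in clusters.values()), key=lambda c: c[0])
```

Usage: `coreference_clusters([("wd:A", "viaf:1"), ("viaf:1", "gnd:9"), ("wd:B", "rkd:7")])` groups the first three identifiers into one cluster even though `wd:A` and `gnd:9` were never linked directly, which is the gain from combining VIAF and Wikidata links.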
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities, VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
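The idea behind marker clustering can be sketched with a simple grid: points that fall in the same latitude/longitude cell are replaced by one marker carrying a count. This is only an illustration of the general technique, not the algorithm of the library used in the project; the function name and cell size are assumptions.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells and summarize each cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))

    clusters = []
    for members in cells.values():
        count = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / count,  # centroid of the cell's points
            "lon": sum(p[1] for p in members) / count,
            "count": count,
        })
    # Biggest clusters first, so a map can draw them with larger markers
    return sorted(clusters, key=lambda c: -c["count"])
```

A real map library would typically recompute clusters per zoom level; here `cell_deg` plays that role.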
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
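Jittering can be sketched as adding a small random offset to each object's coordinates. The version below scatters points uniformly within a disc around the original location, using a rough 111 km per degree of latitude; the function name, the radius and the coordinates in the usage note are assumptions, not the project's actual settings.

```python
import math
import random


def jitter(lat, lon, max_km=50.0, rng=random):
    """Move a point by a random offset of at most max_km (flat-earth approximation)."""
    distance = max_km * math.sqrt(rng.random())  # sqrt gives uniform density over the disc
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    km_per_deg_lat = 111.0
    km_per_deg_lon = 111.0 * math.cos(math.radians(lat))  # shrinks towards the poles
    new_lat = lat + distance * math.cos(bearing) / km_per_deg_lat
    new_lon = lon + distance * math.sin(bearing) / km_per_deg_lon
    return new_lat, new_lon
```

Usage: `[jitter(42.7, 25.5, rng=random.Random(1)) for _ in range(9000)]` spreads 9k flags around an approximate center of Bulgaria; passing a seeded `random.Random` keeps the layout stable between page loads.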
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J.Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
AAT 2014-02, TGN 2014-08, ULAN 2015-03. Publicized in blog posts by J.Cuno, head of the Getty Trust
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
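Generating relation declarations from a spreadsheet can be sketched as follows: each row names a relation and its inverse, and a row whose inverse is itself becomes a symmetric property. This is a minimal illustration, not the generator used for GVP; the function name, the `ex:` prefix and the relation names in the usage note are hypothetical.

```python
def relations_to_turtle(rows, prefix="ex"):
    """Turn (relation, inverse) rows into Turtle property declarations."""
    inverse_of = dict(rows)
    lines = []
    for name, inverse in rows:
        # Each inverse must be listed too and must point back, otherwise the table is inconsistent
        if inverse_of.get(inverse) != name:
            raise ValueError(f"{name}: inverse {inverse} does not point back")
        if name == inverse:
            lines.append(f"{prefix}:{name} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"{prefix}:{name} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inverse} .")
    return "\n".join(lines)
```

Usage: `relations_to_turtle([("teacherOf", "studentOf"), ("studentOf", "teacherOf"), ("siblingOf", "siblingOf")])` yields two owl:inverseOf declarations and one symmetric property. The consistency check catches a spreadsheet row whose inverse was mistyped or omitted.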
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD, Why Now (J.Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry, Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
Europeana uses AAT to enrich type/subject/material fields
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
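The step from one record to several Person records can be sketched as below, using the field names of the sample record. The `Person` class, the `derive_persons` function and the choice of inferences (the spouse shares the marriage date and carries the maiden name) are assumptions made for illustration, not EHRI's actual data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    role: str                           # relation to the record's subject
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    date_marriage: Optional[str] = None
    inferred: list = field(default_factory=list)  # fields that were inferred, not stated


def derive_persons(rec):
    """Create Person records for the subject and for every person mentioned in the record."""
    subject = Person(
        name=f"{rec['firstName']} {rec['lastName']}",
        role="subject",
        gender=rec.get("gender"),
        date_marriage=rec.get("dateMarriage"),
    )
    persons = [subject]
    if rec.get("nameSpouse"):
        spouse = Person(name=rec["nameSpouse"], role="spouse",
                        maiden_name=rec.get("nameSpouseMaiden"))
        if rec.get("dateMarriage"):
            # Likely inference: the spouse married on the same date as the subject
            spouse.date_marriage = rec["dateMarriage"]
            spouse.inferred.append("date_marriage")
        persons.append(spouse)
    for key, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if rec.get(key):
            persons.append(Person(name=rec[key], role=role))
    return persons
```

Keeping an `inferred` list on each derived record lets a later matching step weigh stated facts more heavily than guesses.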
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
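A very reduced sketch of name-based place matching: normalize names (case, diacritics, whitespace), then look for a close match among gazetteer labels. Source names that resolve to the same gazetteer id are thereby deduplicated. The real pipeline is certainly richer (hierarchy, feature types, coordinates); the function names, the cutoff and the ids in the usage note are hypothetical.

```python
import difflib
import unicodedata


def normalize(name):
    """Lowercase, strip combining diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_places(source_names, gazetteer, cutoff=0.85):
    """Map each source name to a gazetteer id, or None when nothing is close enough."""
    index = {}
    for place_id, label in gazetteer.items():
        index.setdefault(normalize(label), place_id)
    matches = {}
    for name in source_names:
        hits = difflib.get_close_matches(normalize(name), list(index), n=1, cutoff=cutoff)
        matches[name] = index[hits[0]] if hits else None
    return matches
```

Usage: `match_places(["Kraków", "KRAKOW"], {"geo:1": "Krakow"})` maps both spellings to the same (made-up) id. Letters without a Unicode decomposition, such as "Ł", are not folded by this normalization and would need an extra mapping.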
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
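The two operations behind such experiments, cosine comparison of word vectors and "semantic differencing" (a - b + c), can be sketched with plain lists. The vectors in the usage note are tiny hand-made toys, not word2vec output, and the function names are hypothetical. Judging by the values, the "Cos dist" columns appear to hold similarities (higher means closer), which is what `cosine` returns.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to target, skipping excluded words."""
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


def analogy(a, b, c, vectors):
    """Semantic differencing: the word closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})
```

Usage with toy 3-dimensional vectors (royalty, male, female): `toy = {"king": [1, 1, 0], "queen": [1, 0, 1], "man": [0, 1, 0], "woman": [0, 0, 1], "apple": [0.2, 0.1, 0.1]}`; then `analogy("king", "man", "woman", toy)` picks `"queen"`, because king - man + woman equals the queen vector exactly in this toy space.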
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
4.5.3.2 A TASTE OF FRBROO
More complex FRBRoo variant
4.5.4 FRBR-INSPIRED
"FRBR, Before and After" by K.Coyle (ALA 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K.Coyle, SWIB 2015
4.5.5 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core, and marc2rdf and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
454 FRBR-INSPIRED
FRBR, Before and After by K. Coyle (ALA, 2016) is an in-depth look at FRBR-inspired models/realizations
Chapter 10 describes the following ontologies: FRBRer, FRBRcore, FaBiO, <indecs>, BIBFRAME, RDA in RDF, webFRBRer, FRBRoo
Mistakes have been made, K. Coyle, SWIB 2015
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (since 2014) uses Koha open source software, RDF in the core, and rdf2marc conversions
Pragmatic data model that reuses several ontologies and adds own props
Enables a number of agile apps, e.g. search, related books on Kiosk
http://data.deichman.no, marc2rdf
4561 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig" ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 17000" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
46 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none of them is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015). Mlist for comments
Conceptual Model 1.0 (Sep 2016): documents key components of archival description, properties of each, relations between them
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, WordNet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used for Wikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
515 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true
Mix-n-Match is a collaborative tool to create coreferences
234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). Prosopography is Greek for Facebook (2015), SNAP:DRGN project
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, weaved using a Literate Programming approach
Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling)
Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github)
Uses GraphDB and ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
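A hierarchical facet can be pictured as counting each object once under every place it is tagged with and under all ancestors of that place. The sketch below is only an illustration: the tiny `PARENT` table and the object ids are made up, standing in for a real Geonames parent hierarchy, and it is not the EFD implementation.

```python
from collections import Counter

# Hypothetical child -> parent table standing in for a gazetteer hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Paris": "France",
    "France": "Europe",
}

def ancestors(place, parent):
    """Return the place followed by all its ancestors (assumes no cycles)."""
    chain = [place]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def facet_counts(object_places, parent):
    """Count objects per facet value, counting each object once per value."""
    counts = Counter()
    for places in object_places.values():
        seen = set()
        for place in places:
            seen.update(ancestors(place, parent))
        counts.update(seen)
    return counts

# Made-up objects tagged with places.
objects = {"o1": ["Sofia"], "o2": ["Plovdiv"], "o3": ["Paris"], "o4": ["Sofia", "Plovdiv"]}
counts = facet_counts(objects, PARENT)
```

Collecting ancestors into a set per object keeps an object tagged with both Sofia and Plovdiv from being counted twice under Bulgaria.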
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
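Jittering can be sketched as adding a small random offset to each coordinate, so that markers sharing one point spread out. This is a minimal sketch, not the EFD code: the centre coordinates for Bulgaria, the half-degree radius and the fixed seed are all assumptions chosen for illustration.

```python
import random

def jitter_all(points, radius_deg=0.5, seed=0):
    """Shift every (lat, lon) by a uniform random offset of at most radius_deg."""
    rng = random.Random(seed)  # fixed seed so the map looks the same on reload
    return [
        (lat + rng.uniform(-radius_deg, radius_deg),
         lon + rng.uniform(-radius_deg, radius_deg))
        for lat, lon in points
    ]

# Assumed approximate centre of Bulgaria, repeated for 100 objects.
centre = (42.7, 25.5)
jittered = jitter_all([centre] * 100)
```

A uniform square offset is the simplest choice; a real map would likely scale the radius to the size of the country so flags stay inside its borders.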
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
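What inverse pairs and symmetric properties buy you can be sketched as a small materialization step: for every stated triple, add the triple seen from the other end. The property and concept names below (`ex:...`) are hypothetical, not actual GVP properties, and a real triple store would do this through its own OWL inference rather than application code.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Return triples plus those implied by inverse and symmetric properties."""
    inv = dict(inverse_of)
    inv.update({q: p for p, q in inverse_of.items()})  # inverseOf works both ways
    inv.update({p: p for p in symmetric})              # symmetric = self-inverse
    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result

# Hypothetical relation pair and symmetric relation, for illustration only.
INVERSE_OF = {"ex:producedBy": "ex:produces"}
SYMMETRIC = ["ex:distinguishedFrom"]

stated = {
    ("ex:etching", "ex:producedBy", "ex:etcher"),
    ("ex:bowl", "ex:distinguishedFrom", "ex:cup"),
}
inferred = materialize_inverses(stated, INVERSE_OF, SYMMETRIC)
```

Stating each relation only once in the source data and deriving the other direction keeps the two directions from drifting apart.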
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data, Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014)
E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking
Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
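The idea above can be sketched in a few lines: turn each name field of the record into its own Person, link it back to the main person, and add a couple of likely (not certain) inferences before matching. Everything in this sketch is an assumption for illustration: the `Person` class, the two inferences (spouse gender, spouse birth name from the maiden name) and the exact-name matching are not the EHRI implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)

RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}
INVERSE = {"spouse": "spouse", "child": "parent", "sibling": "sibling"}

def persons_from_record(rec):
    """Create one Person per person mentioned in the record."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    persons = [main]
    for fld, rel in RELATION_FIELDS.items():
        if fld not in rec:
            continue
        other = Person(rec[fld])
        main.relations.append((rel, other.name))
        other.relations.append((INVERSE[rel], main.name))
        if rel == "spouse":
            # Likely inferences for a 1921 marriage record, not certainties.
            if main.gender in ("Male", "Female"):
                other.gender = "Female" if main.gender == "Male" else "Male"
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                other.alt_names.append(f"{other.name.split()[0]} {maiden}")
        persons.append(other)
    return persons

def candidate_matches(person, database):
    """Very naive matching: any shared name form, ignoring case."""
    keys = {n.casefold() for n in [person.name, *person.alt_names]}
    return [p for p in database
            if keys & {n.casefold() for n in [p.name, *p.alt_names]}]

record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = persons_from_record(record)
```

Real matching would need fuzzier name comparison plus dates and places as extra evidence; the point here is only that the inferred birth name gives the spouse a second key to match on.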
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
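Why matching to a gazetteer also deduplicates can be shown with a toy example: once spelling variants resolve to the same gazetteer id, they collapse into one place. The gazetteer entries and ids (`g1`, `g2`) below are made up, and the matching is plain normalized-label lookup, much simpler than a real pipeline would be.

```python
import unicodedata

def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())

def match_places(names, gazetteer):
    """Group input names by matched gazetteer id (None = unmatched)."""
    index = {}
    for gid, labels in gazetteer.items():
        for label in labels:
            index.setdefault(normalize(label), gid)
    matched = {}
    for name in names:
        matched.setdefault(index.get(normalize(name)), []).append(name)
    return matched

# Hypothetical gazetteer with alternate labels per place.
GAZETTEER = {
    "g1": ["Kraków", "Krakau", "Cracow"],
    "g2": ["Lviv", "Lwów", "Lemberg"],
}
result = match_places(["Krakow", "KRAKAU", "Lwow", "Lemberg", "Atlantis"], GAZETTEER)
```

Unmatched names are kept under `None` so they can be reviewed by hand instead of silently dropped.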
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
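Both the neighbour lists and the "differencing" come down to cosine similarity between word vectors. The sketch below uses made-up 3-dimensional toy vectors, not the INRIA model (real word2vec vectors have hundreds of dimensions learned from the interview text), purely to show the arithmetic.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, vectors, exclude=(), k=3):
    """Words whose vectors are most similar to the query vector."""
    scored = [(w, cosine(query, v)) for w, v in vectors.items() if w not in exclude]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

def analogy(a, b, c, vectors):
    """Word nearest to vector(a) - vector(b) + vector(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(query, vectors, exclude={a, b, c}, k=1)[0][0]

# Made-up toy vectors; dimensions loosely read as (organisation, Soviet, Nazi).
TOY = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
answer = analogy("KGB", "Stalin", "Hitler", TOY)
```

Excluding the three input words from the candidates matters: otherwise one of them is often the nearest neighbour of the query vector.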
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
455 BRITISH LIBRARY DATA MODEL
Pragmatic data model that reuses several ontologies and adds own props
456 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library ( since 2014) uses Koha open sourcesoftware RDF in the core and rdf2marc conversions Pragmatic data modelthat reuses several ontologies and adds own props Enables a number of agile apps egsearch related books on Kiosk
httpdatadeichmannomarc2rdf
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data, Semantic Representation. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in J. Cobb, Getty Vocabs: Why LOD? Why Now? (2014)
E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data (language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.)
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
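A minimal sketch of the first step (creating Person records for the people mentioned), using the sample record above. The Person class, the relation labels and the two spouse inferences are assumptions for illustration, not the EHRI data model; matching against other records is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping from "additional name" fields to the relation they express.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


@dataclass
class Person:
    name: str
    relation: str        # relation to the main person of the source record
    source_record: str   # id of the record that mentions this person
    inferred: Dict[str, str] = field(default_factory=dict)


def derive_persons(record: Dict[str, str]) -> List[Person]:
    """Create one Person per mentioned name, with a few likely inferences."""
    persons: List[Person] = []
    for field_name, relation in RELATION_FIELDS.items():
        name = record.get(field_name)
        if not name:
            continue
        person = Person(name=name, relation=relation, source_record=record["id"])
        if relation == "spouse":
            # Assumed inferences: the spouse shares the marriage date,
            # and the maiden name in the record belongs to the spouse.
            if "nameSpouseMaiden" in record:
                person.inferred["maidenName"] = record["nameSpouseMaiden"]
            if "dateMarriage" in record:
                person.inferred["dateMarriage"] = record["dateMarriage"]
        persons.append(person)
    return persons


RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for p in derive_persons(RECORD):
        print(p)
```

The derived records carry their source record id, so a later matching step can link them back and build the person network.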
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
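To show why matching to a gazetteer also gives deduplication, here is a toy sketch using fuzzy string matching from the Python standard library. It is not the EHRI pipeline: the place names, the gazetteer ids ("gn:placeholder-…") and the 0.8 cutoff are all made up for the example.

```python
import difflib
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def match_places(names: Iterable[str], gazetteer: Dict[str, str],
                 cutoff: float = 0.8) -> Dict[str, Optional[str]]:
    """Map each source name to the id of the closest gazetteer name, or None."""
    by_norm = {normalize(label): place_id for label, place_id in gazetteer.items()}
    matches: Dict[str, Optional[str]] = {}
    for name in names:
        close = difflib.get_close_matches(normalize(name), list(by_norm), n=1, cutoff=cutoff)
        matches[name] = by_norm[close[0]] if close else None
    return matches


def deduplicate(matches: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group source names that matched the same gazetteer entry."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, place_id in matches.items():
        if place_id is not None:
            groups[place_id].append(name)
    return dict(groups)


# Toy gazetteer with placeholder ids (not real Geonames identifiers).
GAZETTEER = {"Warsaw": "gn:placeholder-1", "Krakow": "gn:placeholder-2"}

if __name__ == "__main__":
    found = match_places(["Warsaw", "warsaw ", "Krakov", "Atlantis"], GAZETTEER)
    print(found)
    print(deduplicate(found))
```

Variant spellings that resolve to the same gazetteer entry fall into one group, which is the deduplication effect. A production matcher would also use place type, hierarchy and coordinates, not just names.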
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
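The vector arithmetic behind "KGB - Stalin + Hitler = SS" can be sketched with hand-made toy vectors. These are not trained word2vec embeddings; the 3-dimensional values are invented so that the arithmetic is easy to follow. The sketch also assumes the scores in the table are cosine similarities (higher means closer).

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors: Dict[str, Sequence[float]], a: str, b: str, c: str) -> str:
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors: axis 0 ~ "Soviet", axis 1 ~ "Nazi", axis 2 ~ "security organization".
TOY_VECTORS: Dict[str, Sequence[float]] = {
    "KGB": [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS": [0.0, 1.0, 1.0],
    "guard": [1.0, 1.0, 0.2],
}

if __name__ == "__main__":
    print(analogy(TOY_VECTORS, "KGB", "Stalin", "Hitler"))
```

With real embeddings the same lookup is usually done by the embedding library's own similarity search; the point here is only the subtract-and-add step followed by a nearest-neighbour search.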
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
4.5.6 FIRST LIBRARY THAT RUNS ON RDF
Oslo Public Library (http://data.deichman.no, since 2014) uses Koha open source software, RDF in the core and rdf2marc conversions. Pragmatic data model that reuses several ontologies and adds own props. Enables a number of agile apps, e.g. search related books on Kiosk
See also: marc2rdf
4.5.6.1 OSLO PUBLIC LIBRARY DATA
d_res:tnr_749919
  rdf:type bibo:Document, fabio:Manifestation ;
  dc:title "About time" ;
  d:titleURLized "about_time" ;
  fabio:hasSubtitle "Einstein's unfinished revolution" ;
  ctag:tagged d_keyword:imaginary, d_keyword:dilation, d_keyword:time, d_keyword:tidsreiser, d_keyword:tidsdilatasjon ;
  foaf:depiction <http://covers.openlibrary.org/b/id/96714-M.jpg>,
    <http://covers.openlibrary.org/b/id/96715-M.jpg>,
    <http://www.bokkilden.no/SamboWeb/servlet/VisBildeServlet?produktId=81081> ;
  owl:sameAs <http://purl.org/NET/book/isbn/0140174613#book>,
    <http://www4.wiwiss.fu-berlin.de/bookmashup/books/0140174613> ;
  dc:language lexvo:eng ;
  d:bibliofilID "931138" ;
  dc:format <http://data.deichman.no/format/Book> ;
  d:location_signature "Dav" ;
  dc:publisher d_org:penguin ;
  bibo:numPages "316" ;
  d:physicalDescription "fig." ;
  d:bibsubject d_subject:einstein_albert, d_subject:tid_metafysikk ;
  fabio:isManifestationOf d_work:x24918900_about_time ;
  d:signatureNote "07x0619gq" ;
  d:bindingInfo <http://data.deichman.no/bindingInfo/h> ;
  d:bsID "0181541" ;
  dc:description "Bibliografi: s. 293-294"@no ;
  d:priceInfo "Nkr 170.00" ;
  foaf:isPrimaryTopicOf <http://www.goodreads.com/book/show/286461>,
    <http://www.librarything.com/work/23493> ;
  dc:identifier "749919" ;
  d:dewey "115", "530.11" ;
  d:location_dewey "530.11" ;
  bibo:isbn "9780140174618", "0140174613" .
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174-207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies
Records in Context (RiC): new upcoming semantic standard by ICA
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM
Progress report (2015)
Conceptual Model 1.0 (Sep 2016). Documents key components of archival description, properties of each, relations between them. Mlist for comments
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD, some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
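One way to picture "leveraging coreferencing across VIAF and Wikidata" is to treat each coreference link as an edge and merge linked identifiers into clusters. The sketch below uses a plain union-find for this; the identifiers ("wd:Q1", "viaf:1", …) are invented placeholders, and this is not how VIAF or Wikidata compute their own clusters.

```python
from typing import Dict, Iterable, List, Set, Tuple


def coreference_clusters(links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Merge identifiers connected by coreference links into clusters (union-find)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters: Dict[str, Set[str]] = {}
    for identifier in list(parent):
        clusters.setdefault(find(identifier), set()).add(identifier)
    return list(clusters.values())


# Placeholder links: a Wikidata item coreferenced to VIAF and to GND,
# and another Wikidata item coreferenced only to RKDartists.
LINKS = [("wd:Q1", "viaf:1"), ("wd:Q1", "gnd:9"), ("wd:Q2", "rkd:5")]

if __name__ == "__main__":
    for cluster in coreference_clusters(LINKS):
        print(sorted(cluster))
```

Once identifiers are clustered, name forms from every member of a cluster can be pooled, which is where the library-tradition and LOD-tradition datasets complement each other.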
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists, works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAPDRGN project)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013): Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work) Watch the video (D. Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Alexiev V. Using DBPedia in Europeana Food and Drink. DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi, x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
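Jittering in its simplest form adds a small random offset to each marker. The sketch below is an assumption about how this could be done, not the EFD code: the centre coordinates are approximate, and the offset size (1 degree) and the fixed seed are arbitrary choices for the example.

```python
import random
from typing import List, Optional, Tuple


def jitter_points(lat: float, lon: float, count: int,
                  max_offset_deg: float = 1.0,
                  seed: Optional[int] = None) -> List[Tuple[float, float]]:
    """Spread `count` markers randomly in a square around one (lat, lon) point."""
    rng = random.Random(seed)  # a fixed seed gives the same layout on every run
    return [
        (lat + rng.uniform(-max_offset_deg, max_offset_deg),
         lon + rng.uniform(-max_offset_deg, max_offset_deg))
        for _ in range(count)
    ]


if __name__ == "__main__":
    # Approximate centre of Bulgaria, assumed for illustration.
    for point in jitter_points(42.7, 25.5, count=5, seed=1):
        print(point)
```

A fixed seed (or an offset derived from the object id) keeps each flag in the same place between page loads, so markers do not jump around when the map is redrawn.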
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
4561 OSLO PUBLIC LIBRARY DATAd_restnr_749919 rdftype biboDocument fabioManifestation dctitle About time dtitleURLized about_time fabiohasSubtitle Einsteins unfinished revolution ctagtagged d_keywordimaginary d_keyworddilation d_keywordtime d_keywordtidsreiser d_keywordtidsdilatasjon foafdepiction lthttpcoversopenlibraryorgbid96714‐Mjpggt lthttpcoversopenlibraryorgbid96715‐Mjpggt lthttpwwwbokkildennoSamboWebservletVisBildeServletproduktId=81081gt owlsameAs lthttppurlorgNETbookisbn0140174613bookgt lthttpwww4wiwissfu‐berlindebookmashupbooks0140174613gt dclanguage lexvoeng dbibliofilID 931138 dcformat lthttpdatadeichmannoformatBookgt dlocation_signature Dav dcpublisher d_orgpenguin bibonumPages 316 dphysicalDescription fig dbibsubject d_subjecteinstein_albert d_subjecttid_metafysikk fabioisManifestationOf d_workx24918900_about_time dsignatureNote 07x0619gq dbindingInfo lthttpdatadeichmannobindingInfohgt dbsID 0181541 dcdescription Bibliografi s 293‐294no dpriceInfo Nkr 17000 foafisPrimaryTopicOf lthttpwwwgoodreadscombookshow286461gt lthttpwwwlibrarythingcomwork23493gt dcidentifier 749919 ddewey 115 53011 dlocation_dewey 53011 biboisbn 9780140174618 0140174613
46 ARCHIVAL ONTOLOGIES3 attempts to represent EAD as RDF but IMHO neither is very good
Eg The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology(Journal of Archival Organization 9174ndash207 2011) proposes to represent the EADlevels hierarchy (from Fonds down to Items) as ve parallel CRM hierarchies
Records in Context (RiC) new upcoming semantic standard by ICA
Addresses the scope of EAD EAC EAG in one framework Inspired by nationalstandards FRBR (FRBR-LRM) CIDOC CRM
(2015) 10 (Sep 2016) Document key components of archival description
properties of each relations between themOntology after nalizing the Conceptual Model Expressed in OWL will includesemantic mapping to similar concepts developed by related communities
Progress report Mlist for commentsConceptual Model
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP is well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk / support on Twitter and Google group (see home page).
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014.
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
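The effect of declaring relations as inverse pairs or as symmetric can be sketched in a few lines of Python: stating a relation in one direction lets the other direction be inferred. This is only an illustration of the idea, not how GraphDB or the GVP ontology implements it, and the `ex:` relation names are hypothetical.

```python
def add_inverses(triples, inverse_of, symmetric):
    """Return the input triples plus inferred inverse and symmetric statements."""
    # Make the inverse table usable in both directions.
    inv = dict(inverse_of)
    inv.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in symmetric:
            result.add((o, p, s))          # self-inverse relation
        elif p in inv:
            result.add((o, inv[p], s))     # paired inverse relation
    return result

# Hypothetical relation names, for illustration only.
inverse_of = {"ex:producedBy": "ex:produces"}
symmetric = {"ex:relatedTo"}
triples = [("ex:painting", "ex:producedBy", "ex:painter"),
           ("ex:a", "ex:relatedTo", "ex:b")]
closed = add_inverses(triples, inverse_of, symmetric)
```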
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced the chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
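The idea above can be sketched as follows. This is a hypothetical illustration, not EHRI code: the `Person` class, the `expand_record` function, and the gender inference rule are assumptions; only the field names and sample values come from the record on the slide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    name: str
    relation: str                      # relation to the record's main person
    gender: Optional[str] = None
    maiden_name: Optional[str] = None

def expand_record(rec):
    """Create a Person for the main subject and for each relative mentioned."""
    main_name = f"{rec['firstName']} {rec['lastName']}"
    people = [Person(main_name, "self", rec.get("gender"))]
    if "nameSpouse" in rec:
        # Likely inference (an assumption for records of this period):
        # the spouse has the opposite recorded gender.
        spouse_gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender"))
        people.append(Person(rec["nameSpouse"], "spouse",
                             spouse_gender, rec.get("nameSpouseMaiden")))
    for field, relation in (("nameChild", "child"), ("nameSibling", "sibling")):
        if field in rec:
            people.append(Person(rec[field], relation))
    return people

rec = {"firstName": "John", "lastName": "Smith", "gender": "Male",
       "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
       "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
       "nameSibling": "Jack Jones"}
people = expand_record(rec)
```

Each derived Person could then be compared against existing records, for example on name plus inferred attributes, to find candidate matches.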
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
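A minimal sketch of name-based place matching, using Python's standard `difflib` for fuzzy comparison. The actual EHRI pipeline is not described on the slide, so the normalization, the cutoff, and the tiny gazetteer with made-up IDs are all assumptions. Deduplication falls out naturally: variant spellings that resolve to the same gazetteer ID are the same place.

```python
import difflib

def normalize(name):
    """Lower-case, drop commas, and collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").split())

def match_places(source_names, gazetteer, cutoff=0.8):
    """Map each source name to a gazetteer ID, or None if nothing is close enough."""
    norm_to_id = {normalize(n): gid for n, gid in gazetteer.items()}
    matches = {}
    for name in source_names:
        hit = difflib.get_close_matches(normalize(name), list(norm_to_id),
                                        n=1, cutoff=cutoff)
        matches[name] = norm_to_id[hit[0]] if hit else None
    return matches

gazetteer = {"Warsaw": "geo:1", "Krakow": "geo:2"}   # made-up IDs
source = ["Warsaw", "warsaw ", "Krakov", "Atlantis"]
matches = match_places(source, gazetteer)
```

A production matcher would also need to use place type, country, and hierarchy to disambiguate places that share a name.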
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews.
ONTO: Place enrichment, Person name recognition.
INRIA: word2vec experiments.
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
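The vector arithmetic behind such "semantic differencing" can be sketched with toy data. The vectors below are invented three-dimensional examples (the classic king/queen analogy), not output from the INRIA word2vec models, and the function names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Return the word closest (by cosine) to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

# Invented toy vectors; dimensions loosely mean (royal, male, female).
toy = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.1, 0.1, 0.1],
}
result = analogy(toy, "king", "man", "woman")
```

Real embeddings have hundreds of dimensions and are learned from text, but the nearest-neighbour lookup works the same way.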
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
4.6 ARCHIVAL ONTOLOGIES
3 attempts to represent EAD as RDF, but IMHO none is very good.
E.g. The Semantic Mapping of Archival Metadata to the CIDOC CRM Ontology (Journal of Archival Organization 9:174–207, 2011) proposes to represent the EAD levels hierarchy (from Fonds down to Items) as five parallel CRM hierarchies.
Records in Context (RiC): new upcoming semantic standard by ICA.
Addresses the scope of EAD, EAC, EAG in one framework. Inspired by national standards, FRBR (FRBR-LRM), CIDOC CRM.
Progress report (2015). Mlist for comments.
Conceptual Model 1.0 (Sep 2016): documents the key components of archival description, the properties of each, and the relations between them.
Ontology: after finalizing the Conceptual Model. Expressed in OWL, will include semantic mapping to similar concepts developed by related communities.
4.6.1 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected: DBPedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM).
(Diagram based on work by M. Hildebrand)
5.1 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator.
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama.
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals.
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos.
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2).
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stematology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). See "Prosopography is Greek for Facebook" (2015), SNAP:DRGN project
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013). Semantic Search.
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but monolithic and has imprecisions. Includes the (in)famous diagram.
461 RIC SAMPLE NETWORK
5 GLAM LOD DATASETS (LODLAM)Some established thesauri and gazetteers as LOD some are interconnectedDBPedia Wikidata VIAF FAST ULAN GeoNames Pleiades TGN LCSH AATIconClass Joconde SVCN Wordnet etcNot shown large collection LODs like Europeana (EDM) British Museum (CIDOCCRM) YCBA (CIDOC CRM) Rijksmuseum (EDM)(Diagram based on work by MHildebrand)
51 WIKIDATATons of info on everything including GLAMs artists artworks etc Eg Frans Hals onReasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
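Jittering simply adds a small random offset to each marker so that objects sharing one coordinate spread out. A minimal sketch, assuming a uniform offset in degrees and a fixed seed; the centre coordinates and the radius are approximate, illustrative values, not the ones used in the app.

```python
import random


def jitter(lat, lon, radius_deg, rng):
    """Move a point by a random offset of at most radius_deg in each of
    latitude and longitude (a square neighbourhood, not a circle)."""
    return (lat + rng.uniform(-radius_deg, radius_deg),
            lon + rng.uniform(-radius_deg, radius_deg))


rng = random.Random(42)   # fixed seed so the map looks the same on each run
center = (42.75, 25.5)    # rough centre of Bulgaria (approximate)
points = [jitter(*center, radius_deg=1.0, rng=rng) for _ in range(5)]
```

A fixed seed keeps markers stable between page loads; a real map would probably also clip the offsets to the country outline.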
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to the Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, the ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk/support on Twitter and Google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
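A SPARQL 1.1 compliant endpoint can be called with a plain HTTP GET that carries the query text in a `query` parameter. The sketch below only builds such a URL and sends nothing over the network; the query itself is a generic illustration that assumes concepts are typed skos:Concept and carry skos:prefLabel, and it is not one of the documented sample queries.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://vocab.getty.edu/sparql"


def query_url(sparql):
    """Build a SPARQL 1.1 Protocol GET URL for the given query text."""
    return ENDPOINT + "?" + urlencode({"query": sparql})


# Illustrative query: ten concepts with their SKOS preferred labels.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept a skos:Concept ; skos:prefLabel ?label .
} LIMIT 10
"""
url = query_url(QUERY)
```

The result format is normally chosen with an HTTP Accept header, for example `application/sparql-results+json`.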
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
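For readers who want to work with these prefixes in code, a small lookup that expands prefixed names into full IRIs may help. The namespace IRIs below are the commonly published ones for each ontology; they are supplied here for illustration and are not quoted from the slide.

```python
NAMESPACES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "iso": "http://purl.org/iso25964/skos-thes#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "prov": "http://www.w3.org/ns/prov#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "wgs": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie):
    """Expand a prefixed name such as 'skos:prefLabel' to a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return NAMESPACES[prefix] + local


label_iri = expand("skos:prefLabel")
array_iri = expand("iso:ThesaurusArray")
```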
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)property, or an ontology value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or are owl:SymmetricProperty, i.e. self-inverse)
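The idea of generating ontology statements from a spreadsheet of relation pairs can be sketched in a few lines. Here a CSV string stands in for the Excel sheet, the relation names and the `ex` prefix are hypothetical, and prefix declarations are left out of the output for brevity, so this is an illustration of the approach rather than the generator that was actually used.

```python
import csv
import io

# Stand-in for the spreadsheet: one row per relation and its inverse.
# A relation listed as its own inverse is treated as symmetric.
SHEET = """relation,inverse
distinguished_from,distinguished_from
locus_setting_for,located_set_in
"""


def to_turtle(sheet_text, prefix="ex"):
    """Emit Turtle statements (without @prefix lines) for each row."""
    lines = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        rel, inv = row["relation"], row["inverse"]
        if rel == inv:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inv} .")
            lines.append(
                f"{prefix}:{inv} a owl:ObjectProperty ; owl:inverseOf {prefix}:{rel} .")
    return "\n".join(lines)


turtle = to_turtle(SHEET)
```

Keeping the pairs in a sheet lets domain experts review the relations without reading RDF.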
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many are described in Getty Vocabs: Why LOD, Why Now (J. Cobb, 2014)
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced the chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database
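The step of creating Person records for the people mentioned could look roughly like this. The field names follow the sample record above; the output layout and the single inference shown (spouse gender) are assumptions made for illustration, and any such inference is only likely, not certain.

```python
def expand_record(rec):
    """Create one Person stub per person mentioned in a record,
    adding a few likely (not certain) inferences."""
    persons = [{
        "role": "self",
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
    }]
    if "nameSpouse" in rec:
        spouse = {"role": "spouse", "name": rec["nameSpouse"]}
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        if rec.get("gender") == "Male":
            spouse["gender"] = "Female"  # likely inference only
        persons.append(spouse)
    for key, role in (("nameChild", "child"), ("nameSibling", "sibling")):
        if key in rec:
            persons.append({"role": role, "name": rec[key]})
    return persons


record = {
    "id": 123456, "firstName": "John", "lastName": "Smith",
    "gender": "Male", "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
people = expand_record(record)
```

Each stub can then be compared against existing Person records, for example by name and approximate dates.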
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
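Both the neighbour lists and the "semantic differencing" line rest on cosine similarity between word vectors. The sketch below shows the arithmetic on tiny hand-made vectors; the three dimensions and all values are invented so that the example is easy to follow, and they are not output of the word2vec model trained on the interviews.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def analogy(vectors, a, b, c):
    """Return the word closest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hand-made toy vectors: (security service, Soviet, German)
vectors = {
    "kgb": [1.0, 1.0, 0.0],
    "stalin": [0.0, 1.0, 0.0],
    "hitler": [0.0, 0.0, 1.0],
    "ss": [1.0, 0.0, 1.0],
    "guard": [1.0, 0.5, 0.5],
}
answer = analogy(vectors, "kgb", "stalin", "hitler")
```

With real embeddings the same two functions apply unchanged; only the vectors come from the trained model.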
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And coreferencing them to Geonames, so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
5 GLAM LOD DATASETS (LODLAM)
Some established thesauri and gazetteers as LOD; some are interconnected
DBpedia, Wikidata, VIAF, FAST, ULAN, GeoNames, Pleiades, TGN, LCSH, AAT, IconClass, Joconde, SVCN, Wordnet, etc.
Not shown: large collection LODs like Europeana (EDM), British Museum (CIDOC CRM), YCBA (CIDOC CRM), Rijksmuseum (EDM)
(Diagram based on work by M. Hildebrand)
51 WIKIDATA
Tons of info on everything, including GLAMs, artists, artworks, etc. E.g. Frans Hals on Reasonator
511 WIKIDATA GENEALOGY
Family tree of Barack Obama
512 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
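A hunt for missing inventory numbers can be expressed as a Wikidata query for paintings that are in a given collection but lack an inventory number. The sketch below only builds the query text, using the Wikidata properties P31 (instance of), P195 (collection) and P217 (inventory number) and the item Q3305213 (painting). The collection id in the usage line is assumed to be the J. Paul Getty Museum and should be verified before use.

```python
import re


def missing_inventory_query(collection_qid, limit=100):
    """Build a Wikidata SPARQL query for paintings in a collection
    that have no inventory number statement."""
    if not re.fullmatch(r"Q\d+", collection_qid):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""
SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;            # instance of: painting
            wdt:P195 wd:{collection_qid} .   # collection
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inv }}  # no inventory number
}} LIMIT {int(limit)}
"""


# Q731126 is assumed to be the J. Paul Getty Museum; check before running.
query = missing_inventory_query("Q731126", limit=5)
```

The text can be pasted into the Wikidata Query Service, where the wd: and wdt: prefixes are predefined.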
515 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this
516 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
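Comparing name forms across datasets usually starts with a light normalization, followed by a set intersection. A minimal sketch: the normalization rules (case folding, accent stripping, punctuation removal) are one reasonable choice, and the two name lists are illustrative examples of library-style and LOD-style forms, not the actual records analyzed.

```python
import unicodedata


def normalize(name):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in kept.lower())
    return " ".join(cleaned.split())


def shared_forms(names_a, names_b):
    """Name forms that two datasets have in common after normalization."""
    return {normalize(n) for n in names_a} & {normalize(n) for n in names_b}


# Illustrative name forms only
library_style = ["Cranach, Lucas", "Cranach, Lucas, the Elder"]
lod_style = ["Lucas Cranach", "Lucas Cranach the Elder", "cranach, lucas"]
common = shared_forms(library_style, lod_style)
```

Word order is kept on purpose: inverted forms ("Cranach, Lucas") and direct forms ("Lucas Cranach") stay distinct, which is one reason the overlap between the two traditions is small.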
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
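The flavour of such inference can be shown with a single simplified rule: a thing is "from" a place if it was produced there or in any place that falls within it. The sketch below is a toy forward-chaining computation with hypothetical relation names; the actual implementation uses GraphDB rules over CIDOC CRM properties and covers many more paths than this.

```python
def transitive_closure(pairs):
    """Naive transitive closure of a set of (narrower, wider) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def thing_from_place(produced_at, falls_within):
    """Derive (thing, place) pairs: the production place plus every
    wider place that contains it."""
    within = transitive_closure(falls_within)
    result = set(produced_at)
    for thing, place in produced_at:
        result |= {(thing, wider) for narrower, wider in within
                   if narrower == place}
    return result


# Toy data, for illustration only
produced_at = {("drawing1", "Amsterdam")}
falls_within = {("Amsterdam", "Netherlands"), ("Netherlands", "Europe")}
derived = thing_from_place(produced_at, falls_within)
```

A search for things from "Europe" then also finds the drawing produced in Amsterdam, without the user knowing how the place hierarchy is modeled.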
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repository, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on Twitter and Google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
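To give a feel for how a client would talk to the SPARQL endpoint, here is a hedged sketch that only builds a SPARQL Protocol GET request and never sends it. The endpoint URL is the one listed on the slide (whether it is still served there is not checked). The query uses only standard SKOS properties and the `aat:` namespace, and is an assumption about what a minimal query could look like, not one of the 100 sample queries.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://vocab.getty.edu/sparql"  # endpoint URL as listed on the slide

# Illustrative query: ten AAT concepts with an English preferred label.
QUERY = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX aat:  <http://vocab.getty.edu/aat/>
SELECT ?concept ?label WHERE {
  ?concept skos:inScheme aat: ;
           skos:prefLabel ?label .
  FILTER(lang(?label) = "en")
}
LIMIT 10
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:40])
```

Passing `request` to `urllib.request.urlopen` would actually run the query; that step is left out so the sketch works offline.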
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies
Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to: a custom sub-class, a custom (sub-)property, or an ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty if self-inverse)
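The idea of generating ontology axioms from a table of relation pairs can be sketched as follows. The two relations, the `ex:` prefix and both function names are hypothetical, standing in for rows of the spreadsheet; this is not the actual GVP generator.

```python
# Hypothetical rows of the kind a spreadsheet might hold: (relation, inverse).
# A relation listed as its own inverse is treated as symmetric.
RELATION_PAIRS = [
    ("ex:locatedIn", "ex:locationOf"),
    ("ex:relatedTo", "ex:relatedTo"),
]


def ontology_axioms(pairs):
    """Emit one owl:inverseOf or owl:SymmetricProperty axiom per row."""
    axioms = []
    for rel, inv in pairs:
        if rel == inv:
            axioms.append((rel, "rdf:type", "owl:SymmetricProperty"))
        else:
            axioms.append((rel, "owl:inverseOf", inv))
    return axioms


def add_inverses(triples, pairs):
    """Materialize the inverse statement for every triple using a known relation."""
    inverse_of = {}
    for rel, inv in pairs:
        inverse_of[rel] = inv
        inverse_of[inv] = rel
    result = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            result.add((o, inverse_of[p], s))
    return result


data = [("ex:a", "ex:locatedIn", "ex:b"), ("ex:a", "ex:relatedTo", "ex:c")]
print(ontology_axioms(RELATION_PAIRS))
print(sorted(add_inverses(data, RELATION_PAIRS)))
```

In a real repository the inverse statements would normally be inferred by the reasoner from the axioms; `add_inverses` only shows what that inference produces.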
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields. PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
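The first two steps (creating Person records for the people mentioned and making likely inferences) might look like the sketch below. The record mirrors the example above. The two inferences shown (the spouse's maiden name and the shared marriage date) are assumptions picked for illustration, and the function name and derived identifiers are hypothetical; the actual EHRI rules may differ.

```python
RECORD = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Source field -> relation of the mentioned person to the main person.
RELATION_FIELDS = (("nameSpouse", "spouse"), ("nameChild", "child"), ("nameSibling", "sibling"))


def derive_persons(rec):
    """Return (persons, relations) for the main person and everyone mentioned."""
    main_id = rec["id"]
    persons = [{
        "id": main_id,
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
    }]
    relations = []
    for field, relation in RELATION_FIELDS:
        name = rec.get(field)
        if not name:
            continue
        person = {"id": f"{main_id}-{relation}", "name": name}  # hypothetical derived id
        if relation == "spouse":
            # Assumed inferences: a marriage date and maiden name apply to the spouse.
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                person["dateMarriage"] = rec["dateMarriage"]
        persons.append(person)
        relations.append((main_id, relation, person["id"]))
    return persons, relations


persons, relations = derive_persons(RECORD)
for p in persons:
    print(p)
```

The derived records can then be compared against other Person records, for example by name, dates and shared relatives.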
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
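Why matching also achieves deduplication can be shown with a toy example: once spelling variants resolve to the same gazetteer entry, the source records fall into one group. This sketch uses exact matching on normalized names against a hand-made gazetteer with made-up identifiers (`G1`, `G2`), which is far simpler than a real matching pipeline.

```python
import unicodedata
from collections import defaultdict

# Hypothetical gazetteer: place id -> known names (the ids are not real Geonames ids).
GAZETTEER = {
    "G1": ["Kraków", "Cracow"],
    "G2": ["Terezín", "Theresienstadt"],
}


def normalize(name):
    """Strip combining diacritics, case and extra whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def match_places(records, gazetteer):
    """Group record ids by matched place id; return (matched, unmatched)."""
    index = {normalize(n): pid for pid, names in gazetteer.items() for n in names}
    matched = defaultdict(list)
    unmatched = []
    for rec_id, name in records:
        place_id = index.get(normalize(name))
        if place_id is None:
            unmatched.append(rec_id)
        else:
            matched[place_id].append(rec_id)
    return dict(matched), unmatched


records = [("r1", "Krakow"), ("r2", "  CRACOW "), ("r3", "Theresienstadt"), ("r4", "Atlantis")]
matched, unmatched = match_places(records, GAZETTEER)
print(matched, unmatched)
```

Records `r1` and `r2` end up under the same place id, so they can be treated as duplicates of one place.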
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
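The arithmetic behind such a query can be sketched with toy vectors. The three-dimensional vectors below are invented so that the example works out; they are not the embeddings trained on the interviews, and the extra word "NKVD" is only there as a distractor. In word2vec output like the table above, a higher value in the "Cos dist" column means the words are more similar.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to a - b + c, excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors with axes (soviet, nazi, security service).
VECTORS = {
    "KGB": [1.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 1.0, 0.0],
    "SS": [0.0, 1.0, 1.0],
    "NKVD": [1.0, 0.0, 0.9],
}

print(analogy(VECTORS, "KGB", "Stalin", "Hitler"))
```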
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (medallic art of the Great War)
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS: POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki
5.1.1 WIKIDATA GENEALOGY
Family tree of Barack Obama
5.1.2 SUM OF ALL PAINTINGS
Wikidata Project Sum of All Paintings. Data used for:
Works by painter across collections (catalogue raisonné). E.g. Frans Hals
5.1.3 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. E.g. Frans Hals on Crotos
5.1.4 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. E.g. US (1k), Getty Museum (2)
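A hunt for missing inventory numbers can be expressed as a SPARQL query for the Wikidata Query Service, where the `wd:` and `wdt:` prefixes are predefined. The sketch below only builds the query text. The properties P31 (instance of), P195 (collection), P217 (inventory number) and the item Q3305213 (painting) are standard Wikidata identifiers. The function name is hypothetical, and the collection item used in the usage line is assumed to be the J. Paul Getty Museum and should be verified on Wikidata before use.

```python
import re


def missing_inventory_query(collection_qid, limit=100):
    """Build a query for paintings in a collection that lack an inventory number."""
    if not re.fullmatch(r"Q[1-9][0-9]*", collection_qid):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""\
SELECT ?painting WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} .
  FILTER NOT EXISTS {{ ?painting wdt:P217 ?inventoryNumber }}
}}
LIMIT {int(limit)}
"""


# Q731126 is assumed here to be the J. Paul Getty Museum; verify before use.
query = missing_inventory_query("Q731126", limit=10)
print(query)
```

Validating the item id before interpolating it keeps arbitrary text out of the query string.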
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site and add the info, like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts: stemmatology (manuscript derivation)
Historiography
Studying charters: prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions, sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: drawings by Rembrandt that are about mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between properties (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed the mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data; SPARQL endpoint; open source (GitHub). Uses GraphDB and the Elasticsearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
512 SUM OF ALL PAINTINGS
Data used forWikidata Project Sum of All Paintings
Works by painter across collections (catalogue raisonneacute) Eg Frans Hals
513 CROTOS
Excellent image search Shows links to WD Wikimedia Commons original website EgFrans Hals on Crotos
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (eg <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, ie self-inverse)
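A minimal sketch of spreadsheet-driven generation, assuming the sheet is exported as CSV with one row per relation and its inverse. The column names, the ex: relation names and the function relations_to_turtle are all hypothetical; the actual GVP spreadsheet and generator are not shown in these slides.

```python
import csv
import io

# Hypothetical spreadsheet export: relation names are made up for illustration.
SHEET = """relation,inverse
ex:relatedTo,ex:relatedTo
ex:produces,ex:producedBy
"""


def relations_to_turtle(sheet_csv):
    """Emit Turtle declarations: symmetric props once, other props as inverse pairs."""
    lines = []
    for row in csv.DictReader(io.StringIO(sheet_csv)):
        rel, inv = row["relation"], row["inverse"]
        if rel == inv:
            # A self-inverse relation is declared symmetric.
            lines.append(
                "{} a owl:ObjectProperty, owl:SymmetricProperty .".format(rel)
            )
        else:
            # Each direction points at the other via owl:inverseOf.
            lines.append(
                "{} a owl:ObjectProperty ; owl:inverseOf {} .".format(rel, inv)
            )
            lines.append(
                "{} a owl:ObjectProperty ; owl:inverseOf {} .".format(inv, rel)
            )
    return "\n".join(lines)


if __name__ == "__main__":
    print(relations_to_turtle(SHEET))
```

Generating both directions from a single row keeps the pairs in sync, which is the point of driving the ontology from a table.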
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
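For readers who want to try the endpoint outside the integrated UI, here is a sketch of building a SPARQL protocol request URL in Python. The endpoint URL is the one named in these slides. The query itself is an assumption (it is not one of the published sample queries), and the property gvp:prefLabelGVP should be checked against the GVP documentation before use.

```python
from urllib.parse import urlencode

ENDPOINT = "http://vocab.getty.edu/sparql"  # endpoint named in the slides

# Assumed query: list a few AAT concepts with their GVP-preferred labels.
QUERY = """
PREFIX gvp: <http://vocab.getty.edu/ontology#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xl: <http://www.w3.org/2008/05/skos-xl#>
PREFIX aat: <http://vocab.getty.edu/aat/>
SELECT ?concept ?label WHERE {
  ?concept skos:inScheme aat: ;
           gvp:prefLabelGVP/xl:literalForm ?label .
} LIMIT 10
"""


def request_url(query, endpoint=ENDPOINT):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query.strip()})


if __name__ == "__main__":
    # Only prints the URL; fetching it is left to the reader.
    print(request_url(QUERY))
```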
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb, 2014. Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced the chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
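The idea can be sketched in Python as below. This is not the EHRI implementation: the Person structure, the relation mapping and the inferences (maiden name and marriage date for the spouse, a "born after marriage" hint for the child) are assumptions chosen only to show the shape of the approach. Field names follow the sample record.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    name: str
    relation: str        # relation to the main person of the source record
    source_record: str
    inferred: Dict[str, str] = field(default_factory=dict)


# Hypothetical mapping from "additional name" fields to relations.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec: dict) -> List[Person]:
    """Create Person records for people mentioned in a record, with likely inferences."""
    persons = []
    for fld, relation in RELATION_FIELDS.items():
        name = rec.get(fld)
        if not name:
            continue
        person = Person(name=name, relation=relation, source_record=rec["id"])
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person.inferred["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("dateMarriage"):
                person.inferred["dateMarriage"] = rec["dateMarriage"]
        if relation == "child" and rec.get("dateMarriage"):
            # Likely, not certain: the child was born after the marriage.
            person.inferred["bornAfter"] = rec["dateMarriage"]
        persons.append(person)
    return persons


RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for p in derive_persons(RECORD):
        print(p)
```

The derived records, each marked with its source record and inferred attributes, are then candidates for matching against existing Person records.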
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
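The "semantic differencing" above is vector arithmetic followed by a nearest-neighbour lookup by cosine. A minimal sketch with toy 3-dimensional vectors that are made up for illustration (real word2vec vectors have hundreds of dimensions and are learned from the interview text); analogy is a hypothetical helper name.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this sketch. Axes: [organisation, Soviet, Nazi].
VECTORS = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.6, 0.3, 0.3],
}


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


if __name__ == "__main__":
    print(analogy("KGB", "Stalin", "Hitler", VECTORS))
```

With these toy vectors the target is exactly the vector assigned to "SS", so the lookup returns it. With learned vectors the match is only approximate.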
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War.
6134 NOMISMA
Shared authorities for numismatics. Eg a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. (Blog, Wiki)
513 CROTOS
Excellent image search. Shows links to WD, Wikimedia Commons, original website. Eg Frans Hals on Crotos.
514 YOU CAN HELP TOO
Hunting for missing inventory numbers (99k of 140k). Important because <collection, inventory number> is used to identify the painting. Eg US (1k), Getty Museum (2).
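A hunt like this can be expressed as a query on the Wikidata Query Service. The sketch below only builds the query text in Python. It assumes the Wikidata identifiers P31 (instance of), Q3305213 (painting), P195 (collection) and P217 (inventory number), and assumes Q731126 is the J. Paul Getty Museum item; verify that QID before relying on it. missing_inventory_query is a hypothetical helper, not the tool used on the slide.

```python
import re


def missing_inventory_query(collection_qid, limit=100):
    """SPARQL text: paintings in a collection that have no inventory number."""
    if not re.fullmatch(r"Q[1-9][0-9]*", collection_qid):
        raise ValueError("not a Wikidata QID: " + collection_qid)
    # wd: and wdt: are predefined prefixes on the Wikidata Query Service.
    return (
        "SELECT ?painting WHERE {\n"
        "  ?painting wdt:P31 wd:Q3305213 ;\n"
        "            wdt:P195 wd:" + collection_qid + " .\n"
        "  FILTER NOT EXISTS { ?painting wdt:P217 [] }\n"
        "} LIMIT " + str(int(limit))
    )


if __name__ == "__main__":
    # Q731126 is assumed to be the J. Paul Getty Museum.
    print(missing_inventory_query("Q731126", limit=10))
```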
515 LETS FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
516 HISTROPEDIA
Timelines of everything. Eg paintings by Leonardo.
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors, including Getty ULAN and Wikidata. Eg coreferencing cluster of Spinoza.
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations, Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (eg GND) or non-constituent (eg RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
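One simple way to quantify how many name forms two datasets share is set overlap after light normalization. The sketch below is an assumption about how such a comparison could be done, not the method of the study. The name lists are illustrative, not the actual records from any of the seven datasets.

```python
def normalize(name):
    """Lowercase, drop commas and collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").split())


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b)


# Illustrative name forms only: inverted forms are typical of library data,
# direct-order and translated forms are typical of LOD datasets.
LIBRARY_STYLE = {"Cranach, Lucas", "Cranach, Lucas, the Elder", "Lucas Cranach"}
LOD_STYLE = {"Lucas Cranach", "Lucas Cranach the Elder", "Lucas Cranach der Ältere"}

if __name__ == "__main__":
    lib = {normalize(n) for n in LIBRARY_STYLE}
    lod = {normalize(n) for n in LOD_STYLE}
    print("common forms:", lib & lod)
    print("overlap:", jaccard(lib, lod))
```

Normalization here keeps word order, so "Cranach, Lucas" and "Lucas Cranach" stay distinct. That is deliberate: the permutations are exactly what makes the two traditions differ.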
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, eg:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Projects provide research functions and are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, eg Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
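To give a feel for what such rules compute, here is a much simplified forward-chaining sketch in Python for one Fundamental Relation, Thing from Place. It is not GraphDB rule syntax and not the ResearchSpace rule set: only three illustrative rules are shown, the name FR_from_place is hypothetical, and the real FR draws on more CRM properties than the three used here (P108i_was_produced_by, P7_took_place_at, P89_falls_within).

```python
def infer_thing_from_place(triples):
    """Apply three toy rules to a fixpoint and return only the newly inferred triples."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if s2 != o:
                    continue
                # Rule 1: P89_falls_within is transitive.
                if p == "P89_falls_within" and p2 == "P89_falls_within":
                    new.add((s, "P89_falls_within", o2))
                # Rule 2: a thing is "from" the place where it was produced.
                if p == "P108i_was_produced_by" and p2 == "P7_took_place_at":
                    new.add((s, "FR_from_place", o2))
                # Rule 3: "from" a place implies "from" every broader place.
                if p == "FR_from_place" and p2 == "P89_falls_within":
                    new.add((s, "FR_from_place", o2))
        if new <= facts:          # nothing new: fixpoint reached
            return facts - set(triples)
        facts |= new


TRIPLES = [
    ("drawing1", "P108i_was_produced_by", "production1"),
    ("production1", "P7_took_place_at", "Amsterdam"),
    ("Amsterdam", "P89_falls_within", "Netherlands"),
    ("Netherlands", "P89_falls_within", "Europe"),
]

if __name__ == "__main__":
    for triple in sorted(infer_thing_from_place(TRIPLES)):
        print(triple)
```

Here the input properties feed an intermediate result (the transitive closure of places), which in turn feeds the output relation. That is the kind of dependency the slide's diagram tracks across all 120 rules.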
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
Eg 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
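Jittering can be as simple as adding a small random offset to each marker. A minimal sketch, assuming a uniform offset in degrees; the radius, the approximate centre coordinates and the function name jitter are choices made for this example, not the values used in the EFD app.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=None):
    """Return (lat, lon) moved by a uniform random offset of at most radius_deg per axis."""
    rng = rng or random
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


if __name__ == "__main__":
    # Approximate centre of Bulgaria, used here only as an example.
    centre = (42.7, 25.5)
    rng = random.Random(42)  # seeded so the layout is reproducible
    for _ in range(5):
        print(jitter(centre[0], centre[1], rng=rng))
```

Seeding the generator keeps markers in the same place between page loads, which avoids flags that appear to move.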
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP is well-known and respected in GLAM (see GVP Training Materials)
Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
514 YOU CAN HELP TOO
(99k of 140k) Important because ltcollectioninventory numbergt is used to identify the painting Eg (1k) (2)Hunting for missing inventory numbers
US Getty Museum
515 LETS FIX THE SECOND ONE
like thisFind it on Gettys site add the info
516 HISTROPEDIA
Timelines of everyting Eg paintings by Leonardo
52 VIAFVirtual International Authority File 20 national libraries 10 other contributorsincluding Getty ULAN and Wikidata Eg coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
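A minimal sketch of that idea, assuming the record arrives as a plain dictionary with the field names shown above. The `Person` class, the `MENTIONS` table and the inference rules (opposite gender for the spouse, maiden name as an alternate name) are illustrative assumptions, not EHRI's actual data model. Matching against the database is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, other person's name)


# Hypothetical table: record field -> (relation from subject, inverse relation)
MENTIONS = {
    "nameSpouse": ("spouse", "spouse"),
    "nameChild": ("child", "parent"),
    "nameSibling": ("sibling", "sibling"),
}


def expand_record(rec):
    """Create a Person for the record subject and one for each person mentioned."""
    subject = Person(f'{rec["firstName"]} {rec["lastName"]}', rec.get("gender"))
    people = [subject]
    for key, (rel, inverse) in MENTIONS.items():
        name = rec.get(key)
        if not name:
            continue
        other = Person(name)
        if rel == "spouse":
            # Likely (not certain) inference: spouse has the opposite gender
            if subject.gender == "Male":
                other.gender = "Female"
            elif subject.gender == "Female":
                other.gender = "Male"
            # Likely inference: maiden name gives an alternate name form
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                other.alt_names.append(f"{name.split()[0]} {maiden}")
        subject.relations.append((rel, other.name))
        other.relations.append((inverse, subject.name))
        people.append(other)
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = expand_record(record)
```

Each inferred value should be treated as a hint for matching, not as a fact: a real pipeline would probably record it with lower confidence than the values stated in the source record.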
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
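A toy sketch of how such "semantic differencing" works: add and subtract word vectors, then pick the remaining word whose vector has the highest cosine similarity to the result. The 3-dimensional vectors below are made up for illustration (real word2vec vectors have hundreds of dimensions and are learned from the interview text), and `analogy` is a hypothetical helper, not the INRIA code.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors; dimensions loosely mean (security organization, Soviet, Nazi German)
vectors = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.6, 0.1, 0.1],
}


def analogy(positive, negative, vectors):
    """Return the word closest to sum(positive) - sum(negative), excluding the inputs."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy(["KGB", "Hitler"], ["Stalin"], vectors)
```

With these toy vectors the target is exactly the vector of "SS", so `result` is "SS". On real embeddings the match is only approximate and depends heavily on the corpus.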
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia. Time and nationality come from ULAN, linked through the VIAF id.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
5.1.5 LET'S FIX THE SECOND ONE
Find it on Getty's site, add the info like this.
5.1.6 HISTROPEDIA
Timelines of everything. E.g. paintings by Leonardo.
5.2 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza.
5.2.1 VIAF VS WIKIDATA (2015)
5.3 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata (Wikimania 2013)
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:
The best datasets to use for name enrichment are VIAF and Wikidata.
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata).
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations).
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs.
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists).
A lot can be gained by leveraging coreferencing across VIAF and Wikidata.
Wikidata has great tools for crowd-sourced coreferencing.
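To illustrate why coreferencing pays off for name enrichment, here is a small sketch that merges the name forms of two datasets wherever a coreference link exists. The identifiers and the dictionary layout are hypothetical placeholders, and the name forms are only examples of the "inverted" library style versus the multilingual Wikidata style.

```python
def merge_names(viaf_names, wikidata_names, coref):
    """Union the name sets of coreferenced records.

    viaf_names:     VIAF id -> set of name forms
    wikidata_names: Wikidata id -> set of name forms
    coref:          Wikidata id -> VIAF id (the coreference links)
    """
    merged = {}
    for wd_id, viaf_id in coref.items():
        if wd_id in wikidata_names and viaf_id in viaf_names:
            merged[wd_id] = viaf_names[viaf_id] | wikidata_names[wd_id]
    return merged


# Hypothetical ids and example name forms
viaf_names = {"viaf:EXAMPLE-1": {"Spinoza, Benedictus de", "Spinoza, Baruch"}}
wikidata_names = {"wd:EXAMPLE-1": {"Baruch Spinoza", "Барух Спиноза"}}
coref = {"wd:EXAMPLE-1": "viaf:EXAMPLE-1"}

merged = merge_names(viaf_names, wikidata_names, coref)
common = viaf_names["viaf:EXAMPLE-1"] & wikidata_names["wd:EXAMPLE-1"]
```

In this example `common` is empty while the merged record carries four name forms, which is the pattern the study describes: little overlap between the two traditions, so the union is much richer than either side.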
5.3.1 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN).
5.3.2 WIKIDATA COREFERENCING CAN ENLARGE VIAF
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri, semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAP:DRGN project, 2015)
Such projects support research functions and are sometimes integrated into Virtual Research Environments.
6.1 MELLON "SPACE" PROJECTS
The Andrew W. Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search.
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using the Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
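To give a feel for what one such inference does, here is a heavily simplified sketch in Python (the real implementation uses GraphDB rules, not Python, and covers many more property paths). It assumes that "Thing from Place" can be derived from just one chain: a thing was produced by an event, the event took place at a place, and places nest via falls-within. The CRM property names are real, while `FR_from_place` and the sample data are illustrative names chosen for this sketch.

```python
def infer_thing_from_place(triples):
    """Derive (thing, "FR_from_place", place) triples from one simplified chain."""
    produced_by = {(s, o) for s, p, o in triples if p == "P108i_was_produced_by"}
    took_place = {(s, o) for s, p, o in triples if p == "P7_took_place_at"}
    within = {(s, o) for s, p, o in triples if p == "P89_falls_within"}

    # Transitive closure of the place hierarchy
    closure = set(within)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in within if b == c} - closure
        if not new:
            break
        closure |= new

    inferred = set()
    for thing, event in produced_by:
        for ev, place in took_place:
            if ev != event:
                continue
            inferred.add((thing, "FR_from_place", place))
            # The thing is also "from" every wider place
            for narrow, wider in closure:
                if narrow == place:
                    inferred.add((thing, "FR_from_place", wider))
    return inferred


triples = [
    ("drawing1", "P108i_was_produced_by", "production1"),
    ("production1", "P7_took_place_at", "Amsterdam"),
    ("Amsterdam", "P89_falls_within", "Netherlands"),
    ("Netherlands", "P89_falls_within", "Europe"),
]
inferred = infer_thing_from_place(triples)
```

A search for things from "Europe" then needs only a single lookup on the inferred property, which is the point of precomputing Fundamental Relations.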
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed the mapping of BM data (2M objects) together with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
The Mapping Documentation is very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and its ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p.182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
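The general shape of such an extraction can be sketched as a breadth-first walk down the category tree, with a depth limit and a list of branches to prune. This is an assumption about the approach in outline only: the category names, the depth limit and the pruned branch below are made up for illustration, and the papers above describe the actual method.

```python
from collections import deque


def collect_categories(subcats, root, max_depth, pruned):
    """Walk the category tree breadth-first, skipping pruned branches."""
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend further
        for child in subcats.get(cat, []):
            if child in seen or child in pruned:
                continue
            seen.add(child)
            queue.append((child, depth + 1))
    return seen


# Illustrative category graph: parent -> list of subcategories
subcats = {
    "Food and drink": ["Beverages", "Cuisine", "Food and drink companies"],
    "Beverages": ["Beer", "Tea"],
    "Food and drink companies": ["Restaurant chains"],
    "Beer": ["Beer festivals"],
}
gazetteer_cats = collect_categories(
    subcats, "Food and drink", max_depth=2, pruned={"Food and drink companies"}
)
```

Here the companies branch is dropped entirely and "Beer festivals" is cut off by the depth limit. Both knobs matter because Wikipedia categories drift off-topic quickly as the walk goes deeper.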
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
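Jittering just means adding a small random offset to each marker's coordinates. A minimal sketch, assuming a uniform offset within a fixed number of degrees; the radius, the center coordinates (a rough approximation of the middle of Bulgaria) and the `jitter` helper are illustrative choices, not the values used in the app.

```python
import random


def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most radius_deg per axis."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )


rng = random.Random(42)      # fixed seed so the map looks the same on every run
center = (42.7, 25.5)        # approximate center of Bulgaria (assumption)
points = [jitter(center[0], center[1], rng=rng) for _ in range(5)]
```

A fixed seed (or an offset derived from the object id) keeps each flag in the same spot between page loads, which avoids markers jumping around when the user reloads the map.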
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's "portal to platform" strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: we can easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials.
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust. AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies:

Prefix  Ontology                               Used for
bibo    Bibliography Ontology                  Sources
dc      Dublin Core Elements                   common
dct     Dublin Core Terms                      common
foaf    Friend of a Friend ontology            Contributors
iso     ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                  Basic RDF representation
prov    Provenance Ontology                    Revision history
rdf     Resource Description Framework         Basic RDF representation
rdfs    RDF Schema                             Basic RDF representation
schema  Schema.org                             common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System   Basic vocabulary representation
skosxl  SKOS Extension for Labels              Rich labels
wgs     W3C World Geodetic Survey geo          Geo (TGN)
xsd     XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
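A rough sketch of the generation idea, assuming the spreadsheet has been exported to rows of (relation, inverse relation). The `ex:` namespace, the relation names and the `relations_to_turtle` helper are hypothetical and are not the actual GVP ontology terms. Only the use of `owl:inverseOf` and `owl:SymmetricProperty` comes from the slide.

```python
def relations_to_turtle(rows):
    """Emit Turtle for relation rows of (name, inverse name).

    A relation that is its own inverse becomes an owl:SymmetricProperty,
    all others get an owl:inverseOf link.
    """
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix ex: <http://example.org/ontology#> .",  # hypothetical namespace
        "",
    ]
    for name, inverse in rows:
        if name == inverse:
            lines.append(f"ex:{name} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            lines.append(f"ex:{name} a owl:ObjectProperty ; owl:inverseOf ex:{inverse} .")
    return "\n".join(lines)


# Hypothetical rows, as they might come out of a spreadsheet export
rows = [
    ("teacherOf", "studentOf"),
    ("studentOf", "teacherOf"),
    ("siblingOf", "siblingOf"),
]
turtle = relations_to_turtle(rows)
print(turtle)
```

Keeping the relation list in a spreadsheet lets vocabulary editors review it, while the ontology file is regenerated mechanically whenever a row changes.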
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed (Blog, Wiki)
52 VIAF
Virtual International Authority File: 20 national libraries, 10 other contributors including Getty ULAN and Wikidata. E.g. coreferencing cluster of Spinoza
521 VIAF VS WIKIDATA (2015)
53 GLOBAL AUTHORITY CONTROL
2013-07: Authority Addicts: The New Frontier of Authority Control on Wikidata, Wikimania 2013
2015-01: Wikidata Project Authority Control (initiated by Ontotext)
2015-03: Name Data Sources for Semantic Enrichment, a study for Europeana of datasets including Person/Organization names. Conclusions:

The best datasets to use for name enrichment are VIAF and Wikidata
There are few name forms in common between the library-tradition datasets (dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)
VIAF has more name variations and permutations; Wikidata has more multilingual names (translations)
VIAF is much bigger: 35M persons/orgs. Wikidata has 2.7M persons and maybe 1M orgs
Only 0.5M of Wikidata persons/orgs are coreferenced to VIAF, with maybe another 0.5M coreferenced to other datasets, either VIAF-constituent (e.g. GND) or non-constituent (e.g. RKDartists)
A lot can be gained by leveraging coreferencing across VIAF and Wikidata
Wikidata has great tools for crowd-sourced coreferencing
531 NAMES OF LUCAS CRANACH
Analyzed records of Lucas Cranach in 7 LOD datasets (Wikidata, Freebase, DBpedia, Yago, VIAF, ISNI, ULAN)
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN; RKD artists, works; LoC Authorities; VIAF (not in M-n-M, but on WD); BM persons; BBC YourPaintings; Artsy; etc.
5331 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT: single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:

Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies): "Prosopography is Greek for Facebook" (2015), SNAP:DRGN project

Such projects provide research functions and are sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:

CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM, search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals"
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman)
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed a mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole), and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content, and NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
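A hierarchical facet needs counts rolled up the place hierarchy, so that an object tagged with a city also counts under its country and continent. A minimal sketch, assuming a simple child-to-parent lookup: the PARENT table and the place names are illustrative, not the Geonames data model or the EFD implementation.

```python
from collections import Counter

# Hypothetical child -> parent table; Geonames itself identifies places by numeric ids
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Europe": None,
}

def facet_counts(object_places, parent):
    """Count each object under its own place and under every ancestor place."""
    counts = Counter()
    for place in object_places:
        visited = set()  # guards against accidental cycles in the table
        node = place
        while node is not None and node not in visited:
            visited.add(node)
            counts[node] += 1
            node = parent.get(node)
    return counts

# One entry per object: the most specific place it was enriched with
counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Bulgaria"], PARENT)
```

Here `counts` gives 2 for Sofia, 1 for Plovdiv, and 4 for both Bulgaria and Europe, which is what the facet would display at each level.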
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
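Jittering can be sketched as adding a small random offset to each marker's coordinates. This is an illustration only, not the EFD code: the function name, the 0.5-degree radius, and the centre coordinates used for Bulgaria are all assumptions.

```python
import random

def jitter(lat, lon, radius_deg=0.5, rng=random):
    """Displace a point by a uniform random offset of at most radius_deg on each axis."""
    return (
        lat + rng.uniform(-radius_deg, radius_deg),
        lon + rng.uniform(-radius_deg, radius_deg),
    )

# Approximate centre of Bulgaria (assumed value, for illustration)
CENTRE = (42.75, 25.5)

rng = random.Random(42)  # seeded so the layout is reproducible between page loads
markers = [jitter(*CENTRE, rng=rng) for _ in range(5)]
```

A fixed seed keeps markers from jumping around on every redraw; a production map might instead derive the offset from the object id.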
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience

Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms

GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on Twitter and Google group (see home page)
Presentations, papers

On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014

External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
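The idea of generating relation axioms from a spreadsheet can be sketched as below. This is not the Getty generator: the two-column sheet layout, the `ex:` prefix, and the relation names are hypothetical; only owl:inverseOf and owl:SymmetricProperty come from the slide.

```python
import csv
import io

# Hypothetical sheet exported as CSV: each relation with the name of its inverse
SHEET = """relation,inverse
teacherOf,studentOf
studentOf,teacherOf
siblingOf,siblingOf
"""

def relation_axioms(sheet_text, prefix="ex:"):
    """Emit one Turtle statement per inverse pair or self-inverse relation."""
    axioms = []
    seen_pairs = set()
    for row in csv.DictReader(io.StringIO(sheet_text)):
        rel, inv = row["relation"], row["inverse"]
        if rel == inv:
            # A relation that is its own inverse is declared symmetric
            axioms.append(f"{prefix}{rel} a owl:SymmetricProperty .")
        elif frozenset((rel, inv)) not in seen_pairs:
            # State each pair once, even if the sheet lists both directions
            seen_pairs.add(frozenset((rel, inv)))
            axioms.append(f"{prefix}{rel} owl:inverseOf {prefix}{inv} .")
    return axioms

axioms = relation_axioms(SHEET)
```

Keeping the relation table in a sheet lets vocabulary editors review it without touching the ontology file; the script only turns rows into statements.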
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now?, J. Cobb 2014. E.g.:

AAT used in the Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after"
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations

In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks

Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:

Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match them to other Person records in the database
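The idea can be sketched as follows: one flat record is expanded into a Person record per relative, each carrying a link back to the source person. This is an illustration, not the EHRI pipeline. The field names follow the example record; the output keys and the two inferences shown (spouse gender, child born after the marriage date) are assumptions, marked as inferred so that later matching can weigh them accordingly.

```python
RELATIVE_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}
OPPOSITE = {"Male": "Female", "Female": "Male"}

def derive_persons(rec):
    """Expand one record into the main person plus one record per named relative."""
    main_id = rec["id"]
    persons = [{
        "id": main_id,
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
    }]
    for field, relation in RELATIVE_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        person = {"id": f"{main_id}/{relation}", "name": name,
                  "relation": relation, "relatedTo": main_id}
        if relation == "spouse":
            if rec.get("nameSpouseMaiden"):
                person["maidenName"] = rec["nameSpouseMaiden"]
            if rec.get("gender") in OPPOSITE:
                # Likely inference for records of this period, not a certainty
                person["genderInferred"] = OPPOSITE[rec["gender"]]
        if relation == "child" and rec.get("dateMarriage"):
            # Likely inference: the child was born after the marriage
            person["bornAfterInferred"] = rec["dateMarriage"]
        persons.append(person)
    return persons

record = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
```

The resulting four records (John, Maria, Mike, Jack) can then be compared against existing Person records by name, dates, and relations.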
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
53 GLOBAL AUTHORITY CONTROL201307 Wikimania 2013201501 (initiated by Ontotext)201503 study for Europeana ofdatasets including PersonOrganization names Conclusions
The best datasets to use for name enrichment are VIAF and WikidataThere are few name forms in common between the library-tradition datasets(dominated by VIAF) and the LOD-tradition datasets (dominated by Wikidata)VIAF has more name variations and permutations Wikidata has more multilingualnames (translations)VIAF is much bigger 35M personsorgs Wikidata has 27M persons and maybe1M orgsOnly 05M of Wikidata personsorgs are coreferenced to VIAF with maybeanother 05M coreferenced to other datasets either VIAF-constituent (eg GND)or non-constituent (eg RKDartists)A lot can be gained by leveraging coreferencing across VIAF and WikidataWikidata has great tools for crowd-sourced coreferencing
Authority Addicts The New Frontier of Authority Control on Wikidata
Wikidata Project Authority ControlName Data Sources for Semantic Enrichment
531 NAMES OF LUCAS CRANACH
in 7 LOD datasets (Wikidata Freebase DBpediaYago VIAF ISNI ULAN)Analyzed records of Lucas Cranach
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals.
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach.
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
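A per-year count of this kind could be sketched roughly as follows. The query text is an assumption for illustration only: the slides do not show the actual query, the use of edm:year is a guess at the modelling, and no newspaper filter is included. counts_by_year is a hypothetical helper that reads the standard SPARQL 1.1 JSON results format; the sample response is hand-made, not real Europeana data.

```python
# Hypothetical query: count records per year. The property edm:year and the
# missing newspaper filter are assumptions, not the query behind the slide.
QUERY = """
PREFIX edm: <http://www.europeana.eu/schemas/edm/>
SELECT ?year (COUNT(?item) AS ?n)
WHERE { ?item edm:year ?year }
GROUP BY ?year
ORDER BY ?year
"""


def counts_by_year(results):
    """Turn SPARQL 1.1 JSON results into a sorted list of (year, count)."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append((binding["year"]["value"], int(binding["n"]["value"])))
    return sorted(rows)


# Hand-made response in the SPARQL JSON results format (not real data).
sample = {
    "head": {"vars": ["year", "n"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "1915"},
         "n": {"type": "literal", "value": "12"}},
        {"year": {"type": "literal", "value": "1914"},
         "n": {"type": "literal", "value": "7"}},
    ]},
}

for year, n in counts_by_year(sample):
    print(year, n)
```

The resulting (year, count) pairs can be fed to any charting tool; the grouping itself is done by the SPARQL endpoint, which is why the chart is easy to produce there.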
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
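Jittering can be sketched as adding a small random offset to each marker. This is a minimal illustration, not the code used in the EFD app: the jitter function, the 0.5 degree maximum offset and the center coordinates are all assumptions.

```python
import random


def jitter(lat, lon, max_offset=0.5, rng=random):
    """Return the point moved by a random offset of at most max_offset degrees."""
    return (lat + rng.uniform(-max_offset, max_offset),
            lon + rng.uniform(-max_offset, max_offset))


# Approximate center of Bulgaria (illustrative coordinates only).
CENTER = (42.7, 25.5)

rng = random.Random(42)  # fixed seed so the marker layout is reproducible
flags = [jitter(*CENTER, max_offset=0.5, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(round(lat, 3), round(lon, 3))
```

A uniform offset in degrees is the simplest choice; a real map would probably scale the offset to the size of the country or to the zoom level.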
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation (V. Alexiev, CIDOC 2014).
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key/val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
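The effect of declaring inverse and symmetric relations can be sketched as follows: every stated relation implies one more statement in the opposite direction. The relation names and the with_inverses helper are hypothetical, chosen for illustration; the real pairs are defined in the GVP ontology, and in a triple store this inference is done by the reasoner rather than by application code.

```python
# Hypothetical relation names; the real pairs come from the GVP ontology.
INVERSE_OF = {"producedBy": "produces", "produces": "producedBy"}
SYMMETRIC = {"relatedTo"}


def with_inverses(triples):
    """Return the triples plus the statements implied by inverse/symmetric props."""
    result = set(triples)
    for s, p, o in triples:
        if p in INVERSE_OF:
            # owl:inverseOf pair: swap subject and object, use the paired property
            result.add((o, INVERSE_OF[p], s))
        elif p in SYMMETRIC:
            # owl:SymmetricProperty: the property is its own inverse
            result.add((o, p, s))
    return result


stated = {("vase", "producedBy", "potter"), ("vase", "relatedTo", "amphora")}
for triple in sorted(with_inverses(stated)):
    print(triple)
```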
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data. Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale, and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
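The idea can be sketched in a few lines: one source record is expanded into several Person records, with a couple of "likely inferences" applied. The Person class, the expand function and the two inference rules (a spouse's birth name built from the maiden name, and the spouse's gender guessed from the main person's gender) are assumptions made for illustration, not the EHRI implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    alt_names: list = field(default_factory=list)
    relations: dict = field(default_factory=dict)  # relation -> related name


RECORD = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Source field -> relation of the mentioned person to the main person.
RELATION_FIELDS = {"nameSpouse": "spouse", "nameChild": "child",
                   "nameSibling": "sibling"}


def expand(record):
    """Create a Person for the main subject and for each person mentioned."""
    main = Person(f"{record['firstName']} {record['lastName']}",
                  record.get("gender"))
    people = [main]
    for source_field, relation in RELATION_FIELDS.items():
        name = record.get(source_field)
        if not name:
            continue
        other = Person(name)
        if relation == "spouse":
            # Likely inferences only; a matcher should treat them as hints.
            if main.gender == "Male":
                other.gender = "Female"
            maiden = record.get("nameSpouseMaiden")
            if maiden:
                other.alt_names.append(f"{name.split()[0]} {maiden}")
        main.relations[relation] = name
        people.append(other)
    return people


for person in expand(RECORD):
    print(person.name, person.gender, person.alt_names)
```

The extra records (and alternative names such as the birth name) give the matching step more to compare against other Person records in the database.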
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
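Why matching also achieves deduplication can be shown with a toy sketch: once spelling variants are normalized and resolved to the same gazetteer id, the source names that share an id are duplicates of each other. The gazetteer, its ids and the helper names are invented for illustration; the real pipeline uses Geonames data and far richer matching than exact comparison of normalized names.

```python
import unicodedata


def normalize(name):
    """Lower-case, strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch for ch in no_accents.lower() if ch.isalnum() or ch == " ")
    return kept.strip()


# Toy gazetteer: id -> known names. The ids are placeholders, not Geonames ids.
GAZETTEER = {
    "G1": ["Terezín", "Theresienstadt"],
    "G2": ["Kraków", "Cracow"],
}


def build_index(gazetteer):
    """Map every normalized name to its gazetteer id."""
    index = {}
    for place_id, names in gazetteer.items():
        for name in names:
            index[normalize(name)] = place_id
    return index


def match_places(source_names, index):
    """Group source names by matched id; names sharing an id are duplicates."""
    matched = {}
    for name in source_names:
        place_id = index.get(normalize(name))
        if place_id is not None:
            matched.setdefault(place_id, []).append(name)
    return matched


index = build_index(GAZETTEER)
print(match_places(["KRAKOW", "Cracow", "Terezin", "Atlantis"], index))
```

Note that accent stripping via Unicode decomposition does not cover every letter (for example the Polish "ł" has no decomposition), so a production matcher needs additional transliteration rules.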
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
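Both the neighbor lists and the "semantic differencing" rest on cosine comparison of word vectors. The sketch below shows the arithmetic on invented 3-dimensional vectors with a textbook example; it is not the INRIA experiment, and real word2vec vectors have hundreds of dimensions learned from the interview texts. The values in the "Cos dist" columns above appear to be cosine similarities (higher means closer), which is what this kind of ranking produces.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors invented for illustration; not trained on any corpus.
VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))
```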
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia; through VIAF id, time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals: 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms; uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms; uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS: POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
532 WIKIDATA COREFERENCING CAN ENLARGE VIAF
533 MIX-N-MATCH
A global Authority on everything librarians dream come true is acollaborative tool to create coreferences 234 authorities including Getty AAT TGNULAN RKD artists works LoC Authorities VIAF (not in M-n-M but on WD) BMpersons BBC YourPaintings Artsy etc etc
Mix-n-Match
5331 YOU CAN HELP WITH AUTHORITIES TOO
Eg checking matches to Getty AAT Single sign-on a click per item Easy
6 LODLAM PROJECTSGLAM and DH projects present a bewildering variety eg
Publishing VocabulariesThesauri as LODPublishing Museum collections and National Bibliographies as LODEnrichment of GLAM metadata with relevant thesauri semantic and faceted searchStudy of artistic in uence over time and spaceLiterary traditions parallel editionsPoetic repertoriesStudying manuscripts stematology (manuscript derivation)HistoriographyStudying charters prosopography (micro biographies) Prosopography is Greek forFacebook 2015SNAPDRGN project
Research functions and sometimes integrated into Virtual Research Environments
61 MELLON SPACE PROJECTSThe Andrew Mellon Foundation funds many projects in CH and DH and a few softwareprojects including
CollectionSpace museum collection managementArchiveSpace archive managementResearchSpace semantic integration based on CIDOC CRM search data amp imageannotation data basket etcConservationSpace line of business application for conservation specialists
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse)
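The pairing of relations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Excel-driven generator: the row layout, the `ex:` prefix and the relation names are hypothetical; only owl:inverseOf and owl:SymmetricProperty come from the slide.

```python
# Hypothetical rows, as they might be exported from a spreadsheet:
# (relation, inverse). A relation listed as its own inverse is symmetric.
ROWS = [
    ("ex:teacherOf", "ex:studentOf"),
    ("ex:relatedTo", "ex:relatedTo"),
]


def relation_axioms(rows):
    """Return Turtle-like statements declaring inverse or symmetric relations."""
    statements = []
    for relation, inverse in rows:
        if relation == inverse:
            # Self-inverse relation: declare it symmetric.
            statements.append(
                f"{relation} a owl:ObjectProperty, owl:SymmetricProperty ."
            )
        else:
            # Declare both directions so either one can be queried.
            statements.append(
                f"{relation} a owl:ObjectProperty ; owl:inverseOf {inverse} ."
            )
            statements.append(
                f"{inverse} a owl:ObjectProperty ; owl:inverseOf {relation} ."
            )
    return statements


if __name__ == "__main__":
    print("\n".join(relation_axioms(ROWS)))
```

Keeping the pairs in one table means a new associative relation is added by editing a row, and both directions are generated together.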
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
"Getty Vocabularies: Linked Open Data. Semantic Representation", Alexiev V., Cobb J., Garcia G., Harpring P., Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many are described in "Getty Vocabs: Why LOD? Why Now?", J.Cobb 2014. E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
"Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure", V.Alexiev, L.Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
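A minimal sketch of the first two steps (creating Person records and attaching likely inferences), using the sample record above. The `Person` class, the relation labels and both inference rules are assumptions made for illustration; they are not the EHRI implementation, and the matching step is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str  # relation to the main person of the source record
    inferred: dict = field(default_factory=dict)


# Field names follow the sample record; the relation labels are assumptions.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec):
    """Create Person records for the relatives mentioned in one source record."""
    persons = []
    for field_name, relation in RELATION_FIELDS.items():
        name = rec.get(field_name)
        if not name:
            continue
        person = Person(name=name, relation=relation)
        if relation == "spouse" and rec.get("nameSpouseMaiden"):
            # Likely inference: the maiden name belongs to the spouse.
            person.inferred["maidenName"] = rec["nameSpouseMaiden"]
        if relation == "child" and rec.get("dateMarriage"):
            # Likely (not certain) inference: the child was born after the marriage.
            person.inferred["bornAfter"] = rec["dateMarriage"]
        persons.append(person)
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for person in derive_persons(record):
        print(person)
```

Keeping inferences in a separate `inferred` dictionary makes it easy to weigh them lower than stated facts when matching against other Person records.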
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

| guard | Cos dist | punishment | Cos dist |
|-------|----------|------------|----------|
| guarding | 0.593507 | punishments | 0.668144 |
| sentry | 0.512083 | punish | 0.601212 |
| hlinka | 0.496201 | punishing | 0.543213 |
| gate | 0.490032 | beatings | 0.527033 |
| watching | 0.484647 | penalty | 0.497262 |
| rifle | 0.484379 | deserved | 0.490157 |
| lookout | 0.482025 | beaten | 0.473870 |
| patrol | 0.477233 | straf | 0.473338 |
| soldier | 0.475982 | offense | 0.461230 |
| guarded | 0.474689 | executing | 0.459965 |
| police | 0.474291 | merciless | 0.455123 |

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
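The arithmetic behind such a result can be sketched as follows: subtract and add word vectors, then pick the remaining word with the highest cosine score against the result. The 3-dimensional vectors below are invented toy values chosen so the example works; real word2vec vectors have hundreds of dimensions and are learned from the interview text.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, invented for illustration only.
TOY = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [1.0, 0.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [0.0, 1.0, 1.0],
    "gate":   [0.5, 0.1, 0.1],
}

if __name__ == "__main__":
    print(analogy(TOY, "KGB", "Stalin", "Hitler"))  # prints SS
```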
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. See Blog, Wiki
5.3.3 MIX-N-MATCH
A global Authority on everything: a librarian's dream come true. Mix-n-Match is a collaborative tool to create coreferences. 234 authorities, including Getty AAT, TGN, ULAN, RKD artists & works, LoC Authorities, VIAF (not in M-n-M but on WD), BM persons, BBC YourPaintings, Artsy, etc.
5.3.3.1 YOU CAN HELP WITH AUTHORITIES TOO
E.g. checking matches to Getty AAT. Single sign-on, a click per item. Easy!
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts; stemmatology (manuscript derivation)
Historiography
Studying charters; prosopography (micro biographies). "Prosopography is Greek for Facebook" (SNAPDRGN project, 2015)
Research functions are sometimes integrated into Virtual Research Environments
6.1 MELLON "SPACE" PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line of business application for conservation specialists
6.2 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013)
Semantic Search
6.2.1 RESEARCHSPACE SEARCH
Powerful and precise search: Drawings by Rembrandt that are about Mammals
6.2.2 RESEARCHSPACE SEARCH: FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
6.2.3 RESEARCHSPACE SEARCH: ONE FR (THING FROM PLACE)
6.2.4 RESEARCHSPACE SEARCH: IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output)
6.2.5 RESEARCHSPACE SEARCH: NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D.Oldman)
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but monolithic and has imprecisions
Includes the (in)famous diagram
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL
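What SPARQL makes easy here is a GROUP BY with COUNT. A minimal local sketch of the same aggregation is shown in Python; the sample records and the field names `type` and `year` are invented for illustration, and the real data would come from the Europeana SPARQL endpoint.

```python
from collections import Counter

# Invented sample records; "type" and "year" are hypothetical field names.
RECORDS = [
    {"type": "newspaper", "year": 1914},
    {"type": "newspaper", "year": 1914},
    {"type": "newspaper", "year": 1915},
    {"type": "photograph", "year": 1914},
]


def count_by_year(records, wanted_type):
    """Count records of one type per year, sorted by year.

    Mirrors a query of the shape:
    SELECT ?year (COUNT(*) AS ?n) ... GROUP BY ?year ORDER BY ?year
    """
    counts = Counter(r["year"] for r in records if r["type"] == wanted_type)
    return sorted(counts.items())


if __name__ == "__main__":
    print(count_by_year(RECORDS, "newspaper"))  # [(1914, 2), (1915, 1)]
```

The resulting (year, count) pairs are exactly what a bar chart of newspapers by year needs as input.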
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and ElasticSearch enterprise connector
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
"Domain-specific modeling: Towards a Food and Drink Gazetteer", Tagarev A., Tolosi L. and Alexiev V., LNCS 9398, p.182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
"Using DBPedia in Europeana Food and Drink", Alexiev V., DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
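A hierarchical facet can be sketched as a roll-up: each object is counted under its own place and under every ancestor of that place. The child-to-parent table below is a tiny hand-made stand-in for the Geonames hierarchy, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical child -> parent links, in the spirit of the Geonames hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Europe": None,
}


def facet_counts(object_places, parent):
    """Count each object under its place and under every ancestor of that place."""
    counts = Counter()
    for place in object_places:
        node = place
        while node is not None:
            counts[node] += 1
            node = parent.get(node)
    return counts


if __name__ == "__main__":
    counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Bulgaria"], PARENT)
    print(counts["Sofia"], counts["Bulgaria"], counts["Europe"])  # 2 4 4
```

With such counts, selecting "Bulgaria" in the facet also returns objects tagged only with a city inside it.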
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
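The idea of marker clustering can be illustrated with a simple grid: points falling into the same cell are shown as one marker with a count. This is a generic sketch, not the algorithm of the library used in the project; the cell size and the sample coordinates are assumptions.

```python
from collections import defaultdict


def grid_clusters(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells.

    Returns a list of (centroid, size) pairs, one per non-empty cell.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        centroid = (
            sum(p[0] for p in members) / n,
            sum(p[1] for p in members) / n,
        )
        clusters.append((centroid, n))
    return clusters


if __name__ == "__main__":
    # Two points near Sofia and one near Varna (approximate coordinates).
    points = [(42.70, 23.32), (42.65, 23.38), (43.21, 27.91)]
    for centroid, size in grid_clusters(points):
        print(centroid, size)
```

A map library would typically recompute the clusters with a smaller cell size as the user zooms in.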
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up
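Jittering can be sketched as adding a small random offset to each object's coordinates. The offset size, the centre coordinates and the function name are assumptions for illustration; the project's actual jitter parameters are not given in the slides.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most max_offset_deg per axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate centre of Bulgaria, shared by every object marked only "Bulgaria".
CENTER = (42.7, 25.5)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded so the sketch is reproducible
    flags = [jitter(*CENTER, rng=rng) for _ in range(5)]
    for flag in flags:
        print(flag)
```

A uniform offset is the simplest choice; it spreads the flags over a square around the centre rather than following the country's outline.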
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM (see GVP Training Materials)
Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (diagram by J.Cobb, 2014)
6 LODLAM PROJECTS
GLAM and DH projects present a bewildering variety, e.g.:
Publishing Vocabularies/Thesauri as LOD
Publishing Museum collections and National Bibliographies as LOD
Enrichment of GLAM metadata with relevant thesauri; semantic and faceted search
Study of artistic influence over time and space
Literary traditions, parallel editions
Poetic repertories
Studying manuscripts, stemmatology (manuscript derivation)
Historiography
Studying charters, prosopography (micro biographies). "Prosopography is Greek for Facebook" (2015, SNAP:DRGN project)
Research functions are sometimes integrated into Virtual Research Environments.
61 MELLON SPACE PROJECTS
The Andrew Mellon Foundation funds many projects in CH and DH, and a few software projects, including:
CollectionSpace: museum collection management
ArchivesSpace: archive management
ResearchSpace: semantic integration based on CIDOC CRM; search, data & image annotation, data basket, etc.
ConservationSpace: line-of-business application for conservation specialists
62 RESEARCHSPACE
Executed by the British Museum. Ontotext developed the first prototype (2010-2013).
Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. "Drawings by Rembrandt that are about Mammals".
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven together using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (D. Oldman)
Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples)
As part of RS, developed the mapping of BM data (2M objects) together with BM, using CIDOC CRM
This mapping was followed by the Yale Center for British Art (YCBA)
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
651 EUROPEANA STATISTICS
E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
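To give a feel for why this kind of chart is easy with SPARQL, here is a minimal sketch that builds a "count per year" aggregate query and tallies the result rows. The property names (dc:type, dc:date), the "newspaper" literal and the sample rows are assumptions for illustration only; this is not the actual Europeana data model nor the query behind the chart.

```python
from collections import Counter

# Hypothetical aggregate query: count items of one type per year.
# dc:type / dc:date and the literal value are illustrative assumptions.
QUERY_TEMPLATE = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?count) WHERE {{
  ?item dc:type "{doc_type}" ;
        dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}}
GROUP BY ?year
ORDER BY ?year
"""


def build_year_histogram_query(doc_type: str) -> str:
    """Return a SPARQL query string counting items of doc_type per year."""
    return QUERY_TEMPLATE.format(doc_type=doc_type)


def rows_to_histogram(bindings):
    """Turn SPARQL JSON result bindings into a {year: count} mapping."""
    hist = Counter()
    for row in bindings:
        hist[row["year"]["value"]] += int(row["count"]["value"])
    return dict(hist)


# Made-up sample rows in the SPARQL 1.1 JSON results shape.
sample = [
    {"year": {"value": "1914"}, "count": {"value": "120"}},
    {"year": {"value": "1915"}, "count": {"value": "95"}},
]
print(build_year_histogram_query("newspaper"))
print(rows_to_histogram(sample))
```

The resulting mapping can be handed to any charting library; the grouping and counting happen on the SPARQL endpoint, which is the part a record-oriented API typically cannot do.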
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic), open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract an FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V., DBpedia meeting, February 2016
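The idea of walking a category tree and pruning irrelevant branches can be sketched as a breadth-first traversal with an exclusion list and a depth limit. This is only an illustration under stated assumptions: the function name, the toy category names and the pruning criterion (a hand-picked set) are hypothetical, and the papers above describe the actual method.

```python
from collections import deque


def collect_categories(subcats, root, pruned=frozenset(), max_depth=3):
    """Breadth-first walk of a category graph starting at root.

    subcats maps a category to its direct sub-categories.
    Categories in pruned are skipped, together with everything that is
    only reachable through them. Returns {category: depth}.
    """
    seen = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        depth = seen[cat]
        if depth == max_depth:
            continue  # do not descend below the depth limit
        for child in subcats.get(cat, ()):
            if child in pruned or child in seen:
                continue
            seen[child] = depth + 1
            queue.append(child)
    return seen


# Toy graph with made-up category names (not real Wikipedia data).
toy = {
    "Food and drink": ["Beverages", "Cuisine", "Food companies"],
    "Beverages": ["Beer", "Tea"],
    "Food companies": ["Company founders"],
    "Cuisine": ["Bulgarian cuisine"],
}
result = collect_categories(toy, "Food and drink", pruned={"Food companies"})
print(sorted(result))
```

Pruning "Food companies" also drops "Company founders", which shows why pruning near the root matters: one off-topic branch can otherwise pull in many unrelated entries.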
665 EFD ENRICHMENT FRENCH
Selected French as the second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
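A hierarchical facet boils down to counting each object once for its own place and once for every ancestor of that place. The sketch below shows that roll-up on a tiny made-up hierarchy; the parent links and function names are hypothetical stand-ins, not the Geonames data or the EFD implementation.

```python
from collections import Counter

# Made-up parent links standing in for a Geonames-like place hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Paris": "France",
    "France": "Europe",
}


def ancestors_and_self(place, parent=PARENT):
    """Yield the place and each ancestor up to the top of the hierarchy."""
    while place is not None:
        yield place
        place = parent.get(place)


def facet_counts(object_places, parent=PARENT):
    """Count objects per place, rolling counts up to every ancestor."""
    counts = Counter()
    for place in object_places:
        for node in ancestors_and_self(place, parent):
            counts[node] += 1
    return counts


counts = facet_counts(["Sofia", "Sofia", "Plovdiv", "Paris", "Bulgaria"])
print(counts["Bulgaria"], counts["Europe"])
```

With these counts a user interface can show "Europe" at the top level and let the user drill down to a country and then a city.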
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
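Jittering can be as simple as adding a small random offset to each coordinate. The following is a minimal sketch, assuming a uniform offset in degrees; the function name, the offset size and the centre coordinates are illustrative assumptions, not the values used in the EFD application.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return (lat, lon) moved by a uniform random offset on each axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate centre of Bulgaria, used only as an illustrative value.
CENTER = (42.7, 25.5)
rng = random.Random(42)  # seeded so the sketch is reproducible
points = [jitter(*CENTER, rng=rng) for _ in range(5)]
for point in points:
    print(point)
```

A fixed offset in degrees ignores the country's shape, so for small or elongated countries some markers may land across the border; a real implementation might clip to the country's bounding box.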
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies

Prefix  Ontology                              Used for
bibo    Bibliography Ontology                 Sources
dc      Dublin Core Elements                  common
dct     Dublin Core Terms                     common
foaf    Friend of a Friend ontology           Contributors
iso     ISO 25964 (latest on thesauri)        iso:ThesaurusArray, BTG/BTP/BTI
owl     Web Ontology Language                 Basic RDF representation
prov    Provenance Ontology                   Revision history
rdf     Resource Description Framework        Basic RDF representation
rdfs    RDF Schema                            Basic RDF representation
schema  Schema.org                            common, geo (TGN), bio (ULAN)
skos    Simple Knowledge Organization System  Basic vocabulary representation
skosxl  SKOS Extension for Labels             Rich labels
wgs     W3C World Geodetic Survey geo         Geo (TGN)
xsd     XML Schema Datatypes                  Basic RDF representation
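For readers who want to use these prefixes in their own queries, here is a small sketch that maps a subset of them to their namespace URIs, expands prefixed names, and emits a SPARQL prologue. The helper names (expand, sparql_prologue) are hypothetical; the namespace URIs are the standard published ones for each vocabulary.

```python
# Namespace URIs for a subset of the prefixes listed above.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "prov": "http://www.w3.org/ns/prov#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'skos:prefLabel' to a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


def sparql_prologue(prefixes=PREFIXES):
    """Return PREFIX declarations suitable for the top of a SPARQL query."""
    return "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in sorted(prefixes.items()))


print(expand("skos:prefLabel"))
print(sparql_prologue())
```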
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
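As a rough sketch of what table-driven generation of such relations could look like, the code below turns rows of (relation, inverse) names into Turtle property declarations, treating a relation that is its own inverse as symmetric. The row contents, the "ex" prefix and the function name are invented for illustration; they are not the GVP relation names or the actual generator, and prefix declarations are assumed to be emitted elsewhere.

```python
# Each row names a relation and its inverse, as it might appear in a
# spreadsheet. A relation listed as its own inverse is treated as symmetric.
# The names are invented for illustration.
ROWS = [
    ("teacher_of", "student_of"),
    ("related_to", "related_to"),
]


def relations_to_turtle(rows, prefix="ex"):
    """Emit Turtle declarations for inverse pairs and symmetric relations."""
    lines = []
    for rel, inv in rows:
        if rel == inv:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty, owl:SymmetricProperty ."
            )
        else:
            lines.append(
                f"{prefix}:{rel} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inv} ."
            )
            lines.append(
                f"{prefix}:{inv} a owl:ObjectProperty ; owl:inverseOf {prefix}:{rel} ."
            )
    return "\n".join(lines)


ttl = relations_to_turtle(ROWS)
print(ttl)
```

Keeping the pairs in one table means the two directions cannot drift apart, which is the main appeal of generating the ontology rather than writing it by hand.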
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015), it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
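The step "create Person records for the people mentioned" can be sketched as follows, using the sample record above. This is a minimal illustration, not the EHRI implementation: the output structure, the relation names (spouseOf, childOf, siblingOf) and the derived ids are assumptions, and the later matching step is left out.

```python
def derive_persons(rec):
    """Create one Person dict per individual mentioned in a record.

    Returns (persons, relations), where each relation is a triple
    (other_person_id, relation_name, main_person_id).
    """
    main = {
        "id": f"{rec['id']}/main",
        "name": f"{rec['firstName']} {rec['lastName']}",
        "gender": rec.get("gender"),
    }
    persons = [main]
    relations = []
    # Hypothetical mapping from record fields to relation names.
    field_to_relation = {
        "nameSpouse": "spouseOf",
        "nameChild": "childOf",
        "nameSibling": "siblingOf",
    }
    for field, relation in field_to_relation.items():
        name = rec.get(field)
        if not name:
            continue
        person = {"id": f"{rec['id']}/{field}", "name": name}
        # A likely inference: the maiden name belongs to the spouse.
        if field == "nameSpouse" and rec.get("nameSpouseMaiden"):
            person["maidenName"] = rec["nameSpouseMaiden"]
        persons.append(person)
        relations.append((person["id"], relation, main["id"]))
    return persons, relations


record = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons, relations = derive_persons(record)
for person in persons:
    print(person)
```

Each derived Person can then be compared with existing Person records, and a confirmed match links two families or households into one network.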
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
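How matching against a gazetteer also yields deduplication can be shown with a toy example: spelling variants that normalize to the same key end up under the same place id. The normalization rule, the tiny gazetteer and its ids are placeholders for illustration; the real pipeline is not described in these slides and is certainly more involved.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Casefold, strip diacritics and squeeze whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


# Tiny stand-in gazetteer: the ids are placeholders, not real Geonames ids.
GAZETTEER = {"krakow": "gn:1", "poznan": "gn:2"}


def match_places(source_names, gazetteer=GAZETTEER):
    """Group source names by matched place id; collect the unmatched ones."""
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        place_id = gazetteer.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched[place_id].append(name)
    return dict(matched), unmatched


matched, unmatched = match_places(["Kraków", "KRAKOW", " Poznań", "Atlantis"])
print(matched)
print(unmatched)
```

Note that this normalization only removes combining marks; letters without a Unicode decomposition (for example the Polish ł) would need an explicit mapping, and historical name variants need alternate-name lists rather than string cleanup.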
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard     Cos dist  punishment   Cos dist
guarding  0.593507  punishments  0.668144
sentry    0.512083  punish       0.601212
hlinka    0.496201  punishing    0.543213
gate      0.490032  beatings     0.527033
watching  0.484647  penalty      0.497262
rifle     0.484379  deserved     0.490157
lookout   0.482025  beaten       0.473870
patrol    0.477233  straf        0.473338
soldier   0.475982  offense      0.461230
guarded   0.474689  executing    0.459965
police    0.474291  merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
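Both the neighbour lists and the "KGB - Stalin + Hitler" arithmetic rest on cosine similarity between word vectors. The sketch below shows the mechanics on hand-made 3-dimensional toy vectors chosen so that the analogy works; they are not word2vec output, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors, hand-made for illustration only.
# Dimensions loosely mean: (organization, Soviet, Nazi).
TOY = {
    "KGB": (1.0, 1.0, 0.0),
    "Stalin": (0.0, 1.0, 0.0),
    "Hitler": (0.0, 0.0, 1.0),
    "SS": (1.0, 0.0, 1.0),
    "NKVD": (1.0, 1.0, 0.2),
}
print(analogy(TOY, "KGB", "Stalin", "Hitler"))
```

In a real experiment the vectors have hundreds of dimensions learned from the interview corpus, and the nearest-neighbour search runs over the whole vocabulary.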
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
62 RESEARCHSPACEExecuted by the British Museum Ontotext developed the rst prototype (2010-2013)Semantic Search
621 RESEARCHSPACE SEARCH
Powerful and precise search Drawings by Rembrandt that are about Mammals
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
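The idea above can be sketched roughly as follows, assuming the record arrives as a flat dictionary with the field names shown on the slide. The id scheme, the ROLE_FIELDS mapping and the inference rules are illustrative assumptions, not the EHRI implementation. Matching against other Person records is left out.

```python
# Hypothetical mapping from "additional name" fields to relation roles.
ROLE_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}

EXAMPLE_REC = {
    "id": "123456", "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}


def persons_from_record(rec):
    """Split one source record into a main Person plus one Person per mentioned relative."""
    main = {
        "id": f"{rec['id']}/main",
        "name": f"{rec['firstName']} {rec['lastName']}",
        "gender": rec.get("gender"),
    }
    people, relations = [main], []
    for field, role in ROLE_FIELDS.items():
        if field not in rec:
            continue
        person = {"id": f"{rec['id']}/{role}", "name": rec[field], "gender": None}
        if role == "spouse":
            # Likely inferences only; they should be flagged as uncertain downstream.
            if main["gender"] in ("Male", "Female"):
                person["gender"] = "Female" if main["gender"] == "Male" else "Male"
            if "nameSpouseMaiden" in rec:
                person["maidenName"] = rec["nameSpouseMaiden"]
        people.append(person)
        relations.append((main["id"], role, person["id"]))
    return people, relations


if __name__ == "__main__":
    people, relations = persons_from_record(EXAMPLE_REC)
    for p in people:
        print(p)
    for r in relations:
        print(r)
```

Each derived Person can then be compared with existing Person records, e.g. by name and dates, to build the network.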
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
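A toy sketch of the two operations behind the table and the "semantic differencing" example: cosine similarity between word vectors (the "Cos dist" column appears to list similarity, higher meaning closer) and vector arithmetic followed by a nearest-neighbour lookup. The 3-dimensional vectors are invented for illustration; real word2vec vectors are learned from the interview text and have hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors: (soviet, german, is_organization).
TOY_VECTORS = {
    "Stalin": (1.0, 0.0, 0.0),
    "Hitler": (0.0, 1.0, 0.0),
    "KGB": (1.0, 0.0, 1.0),
    "NKVD": (1.0, 0.0, 1.0),
    "SS": (0.0, 1.0, 1.0),
}

if __name__ == "__main__":
    print(analogy(TOY_VECTORS, "KGB", "Stalin", "Hitler"))
```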
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames, so we can get coordinates.
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id.
Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
621 RESEARCHSPACE SEARCH
Powerful and precise search, e.g. Drawings by Rembrandt that are about Mammals.
622 RESEARCHSPACE SEARCH FUNDAMENTAL RELATIONS
First implementation experience of the CIDOC CRM Fundamental Relations approach
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work.) Watch the video (D. Oldman).
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
64 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
65 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
651 EUROPEANA STATISTICS
E.g. chart of newspapers (several million) by year: can't do this using the Europeana API, but it is easy with SPARQL.
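The chart boils down to a group-and-count over the whole dataset, which in SPARQL is a COUNT with GROUP BY on the year. A rough Python equivalent over a handful of invented records shows the shape of the result. The record fields (type, year) are assumptions for illustration, not the Europeana data model.

```python
from collections import Counter


def count_by_year(records, wanted_type="newspaper"):
    """Count records of one type per year, sorted by year (records without a year are skipped)."""
    counts = Counter(
        r["year"] for r in records
        if r.get("type") == wanted_type and r.get("year") is not None
    )
    return sorted(counts.items())


if __name__ == "__main__":
    sample = [
        {"type": "newspaper", "year": 1914},
        {"type": "newspaper", "year": 1914},
        {"type": "newspaper", "year": 1915},
        {"type": "painting", "year": 1914},
        {"type": "newspaper", "year": None},
    ]
    for year, n in count_by_year(sample):
        print(year, n)
```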
66 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint, open source (Github). Uses GraphDB and the ElasticSearch enterprise connector.
661 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
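Jittering can be sketched as adding a small random offset to each marker's coordinates, so that objects sharing one place no longer stack on a single point. The offset size and the rough centre coordinates used for Bulgaria are assumptions for illustration, not the values used in the EFD app.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return (lat, lon) moved by a uniform random offset of at most max_offset_deg per axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


if __name__ == "__main__":
    rng = random.Random(42)       # fixed seed so the layout is reproducible
    centre = (42.7, 25.5)         # rough centre of Bulgaria (assumed value)
    for _ in range(3):
        print(jitter(*centre, max_offset_deg=0.5, rng=rng))
```

Passing a seeded random.Random keeps marker positions stable between page loads.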
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
67 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (Diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See: GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014.
External Ontologies
623 RESEARCHSPACE SEARCH ONE FR (THING FROM PLACE)
624 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules weaved using Literate Programming approach Inferencedependencies between props (text=input gray=intermediate white=output)
625 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (DOldman)Watch the video
626 RESEARCHSPACE DATA ANNOTATION
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes.
Complete mapping specification.
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn).
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.).
GraphDB semantic repo, clustered for high availability.
Semantic application development (customized Forest user interface) and tech consulting.
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD.
Help desk support on twitter and google group (see home page).
Presentations, papers.
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See V. Alexiev, CIDOC 2014: GVP LOD Ontologies and Semantic Representation.
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic System geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™.
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
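The inverse/symmetric pairing can be illustrated with a minimal sketch. The relation names below (`ex:teacherOf`, `ex:studentOf`, `ex:relatedTo`) are hypothetical and are not actual GVP properties; in a real repository this closure would normally be computed by the triple store's inference rules rather than by application code.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relation names, for illustration only (not actual GVP properties).
INVERSE_OF: Dict[str, str] = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC: Set[str] = {"ex:relatedTo"}


def add_inverses(triples: Set[Triple]) -> Set[Triple]:
    """Return the input triples plus those implied by inverse/symmetric declarations."""
    # owl:inverseOf is itself symmetric, so make the lookup table bidirectional.
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))  # inverse pair
        if p in SYMMETRIC:
            result.add((o, p, s))           # self-inverse relation
    return result


asserted = {
    ("ex:a", "ex:teacherOf", "ex:b"),
    ("ex:c", "ex:relatedTo", "ex:d"),
}
closed = add_inverses(asserted)
```

Asserting only one direction of each relation and deriving the other keeps the source data small while letting queries follow a relation from either end.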
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in "Getty Vocabs: Why LOD? Why Now?" (J. Cobb, 2014).
E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale, and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
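A minimal sketch of the first step, deriving Person records from the names mentioned in one source record. The field names follow the sample record; the `Person` class, the `derive_persons` helper and the single inference shown (the maiden name is taken as the spouse's birth surname) are assumptions of this sketch, not the EHRI implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Person:
    name: str
    relation: str                        # relation to the record's main person
    source_record: str                   # provenance: which record mentioned this person
    birth_surname: Optional[str] = None  # filled only when it can be inferred


# Which record fields mention related people (field names as in the sample record).
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def derive_persons(rec_id: str, rec: Dict[str, str]) -> List[Person]:
    """Create one Person per name mentioned in the record."""
    persons = [Person(f'{rec["firstName"]} {rec["lastName"]}', "self", rec_id)]
    for fld, relation in RELATION_FIELDS.items():
        if fld in rec:
            persons.append(Person(rec[fld], relation, rec_id))
    # Likely inference (assumed here): the maiden name is the spouse's birth surname.
    maiden = rec.get("nameSpouseMaiden")
    if maiden:
        for p in persons:
            if p.relation == "spouse":
                p.birth_surname = maiden
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = derive_persons("123456", record)
```

Each derived record keeps the source record id, so a later match against other Person records can be traced back to the evidence that produced it.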
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews.
ONTO: place enrichment, person name recognition.
INRIA: word2vec experiments.
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
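The two operations behind these results, nearest neighbours by cosine similarity and vector arithmetic on word embeddings, can be sketched in a few lines. The vectors below are made-up 5-dimensional toy values chosen so the arithmetic works out; they are not taken from the word2vec model trained on the interviews, and the helper names are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (assumed values). Dimensions, loosely:
# soviet, german, leader, organization, guarding.
TOY: Dict[str, List[float]] = {
    "KGB":    [1, 0, 0, 1, 0],
    "Stalin": [1, 0, 1, 0, 0],
    "Hitler": [0, 1, 1, 0, 0],
    "SS":     [0, 1, 0, 1, 0],
    "guard":  [0, 0, 0, 0.2, 1],
    "sentry": [0, 0, 0, 0.1, 1],
}


def nearest(vectors: Dict[str, List[float]], query: Sequence[float],
            exclude: Iterable[str] = ()) -> List[Tuple[str, float]]:
    """Rank all words (except excluded ones) by similarity to the query vector."""
    skip = set(exclude)
    scored = [(w, cosine(v, query)) for w, v in vectors.items() if w not in skip]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def analogy(vectors: Dict[str, List[float]], a: str, b: str, c: str) -> str:
    """Return the word closest to vec(a) - vec(b) + vec(c), ignoring the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(vectors, query, exclude={a, b, c})[0][0]


closest_to_guard = nearest(TOY, TOY["guard"], exclude={"guard"})[0][0]
result = analogy(TOY, "KGB", "Stalin", "Hitler")
```

The input words are excluded from the ranking because the query vector usually stays closest to one of its own operands.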
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia, through VIAF id.
Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki.
6.2.4 RESEARCHSPACE SEARCH IMPLEMENTATION
120 GraphDB rules, woven using a Literate Programming approach. Inference dependencies between props (text=input, gray=intermediate, white=output).
6.2.5 RESEARCHSPACE SEARCH NEW IMPLEMENTATION
(Not Ontotext work) (D. Oldman). Watch the video.
6.2.6 RESEARCHSPACE DATA ANNOTATION
6.2.7 RESEARCHSPACE DATA ANNOTATION MODEL
6.2.8 IMAGE ANNOTATION
6.2.9 IMAGE ANNOTATION MODEL
6.2.10 IMAGE ANNOTATION ARCHITECTURE
6.3 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM, using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
6.4 CONSERVATIONSPACE
Executed by a consortium led by the US National Gallery of Art. Developed by Sirma ITT (Ontotext sibling). Based on Ontotext GraphDB (semantic metadata), Alfresco (document management), Smart Documents (Sirma product).
6.5 EUROPEANA LOD AND OAI PMH
Ontotext created and hosted the Europeana SPARQL and OAI PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. chart of newspapers (several millions) by year: can't do this using the Europeana API, but it is easy with SPARQL.
6.6 EUROPEANA FOOD AND DRINK
EFD Semantic App: Food & Drink content, semantically enriched (place and FD topic); open data, SPARQL endpoint; open source (Github). Uses GraphDB and ElasticSearch enterprise connector.
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in-between.
6.6.3 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
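One way to picture extracting a gazetteer from a category hierarchy is a breadth-first walk from a root category with a depth cut-off, so that distant and increasingly off-topic subcategories are left out. This is only an illustrative sketch: the category graph below is made up, and both the depth limit and the `collect_categories` helper are assumptions, not the method described in the paper.

```python
from collections import deque
from typing import Dict, List

# Toy category graph, parent -> subcategories (hypothetical, not real Wikipedia data).
SUBCATS: Dict[str, List[str]] = {
    "Food and drink": ["Beverages", "Cuisine"],
    "Beverages": ["Beer"],
    "Cuisine": ["Bulgarian cuisine"],
    "Beer": ["Beer festivals"],
    "Beer festivals": ["Music festivals"],  # topic drift deeper in the tree
}


def collect_categories(root: str, subcats: Dict[str, List[str]],
                       max_depth: int) -> Dict[str, int]:
    """Breadth-first walk; returns each reached category with its depth."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        if depth[cat] == max_depth:
            continue  # do not expand below the cut-off level
        for child in subcats.get(cat, []):
            if child not in depth:  # category graphs can contain cycles
                depth[child] = depth[cat] + 1
                queue.append(child)
    return depth


gazetteer_cats = collect_categories("Food and drink", SUBCATS, max_depth=2)
```

Recording the depth of each category makes it possible to inspect how relevance changes by level before deciding where to cut.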
6.6.4 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering: category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
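To give an idea of what marker clustering does, here is a minimal grid-based sketch: points falling into the same latitude/longitude cell are replaced by one marker at their centroid, with a count. This is a generic technique shown under stated assumptions, not the algorithm of the library used in the project; the function name, the cell size and the approximate city coordinates are illustrative.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def grid_cluster(points: List[Point], cell_deg: float) -> List[Tuple[float, float, int]]:
    """Group points by grid cell; return (centroid_lat, centroid_lon, count) per cell."""
    cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((
            sum(p[0] for p in members) / n,
            sum(p[1] for p in members) / n,
            n,
        ))
    return clusters


# Approximate coordinates: two objects near Sofia, one in Plovdiv, one in Varna.
objects = [(42.70, 23.32), (42.69, 23.33), (42.14, 24.75), (43.21, 27.91)]
clusters = grid_cluster(objects, cell_deg=1.0)
```

A map client would typically recompute the clusters with a smaller cell size as the user zooms in.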
6.6.9 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
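Jittering can be sketched as adding a small random offset to each marker, here drawn uniformly from a disc around the shared coordinate. The centre point, the radius and the fixed seed are assumptions chosen for illustration; the offset is applied in degree space, which is adequate for display purposes but is not a true distance on the ground.

```python
import math
import random
from typing import List, Tuple

CENTER = (42.7, 25.5)   # rough centre of Bulgaria (assumed for the example)
RADIUS_DEG = 0.5        # maximum displacement in degrees (assumed)


def jitter(lat: float, lon: float, radius_deg: float,
           rng: random.Random) -> Tuple[float, float]:
    """Move a point by a random offset drawn uniformly from a disc."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # sqrt keeps the density uniform over the disc instead of piling up at the centre
    r = radius_deg * math.sqrt(rng.random())
    return lat + r * math.sin(angle), lon + r * math.cos(angle)


rng = random.Random(42)  # fixed seed so marker positions are stable between renders
jittered: List[Tuple[float, float]] = [
    jitter(CENTER[0], CENTER[1], RADIUS_DEG, rng) for _ in range(9)
]
```

Seeding the generator (or deriving the offset from the object id) keeps each flag in the same place every time the map is drawn.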
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
Easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LOD
GVP is well known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA. Center of the LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on Twitter and Google group (see home page)
Presentations, papers
Alexiev V., Lindenthal J. and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer.
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See V. Alexiev, GVP LOD: Ontologies and Semantic Representation, CIDOC 2014.
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to a custom sub-class, a custom (sub-)property, or an ontology value (e.g. <term/kind/Abbreviation>).
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
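What inverse and symmetric declarations buy you can be shown with a small Python sketch. It is only an illustration of the inference, not how GraphDB or the GVP ontology implements it, and the property and concept names (ex:usedFor, ex:requires, ex:relatedTo, ex:Chisel and so on) are hypothetical.

```python
# Each property maps to its inverse. A symmetric property maps to itself.
# All names below are made up for the example.
INVERSE_OF = {
    "ex:usedFor": "ex:requires",      # an owl:inverseOf pair
    "ex:relatedTo": "ex:relatedTo",   # an owl:SymmetricProperty (self-inverse)
}


def add_inverses(triples, inverse_of):
    """Return the triples plus every statement implied by the inverse map."""
    # Make the map usable in both directions: if p is the inverse of q,
    # then q is also the inverse of p.
    both_ways = dict(inverse_of)
    for prop, inverse in inverse_of.items():
        both_ways.setdefault(inverse, prop)

    closed = set(triples)
    for subject, prop, obj in triples:
        if prop in both_ways:
            closed.add((obj, both_ways[prop], subject))
    return closed


asserted = [
    ("ex:Chisel", "ex:usedFor", "ex:Carving"),
    ("ex:Etching", "ex:relatedTo", "ex:Engraving"),
]
inferred = add_inverses(asserted, INVERSE_OF)
```

Only one direction of each relation has to be stored or edited. The other direction follows from the declaration, so queries can start from either concept.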
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies: Linked Open Data. Semantic Representation. Getty Research Institute, 3.2 edition, March 2015.
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in J. Cobb, Getty Vocabs: Why LOD? Why Now? (2014).
E.g. AAT is used in the Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
Europeana uses AAT to enrich type/subject/material fields.
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
V. Alexiev, L. Brazzo. Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones".
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
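The first step, turning one flat record into several linked Person records, could look roughly like the Python sketch below. The record layout follows the example above; the output structure, the derived keys such as "123456/spouse" and the choice of which inferences to draw are assumptions made for illustration, not the EHRI implementation.

```python
record = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Which "additional name" field implies which relation to the main person.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}


def expand(rec):
    """Split one record into Person records plus relations between them."""
    main = {
        "key": rec["id"],
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
    }
    persons, relations = [main], []
    for field, relation in RELATION_FIELDS.items():
        name = rec.get(field)
        if not name:
            continue
        person = {"key": f'{rec["id"]}/{relation}', "name": name}
        if relation == "spouse":
            # Likely inferences: the maiden name and the marriage date
            # recorded on the main record also describe the spouse.
            person["maidenName"] = rec.get("nameSpouseMaiden")
            person["dateMarriage"] = rec.get("dateMarriage")
        persons.append(person)
        relations.append((main["key"], relation, person["key"]))
    return persons, relations


persons, relations = expand(record)
```

Each derived Person then becomes a candidate for matching against the rest of the database, for instance a separate record for "Maria Matienzo" created before her marriage.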
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
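How matching also yields deduplication can be seen in a toy Python sketch. It uses accent stripping plus fuzzy string matching from the standard library, which is a stand-in technique and not the EHRI pipeline. The gazetteer entries and their identifiers (gn:0001 and so on) are made-up placeholders, not real Geonames IDs.

```python
import difflib
import unicodedata

# Toy gazetteer: normalised name -> identifier (placeholder IDs).
GAZETTEER = {"warsaw": "gn:0001", "krakow": "gn:0002", "vilnius": "gn:0003"}


def normalise(name):
    """Strip accents, case and surrounding spaces so variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_marks.casefold().strip()


def match_place(name, cutoff=0.85):
    """Return the gazetteer ID for a place name, or None if nothing is close."""
    key = normalise(name)
    if key in GAZETTEER:
        return GAZETTEER[key]
    close = difflib.get_close_matches(key, list(GAZETTEER), n=1, cutoff=cutoff)
    return GAZETTEER[close[0]] if close else None


def deduplicate(names):
    """Group source names by matched ID; names sharing an ID are duplicates."""
    groups = {}
    for name in names:
        groups.setdefault(match_place(name), []).append(name)
    return groups


groups = deduplicate(["Kraków", "KRAKOW", "Vilnus", "Atlantis"])
```

Names that land on the same identifier are the same place, and names that land on None form the queue for manual review. A real pipeline would also use country, feature type and the place hierarchy to disambiguate.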
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews.
ONTO: place enrichment, person name recognition.
INRIA: word2vec experiments.
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
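Both the neighbour lists and the "semantic differencing" rest on cosine similarity between word vectors. The Python sketch below shows the arithmetic on hand-made three-dimensional toy vectors with the textbook analogy king - man + woman. Real word2vec vectors are learned from text and have hundreds of dimensions, so the numbers here are purely illustrative. Note also that the word2vec tools label this score "cosine distance" although higher values mean closer words.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, start, minus, plus):
    """Return the word whose vector is closest to start - minus + plus."""
    target = [
        s - m + p
        for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])
    ]
    # The three input words are excluded, as is usual for analogy queries.
    candidates = [w for w in vectors if w not in {start, minus, plus}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors; the dimensions loosely mean (royalty, male, female).
VECTORS = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.0, 0.0, 0.1],
}

answer = analogy(VECTORS, start="king", minus="man", plus="woman")
```

With these vectors the target is [1, 0, 1], which coincides with the toy vector for "queen". The example from the interviews works the same way, with vectors learned from the oral history corpus.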
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6134 NOMISMA
Shared authorities for numismatics, e.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
627 RESEARCHSPACE DATA ANNOTATION MODEL
628 IMAGE ANNOTATION
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LOD
GraphDB runs the BM SPARQL endpoint. One of the biggest CH RDF collections (917M triples).
As part of RS, developed mapping of BM data (2M objects) with BM using CIDOC CRM.
This mapping was followed by the Yale Center for British Art (YCBA).
Mapping Documentation: very comprehensive, but is monolithic and has imprecisions. Includes the (in)famous diagram.
629 IMAGE ANNOTATION MODEL
6210 IMAGE ANNOTATION ARCHITECTURE
63 BRITISH MUSEUM (BM) AND YCBA LODGraphDB runs the BM SPARQL endpoint One of the biggest CH RDF collections(917M triples)As part of RS developed mapping of BM data (2M objects) with BM using CIDOCCRMThis mapping was followed by the Yale Center for British Art (YCBA)
very comprehensive but is monolithic and has imprecisionsIncludes the (in)famous diagramMapping Documentation
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L. and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint)
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBPedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is cat level), available content, NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
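The slides do not say how the Cluster Mapper library groups markers. Purely as an illustration of marker clustering, here is a simple grid-based sketch (a swapped-in technique, not the library's algorithm): points falling in the same latitude/longitude cell are merged into one marker with a count. The function name grid_cluster, the cell size and the coordinates are all hypothetical.

```python
from collections import defaultdict


def grid_cluster(points, cell_deg=1.0):
    """Group (lat, lon) points into square cells of cell_deg degrees.

    Returns a list of (centroid_lat, centroid_lon, count), one per non-empty cell.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((sum(p[0] for p in members) / n,
                         sum(p[1] for p in members) / n,
                         n))
    return clusters


# Hypothetical coordinates: two points near Sofia, one near Paris
points = [(42.70, 23.32), (42.65, 23.38), (48.86, 2.35)]
for lat, lon, n in grid_cluster(points):
    print(f"{n} object(s) around ({lat:.2f}, {lon:.2f})")
```

A real map library would typically recompute clusters per zoom level; this sketch uses one fixed cell size.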
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
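A minimal sketch of jittering, assuming a uniform random offset around the country's centre point. The function name jitter, the offset size and the approximate centre coordinates are assumptions for illustration; the slides do not say which distribution or radius was used.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return (lat, lon) moved by a uniform random offset of at most max_offset_deg."""
    return (lat + rng.uniform(-max_offset_deg, max_offset_deg),
            lon + rng.uniform(-max_offset_deg, max_offset_deg))


# Assumed approximate centre point for Bulgaria; the offset size is arbitrary.
BULGARIA = (42.7, 25.5)
rng = random.Random(42)  # fixed seed so the layout is reproducible
flags = [jitter(*BULGARIA, rng=rng) for _ in range(5)]
for lat, lon in flags:
    print(f"{lat:.3f}, {lon:.3f}")
```

Seeding the generator keeps each flag in the same place between page loads, which avoids markers jumping around on refresh.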
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (Diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk support on twitter and google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation. V. Alexiev, CIDOC 2014
External Ontologies
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key val can be mapped to: Custom sub-class, Custom (sub-)prop, Ontology Value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
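A minimal sketch of how spreadsheet rows could be turned into property declarations. It assumes each row lists a relation and its inverse, and that a self-inverse row means a symmetric property. The function relation_axioms, the row layout and the ex: property names are hypothetical; the actual spreadsheet and generator are not shown in the slides.

```python
def relation_axioms(rows):
    """Yield Turtle statements for (property, inverse) rows.

    A row whose inverse equals the property itself is declared symmetric.
    """
    for prop, inverse in rows:
        yield f"{prop} a owl:ObjectProperty ."
        if prop == inverse:
            yield f"{prop} a owl:SymmetricProperty ."
        else:
            yield f"{prop} owl:inverseOf {inverse} ."


# Hypothetical spreadsheet rows; the ex: names are made up for illustration.
rows = [
    ("ex:teacher_of", "ex:student_of"),
    ("ex:student_of", "ex:teacher_of"),
    ("ex:related_to", "ex:related_to"),
]
print("\n".join(relation_axioms(rows)))
```

Keeping the pairs in a table means adding a relation is a one-row change, and the inverse declaration cannot be forgotten.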
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
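A minimal sketch of the first two steps (creating Person records and making likely inferences) for the sample record above. The Person class, the expand_record function and the specific inference rule are hypothetical illustrations, not EHRI's implementation; matching against other records is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    gender: str = ""
    maiden_name: str = ""
    relations: list = field(default_factory=list)  # (relation, other person's name)


# (source field, relation seen from the main person, inverse relation)
RELATION_FIELDS = [
    ("nameSpouse", "spouse", "spouse"),
    ("nameChild", "child", "parent"),
    ("nameSibling", "sibling", "sibling"),
]


def expand_record(rec):
    """Create Person records for everyone mentioned in one source record."""
    main = Person(f'{rec["firstName"]} {rec["lastName"]}', rec.get("gender", ""))
    people = [main]
    for key, relation, inverse in RELATION_FIELDS:
        if key not in rec:
            continue
        other = Person(rec[key])
        if relation == "spouse":
            other.maiden_name = rec.get("nameSpouseMaiden", "")
            # Likely inference (an assumption, not a certainty): in a 1921
            # marriage record the spouse has the other gender.
            other.gender = {"Male": "Female", "Female": "Male"}.get(main.gender, "")
        main.relations.append((relation, other.name))
        other.relations.append((inverse, main.name))
        people.append(other)
    return people


rec = {"firstName": "John", "lastName": "Smith", "gender": "Male",
       "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
       "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
       "nameSibling": "Jack Jones"}
for person in expand_record(rec):
    print(person)
```

Each derived record carries the inverse relation back to the main person, so a later match on "Mike Smith" elsewhere in the database immediately links two family networks.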
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
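A minimal sketch of the two operations behind the table and the analogy: cosine similarity between word vectors, and vector arithmetic followed by a nearest-neighbour lookup. The 2-d vectors are invented so that the arithmetic works out by construction; real word2vec vectors are learned from text and have hundreds of dimensions. The function names cosine and nearest are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (higher means closer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to target, skipping excluded words."""
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cosine(target, vectors[w]))


# Toy vectors, invented for illustration only:
# axis 0 ~ organisation vs person, axis 1 ~ one regime vs the other.
vectors = {
    "KGB": (1.0, 1.0), "Stalin": (0.0, 1.0),
    "SS": (1.0, -1.0), "Hitler": (0.0, -1.0),
    "guard": (0.9, 0.1),
}
query = tuple(k - s + h for k, s, h in
              zip(vectors["KGB"], vectors["Stalin"], vectors["Hitler"]))
print(nearest(query, vectors, exclude={"KGB", "Stalin", "Hitler"}))
```

The words used in the query are excluded from the lookup, since they otherwise tend to come back as their own nearest neighbours.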
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War)
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
64 CONSERVATIONSPACEExecuted by a consortium led by US National Gallery of Art Developed by Sirma ITT(Ontotext sibling) Based on Ontotext GraphDB (semantic metadata) Alfresco(document management) Smart Documents (Sirma product)
6.5 EUROPEANA LOD AND OAI-PMH
Ontotext created and hosted the Europeana SPARQL and OAI-PMH services.
6.5.1 EUROPEANA STATISTICS
E.g. a chart of newspapers (several million) by year: you can't do this using the Europeana API, but it is easy with SPARQL.
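A rough sketch of what such a per-year aggregation could look like, in Python. Everything here is an assumption for illustration: the query supposes each item carries a dc:date literal starting with a four-digit year (the real Europeana data layout may differ, and the newspaper filter is left out), and the hypothetical `count_by_year` helper only replays the same GROUP BY logic locally on sample dates. No endpoint is called.

```python
from collections import Counter

# Assumed query shape (not taken from the slides): count items per year.
ITEMS_BY_YEAR_QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?year (COUNT(?item) AS ?n) WHERE {
  ?item dc:date ?date .
  BIND(SUBSTR(STR(?date), 1, 4) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def count_by_year(dates):
    """Local stand-in for the GROUP BY above: count ISO-like dates per year."""
    counts = Counter(date[:4] for date in dates)
    return sorted(counts.items())


# Hypothetical sample dates, just to show the shape of the chart data.
SAMPLE_DATES = ["1914-08-01", "1914-09-12", "1918-11-11"]
chart_rows = count_by_year(SAMPLE_DATES)
```

The resulting (year, count) rows are what a bar chart would be drawn from.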
6.6 EUROPEANA FOOD AND DRINK
Food & Drink (FD) content, semantically enriched (place and FD topic). Open data, SPARQL endpoint, open source (GitHub). Uses GraphDB and the ElasticSearch enterprise connector.
EFD Semantic App
6.6.1 TASTY BULGARIAN RECIPES
E.g. 150 with beer, including pancakes.
6.6.2 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scott's expedition to the South Pole) and everything in between.
6.6.3 EFD ENRICHMENT: FD GAZETTEER
Use Wikipedia categories to extract an FD Gazetteer.
Domain-specific modeling: Towards a Food and Drink Gazetteer. Tagarev A., Tolosi L., and Alexiev V. LNCS 9398, p. 182-196, January 2016 (preprint).
6.6.4 EFD ENRICHMENT: PRUNING FD CATEGORY TREE
Using DBpedia in Europeana Food and Drink. Alexiev V. DBpedia meeting, February 2016.
6.6.5 EFD ENRICHMENT: FRENCH
Selected French as the second enrichment language after English, considering category overlap (work by L. Tolosi; x-axis is category level), available content, and NLP capabilities.
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia and Wikidata. But we also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
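To make the idea of a hierarchical facet concrete, here is a small Python sketch. It is not the EFD implementation: the `PARENT` table is a hand-made stand-in for the Geonames place hierarchy, and the object ids are hypothetical. The point is only that an object tagged with a city should also be counted under its country and continent.

```python
from collections import Counter

# Hypothetical child -> parent table standing in for the Geonames hierarchy.
PARENT = {
    "Sofia": "Bulgaria",
    "Plovdiv": "Bulgaria",
    "Bulgaria": "Europe",
    "Paris": "France",
    "France": "Europe",
}

# Hypothetical objects and the places they were enriched with.
OBJECT_PLACES = {
    "o1": ["Sofia"],
    "o2": ["Plovdiv"],
    "o3": ["Paris"],
    "o4": ["Sofia", "Paris"],
}


def ancestors_and_self(place, parent):
    """Return the place followed by all its ancestors, nearest first."""
    chain = []
    while place is not None:
        chain.append(place)
        place = parent.get(place)
    return chain


def facet_counts(object_places, parent):
    """Count each object once per facet node that covers any of its places."""
    counts = Counter()
    for places in object_places.values():
        covered = set()
        for place in places:
            covered.update(ancestors_and_self(place, parent))
        counts.update(covered)
    return counts
```

Collecting the covered nodes in a set per object keeps an object with two Bulgarian places from being counted twice under Bulgaria.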
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
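For readers who have not seen marker clustering, a minimal Python sketch of the general idea follows. This is plain grid bucketing, swapped in for illustration; it is not the algorithm or API of the Cluster Mapper library, and the coordinates are approximate sample values.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg):
    """Bucket (lat, lon) points into square cells of cell_deg degrees.

    Returns one (centroid_lat, centroid_lon, count) tuple per non-empty cell,
    largest cluster first.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        centroid_lat = sum(p[0] for p in members) / n
        centroid_lon = sum(p[1] for p in members) / n
        clusters.append((centroid_lat, centroid_lon, n))
    return sorted(clusters, key=lambda c: -c[2])


# Two points near Sofia and one near Paris (approximate coordinates).
SAMPLE_POINTS = [(42.70, 23.32), (42.69, 23.33), (48.85, 2.35)]
clusters = grid_clusters(SAMPLE_POINTS, cell_deg=1.0)
```

A map would then draw one marker per cluster, labeled with its count, and use a smaller cell size as the user zooms in.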
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up.
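Jittering can be sketched in a few lines of Python. This is an illustrative approach, not the EFD code: it moves each point by a random offset within an assumed radius, uses the rough figure of 111 km per degree of latitude, and the center coordinates for Bulgaria are approximate.

```python
import math
import random

KM_PER_DEG_LAT = 111.0  # rough approximation, good enough for display purposes


def jitter(lat, lon, max_km, rng):
    """Return (lat, lon) moved by a random offset of at most max_km."""
    # sqrt() spreads points evenly over the disc instead of crowding the center.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = distance_km * math.sin(bearing) / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Approximate center of Bulgaria; the 50 km radius is an arbitrary choice.
CENTER = (42.7, 25.5)
rng = random.Random(42)
flags = [jitter(CENTER[0], CENTER[1], 50.0, rng) for _ in range(5)]
```

Passing in a seeded `random.Random` keeps the flag positions stable between page loads, which avoids markers jumping around on every refresh.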
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior.
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy.
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms.
easily add content about a colorful tradition: blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper cats, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. Dependencies: AAT-TGN-ULAN-CONA.
Center of LODLAM cloud (diagram by J. Cobb, 2014).
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust: AAT 2014-02, TGN 2014-08, ULAN 2015-03.
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer.
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to: custom sub-class, custom (sub-)property, ontology value (e.g. <term/kind/Abbreviation>).
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse).
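What inverse and symmetric declarations buy you can be shown with a small Python sketch. It is only an illustration of the inference, not how GVP or GraphDB implement it, and the `ex:` property and resource names are hypothetical (they are not GVP relations).

```python
def materialize_inverses(triples, inverse_of, symmetric):
    """Add the triples implied by owl:inverseOf and owl:SymmetricProperty.

    triples    -- set of (subject, property, object) tuples
    inverse_of -- dict mapping a property to its declared inverse
    symmetric  -- set of properties that are their own inverse
    """
    inverse = dict(inverse_of)
    # owl:inverseOf works in both directions.
    inverse.update({q: p for p, q in inverse_of.items()})
    # A symmetric property is self-inverse.
    inverse.update({p: p for p in symmetric})
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


# Hypothetical example data.
TRIPLES = {
    ("ex:Alice", "ex:teacherOf", "ex:Bob"),
    ("ex:Bob", "ex:siblingOf", "ex:Carol"),
}
INVERSE_OF = {"ex:teacherOf": "ex:studentOf"}
SYMMETRIC = {"ex:siblingOf"}
closed = materialize_inverses(TRIPLES, INVERSE_OF, SYMMETRIC)
```

Only one direction of each relation has to be stored or edited; the other direction follows from the declaration.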
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014). E.g.:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
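The idea can be sketched in Python as follows. This is a hypothetical illustration, not the EHRI pipeline: the field names come from the sample record above, while the relation names (`spouseOf`, `parentOf`, `siblingOf`), the output layout, and the single "likely inference" (a spouse's gender guessed from the subject's) are assumptions.

```python
RECORD = {
    "id": "123456",
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

# Assumed mapping from record fields to relation names (illustrative only).
RELATION_FIELDS = (
    ("nameSpouse", "spouseOf"),
    ("nameChild", "parentOf"),
    ("nameSibling", "siblingOf"),
)


def persons_from_record(rec):
    """Create one Person dict per person mentioned, plus links to the subject."""
    subject = {
        "name": f'{rec["firstName"]} {rec["lastName"]}',
        "gender": rec.get("gender"),
        "source": rec["id"],
    }
    persons = [subject]
    links = []
    for field_name, relation in RELATION_FIELDS:
        if field_name not in rec:
            continue
        other = {"name": rec[field_name], "source": rec["id"]}
        if field_name == "nameSpouse":
            if "nameSpouseMaiden" in rec:
                other["maidenName"] = rec["nameSpouseMaiden"]
            # Likely (not certain) inference, flagged so matching can discount it.
            if rec.get("gender") == "Male":
                other["gender"] = "Female"
                other["inferred"] = ["gender"]
        persons.append(other)
        links.append((subject["name"], relation, other["name"]))
    return persons, links


persons, links = persons_from_record(RECORD)
```

Keeping the source record id on every derived Person, and marking inferred fields, makes it possible to trace or undo a match later.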
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
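A toy Python sketch of name-based matching with deduplication as a side effect. It is far simpler than a real matching pipeline (no fuzzy matching, no hierarchy or coordinate checks), and the gazetteer ids `G1` and `G2` are placeholders, not real Geonames ids.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Strip accents, case and extra whitespace so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


def build_index(gazetteer):
    """Map every normalized name variant to its gazetteer id."""
    index = {}
    for place_id, names in gazetteer.items():
        for name in names:
            index[normalize(name)] = place_id
    return index


def match_places(source_names, index):
    """Group source names by matched id; names sharing an id are duplicates."""
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        place_id = index.get(normalize(name))
        if place_id is None:
            unmatched.append(name)
        else:
            matched[place_id].append(name)
    return dict(matched), unmatched


# Placeholder gazetteer with alternate names; ids are hypothetical.
GAZETTEER = {"G1": ["Kraków", "Cracow"], "G2": ["Lviv", "Lwów", "Lemberg"]}
SOURCE_NAMES = ["Krakow", "KRAKÓW", "Lemberg", "lwow", "Atlantis"]
matched, unmatched = match_places(SOURCE_NAMES, build_index(GAZETTEER))
```

Source names that land on the same id are the deduplication candidates; the unmatched list is what would go to manual review.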
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews.
ONTO: place enrichment, person name recognition.
INRIA: word2vec experiments.

guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
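How the neighbor lists and the "differencing" arithmetic work can be shown with a toy Python sketch. The three-dimensional vectors below are hand-made for illustration and have nothing to do with the model trained on the interviews; real word2vec vectors have hundreds of dimensions and the result is only approximately equal to the target word.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors; the axes loosely mean (security organ, Soviet, German).
TOY_VECTORS = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "Moscow": [0.0, 0.9, 0.1],
}


def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The three input words are excluded, as is usual for analogy queries.
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


answer = analogy("KGB", "Stalin", "Hitler", TOY_VECTORS)
```

With these toy vectors the query returns "SS", mirroring the example on the slide.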
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper).
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed.
Blog, Wiki
65 EUROPEANA LOD AND OAI PMHOntotext crated and hosted the Europeana SPARQL and OAI PMH services
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
651 EUROPEANA STATISTICS
Eg chart of newspapers (several millions) by year cant do this using the Europeana APIbut is easy with SPARQL
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6101 EHRI PERSON NETWORKS
Research question how person networks in uenced chance of survival Idea
Rec 123456 rstName ldquoJohnrdquo lastName ldquoSmithrdquo gender Male dateMarriage 1921-01-05 additional names nameSpouseMaiden ldquoMatienzordquo nameSpouse ldquoMaria SmithrdquonameChild ldquoMike Smithrdquo nameSibling ldquoJack JonesrdquoWe can create Person records for the people mentioned make some likely inferencesthen try to match to other Person records in the database
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames also achieving deduplication A Geonamesmatching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO Place enrichment Person name recognitionINRIA word2vec experiments
guard Cos dist punishment Cos dist
guard Cos dist punishment Cos dist
guarding 0593507 punishments 0668144sentry 0512083 punish 0601212hlinka 0496201 punishing 0543213gate 0490032 beatings 0527033watching 0484647 penalty 0497262ri e 0484379 deserved 0490157lookout 0482025 beaten 0473870patrol 0477233 straf 0473338soldier 0475982 offense 0461230guarded 0474689 executing 0459965police 0474291 merciless 0455123
semantic differencing (interesting)KGB ‐ Stalin + Hitler = SS
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
66 EUROPEANA FOOD AND DRINKFood amp Drink content semantically enriched (place and FD topic) open data SPARQL endpoint open source (Github) Uses GraphDB and ElasticSearchenterprise connector
EFD Semantic App
661 TASTY BULGARIAN RECIPES
Eg 150 with beer including pancakes
662 WIDE GEOGRAPHIC COVERAGE
Objects from the Roman Empire to Antarctica (Scotts expedition to the South Pole) andeverything in-between
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
6.6.6 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia and Wikidata, but also had to add Geonames to leverage the place hierarchy.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
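The slides do not show how the Cluster Mapper library is called, so the following is not that library's API. It is only a minimal sketch of one common clustering idea (bucketing points into grid cells and showing one marker per cell), with the function name and cell size chosen for illustration.

```python
from collections import defaultdict


def grid_cluster(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells of cell_deg degrees.

    Returns one (centroid_lat, centroid_lon, count) tuple per non-empty cell.
    This is an illustrative technique, not the Cluster Mapper algorithm.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        # Floor division assigns each point to the cell that contains it.
        key = (lat // cell_deg, lon // cell_deg)
        cells[key].append((lat, lon))

    clusters = []
    for members in cells.values():
        n = len(members)
        centroid_lat = sum(p[0] for p in members) / n
        centroid_lon = sum(p[1] for p in members) / n
        clusters.append((centroid_lat, centroid_lon, n))
    return clusters


# Two nearby points fall into one cell, the third into another.
clusters = grid_cluster([(42.7, 23.3), (42.6, 23.4), (48.2, 16.4)])
```

A map client would typically recompute the cells with a smaller `cell_deg` as the user zooms in, so clusters split into individual markers.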
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them.
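Jittering can be sketched as moving each marker by a small random offset around the shared coordinate. The radius, the 111 km per degree approximation, and the function name below are assumptions for illustration; the slides do not say how the EFD application computes its offsets.

```python
import math
import random


def jitter(lat, lon, radius_km=50.0, rng=random):
    """Return (lat, lon) displaced at random within radius_km of the input point."""
    # The square root spreads points evenly over the disc instead of
    # bunching them near the centre.
    r = radius_km * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # Roughly 111 km per degree of latitude; longitude degrees shrink with cos(lat).
    dlat = (r * math.sin(theta)) / 111.0
    dlon = (r * math.cos(theta)) / (111.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon


# Approximate centre of Bulgaria (illustrative coordinates).
rng = random.Random(42)
markers = [jitter(42.7, 25.5, radius_km=50.0, rng=rng) for _ in range(5)]
```

Passing a seeded `random.Random` keeps the marker positions stable between page loads, which avoids flags jumping around when the map is redrawn.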
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal to platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, the blessing of the baskets (swiecenie koszyczek, or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de), we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM
Dependencies: AAT-TGN-ULAN-CONA
Center of the LODLAM cloud (diagram by J. Cobb, 2014)
GVP Training Materials
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust.
AAT: 2014-02, TGN: 2014-08, ULAN: 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, N-Triples, JSON, JSON-LD
Help desk / support on Twitter and Google group (see home page)
Presentations, papers: On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev, V., Lindenthal, J. and Isaac, A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014.
External Ontologies

Prefix   Ontology                                Used for
bibo     Bibliography Ontology                   Sources
dc       Dublin Core Elements                    common
dct      Dublin Core Terms                       common
foaf     Friend of a Friend ontology             Contributors
iso      ISO 25964 (latest on thesauri)          iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                   Basic RDF representation
prov     Provenance Ontology                     Revision history
rdf      Resource Description Framework          Basic RDF representation
rdfs     RDF Schema                              Basic RDF representation
schema   Schema.org                              common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System    Basic vocabulary representation
skosxl   SKOS Extension for Labels               Rich labels
wgs      W3C World Geodetic Survey geo           Geo (TGN)
xsd      XML Schema Datatypes                    Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)property
Ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™. Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, i.e. self-inverse).
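What inverse pairs buy you can be sketched as follows: given one direction of a relation, the other direction can be added automatically. The property names (`ex:causes`, `ex:causedBy`, `ex:distinguishedFrom`) and the function are hypothetical placeholders, not actual GVP properties, and a triple store with OWL reasoning would normally do this for you.

```python
def materialize_inverses(triples, inverse_of, symmetric=()):
    """Return the input triples plus the inverse statement of each relation.

    inverse_of maps a property to its owl:inverseOf partner;
    symmetric lists properties that are their own inverse.
    """
    # Make the lookup work in both directions of each pair.
    inv = dict(inverse_of)
    inv.update({v: k for k, v in inverse_of.items()})
    for prop in symmetric:
        inv[prop] = prop

    result = set(triples)
    for s, p, o in triples:
        if p in inv:
            result.add((o, inv[p], s))
    return result


# Hypothetical relation table, as it might be read from a spreadsheet.
stated = {
    ("aat:A", "ex:causes", "aat:B"),
    ("aat:C", "ex:distinguishedFrom", "aat:D"),
}
closed = materialize_inverses(
    stated,
    inverse_of={"ex:causes": "ex:causedBy"},
    symmetric=["ex:distinguishedFrom"],
)
```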
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev, V., Cobb, J., Garcia, G., Harpring, P. Getty Research Institute, 3.2 edition, March 2015.
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN).
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search. Many are described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).
E.g. AAT is used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT and enriched labels.
6.8 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
6.8.1 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
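The query behind the image grid is not reproduced in the slides. A plausible reconstruction, under stated assumptions, is sketched below: it asks the Wikidata Query Service for paintings in a given collection that have an image. The identifiers are assumptions to verify before use: `Q3305213` for painting, `P195` for collection, `P18` for image, and `Q731126` for the J. Paul Getty Museum. The helper names are made up for illustration.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def paintings_in_collection_query(collection_qid, limit=500):
    """Build a SPARQL query for paintings held in one collection (assumed IDs)."""
    return f"""#defaultView:ImageGrid
SELECT ?item ?itemLabel ?image WHERE {{
  ?item wdt:P31 wd:Q3305213 ;
        wdt:P195 wd:{collection_qid} ;
        wdt:P18 ?image .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}"""


def query_url(query):
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Q731126 is assumed to be the J. Paul Getty Museum item; check it on Wikidata.
getty_query = paintings_in_collection_query("Q731126", limit=10)
getty_url = query_url(getty_query)
```

The first line of the query is a comment that the Wikidata Query Service web interface reads as a display hint, so pasting the query there shows the results as an image grid.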
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) "Cast after".
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
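The idea can be sketched as below, using the sample record from the slide. The output field names (`spouseOf`, `childOf`, `siblingOf`, `maidenName`) and the specific inferences are assumptions for illustration, not EHRI's actual data model or rules; the matching step against the database is left out.

```python
def derive_persons(rec):
    """Create Person records for the main person and the relatives mentioned."""
    main_name = f'{rec["firstName"]} {rec["lastName"]}'
    main = {"name": main_name, "gender": rec.get("gender")}
    persons = [main]

    if "nameSpouse" in rec:
        spouse = {"name": rec["nameSpouse"], "spouseOf": main_name}
        # Likely inference (an assumption for records of this period):
        # the spouse has the opposite gender.
        if rec.get("gender") in ("Male", "Female"):
            spouse["gender"] = "Female" if rec["gender"] == "Male" else "Male"
        if "nameSpouseMaiden" in rec:
            spouse["maidenName"] = rec["nameSpouseMaiden"]
        # The marriage date applies to both partners.
        if "dateMarriage" in rec:
            spouse["dateMarriage"] = main["dateMarriage"] = rec["dateMarriage"]
        persons.append(spouse)

    if "nameChild" in rec:
        persons.append({"name": rec["nameChild"], "childOf": main_name})
    if "nameSibling" in rec:
        persons.append({"name": rec["nameSibling"], "siblingOf": main_name})
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
people = derive_persons(record)
```

Each derived record carries less evidence than a full record, so a matcher would presumably weigh such inferred fields lower than stated ones.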
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
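Why matching to a gazetteer also deduplicates can be shown with a small sketch: spelling variants that resolve to the same gazetteer id collapse into one place. The approach here (diacritic stripping plus fuzzy string matching from the standard library) is a stand-in chosen for illustration, not the EHRI pipeline, and the gazetteer ids `g1` and `g2` are made up rather than real Geonames ids.

```python
import difflib
import unicodedata


def normalize(name):
    """Lower-case a name and strip combining diacritics."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower().strip()


def match_places(source_names, gazetteer, cutoff=0.85):
    """Map each source name to a gazetteer id, or None if nothing is close enough."""
    # Index every known name variant by its normalized form.
    index = {}
    for place_id, names in gazetteer.items():
        for n in names:
            index[normalize(n)] = place_id

    matches = {}
    for name in source_names:
        close = difflib.get_close_matches(
            normalize(name), list(index), n=1, cutoff=cutoff
        )
        matches[name] = index[close[0]] if close else None
    return matches


# Hypothetical mini-gazetteer with alternate names per place.
gazetteer = {
    "g1": ["Łódź", "Lodz", "Litzmannstadt"],
    "g2": ["Kraków", "Cracow"],
}
matches = match_places(["Lodz", "LODZ", "Krakow", "Atlantis"], gazetteer)
```

A real pipeline would also need to disambiguate places that share a name, for example by country or by the place hierarchy, which string similarity alone cannot do.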
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
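Both the neighbour lists and the "semantic differencing" rest on vector arithmetic over word embeddings. The sketch below uses tiny hand-made 3-dimensional vectors, which are purely illustrative and not taken from the INRIA model. Note that the "Cos dist" values in the table appear to be cosine similarities (higher means closer), which is what the code computes.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors: (organisation-ness, Soviet context, Nazi context). Made up.
toy = {
    "KGB": [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS": [1.0, 0.0, 1.0],
    "guard": [0.5, 0.2, 0.2],
}
answer = analogy(toy, "KGB", "Stalin", "Hitler")
```

With real embeddings the target vector rarely coincides with any word exactly, so the result is simply the nearest remaining word by cosine similarity.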
6.10.4 EHRI: DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id. Time and nationality from ULAN.
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
663 EFD ENRICHMENT FD GAZETTEER
Use Wikipedia Categories to extract a FD Gazetteer
Domain-speci c modeling Towards a Food and Drink Gazetteer Tagarev A TolosiL and Alexiev V LNCS 9398 p182-196 January 2016 ( )preprint
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V DBpedia meeting February2016Using DBPedia in Europeana Food and Drink
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English considering categoryoverlap (work by LTolosi x-axis is cat level) available content NLP capabilities
666 EFD PLACE ENRICHMENT
We used standard Ontotext Concept Enrichment Service which is a mix ofDBpedia+Wikidata But also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places its relatively easy to map them We used the Cluster Mapperlibrary
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria We dont want all ags in the center of Bulgariaso we jitter them up
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high-availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
Alexiev V., Lindenthal J., and Isaac A. On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). International Journal on Digital Libraries, August 2015, Springer
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies

Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. A key value can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (eg <term/kind/Abbreviation>)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
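A minimal sketch of what an inverse pair buys you: given one direction of a relation, the other direction can be materialized mechanically. The relation names (`ex:locatedIn`, `ex:locationOf`, `ex:relatedTo`) and the helper `materialize_inverses` are hypothetical illustrations, not GVP property names and not the actual generation code.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def materialize_inverses(triples: Set[Triple], inverse_of: Dict[str, str]) -> Set[Triple]:
    """Return the input triples plus the inverse statement of every
    triple whose predicate has a declared inverse."""
    # Make the table bidirectional: if p is inverse of q, then q is inverse of p.
    full = dict(inverse_of)
    for p, q in inverse_of.items():
        full.setdefault(q, p)
    result = set(triples)
    for s, p, o in triples:
        if p in full:
            result.add((o, full[p], s))
    return result


# Hypothetical relations, for illustration only.
INVERSE_OF = {
    "ex:locatedIn": "ex:locationOf",  # an owl:inverseOf pair
    "ex:relatedTo": "ex:relatedTo",   # self-inverse, i.e. symmetric
}
data = {
    ("ex:A", "ex:locatedIn", "ex:B"),
    ("ex:C", "ex:relatedTo", "ex:D"),
}
closed = materialize_inverses(data, INVERSE_OF)
```

A symmetric property is just the special case where a relation is listed as its own inverse, so both cases go through the same lookup.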
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V., Cobb J., Garcia G., Harpring P. Getty Vocabularies Linked Open Data: Semantic Representation. Getty Research Institute, 3.2 edition, March 2015
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now? (J.Cobb, 2014). Eg:
AAT used in Cataloging Calculator: finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like
681 J.P. GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid
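A query of this kind could look roughly like the sketch below, which only builds the SPARQL text for the Wikidata Query Service. It is not the query from the slide. The function name is hypothetical, and the item id `Q731126` for the J. Paul Getty Museum is an assumption that should be checked on Wikidata before use. `P31` (instance of), `P195` (collection), `P18` (image) and `Q3305213` (painting) are standard Wikidata identifiers.

```python
import re


def collection_paintings_query(collection_qid: str = "Q731126", limit: int = 500) -> str:
    """Build a SPARQL query listing paintings in a collection, rendered as an image grid.

    collection_qid defaults to Q731126, ASSUMED to be the J. Paul Getty Museum.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", collection_qid):
        raise ValueError(f"not a Wikidata item id: {collection_qid!r}")
    return f"""#defaultView:ImageGrid
SELECT ?painting ?paintingLabel ?image WHERE {{
  ?painting wdt:P31 wd:Q3305213 ;
            wdt:P195 wd:{collection_qid} ;
            wdt:P18 ?image .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}"""


query = collection_paintings_query()
```

Requiring `wdt:P18` drops paintings without an image, which suits a grid view but undercounts the collection. Remove that line to count all paintings.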
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) Cast after
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress, 2016
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
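The expansion step can be sketched as below, using the field names from the sample record. The `Person` class, the relation labels (`spouseOf`, `parentOf`, ...) and the opposite-gender inference for the spouse are assumptions made for illustration. They are not the EHRI data model, and the matching step is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # (relation label, name of the other person); labels are hypothetical
    relations: List[Tuple[str, str]] = field(default_factory=list)


def expand_record(rec: Dict[str, str]) -> List[Person]:
    """Create one Person per individual mentioned in a record and link them."""
    main = Person(f'{rec["firstName"]} {rec["lastName"]}', gender=rec.get("gender"))
    people = [main]

    def link(other: Person, rel: str, inverse_rel: str) -> None:
        main.relations.append((rel, other.name))
        other.relations.append((inverse_rel, main.name))
        people.append(other)

    if "nameSpouse" in rec:
        # Likely inference (an assumption for records of this period):
        # the spouse has the opposite gender of the main person.
        spouse_gender = {"Male": "Female", "Female": "Male"}.get(rec.get("gender", ""))
        link(Person(rec["nameSpouse"], spouse_gender, rec.get("nameSpouseMaiden")),
             "spouseOf", "spouseOf")
    if "nameChild" in rec:
        link(Person(rec["nameChild"]), "parentOf", "childOf")
    if "nameSibling" in rec:
        link(Person(rec["nameSibling"]), "siblingOf", "siblingOf")
    return people


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}
persons = expand_record(record)
```

Each derived Person carries enough context (name, inferred gender, maiden name, relations) to serve as a candidate for matching against other records.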
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
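The arithmetic behind such "semantic differencing" can be sketched with plain vector operations. The 4-dimensional vectors below are hand-made toy values chosen only to illustrate the mechanism. They are not the embeddings trained on the interviews, and the function names are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def analogy(emb: Dict[str, Vector], a: str, b: str, c: str) -> str:
    """Return the word whose vector is closest to emb[a] - emb[b] + emb[c],
    excluding the three input words themselves."""
    target = [x - y + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))


# Toy embeddings: dimensions loosely mean (Soviet, Nazi, leader, organization).
TOY = {
    "KGB":    [1.0, 0.0, 0.0, 1.0],
    "Stalin": [1.0, 0.0, 1.0, 0.0],
    "Hitler": [0.0, 1.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 0.0, 1.0],
    "guard":  [0.2, 0.2, 0.0, 0.6],
}
result = analogy(TOY, "KGB", "Stalin", "Hitler")
```

With real word2vec vectors the same ranking by cosine similarity produces neighbour lists like the table above.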
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War
6134 NOMISMA
Shared authorities for numismatics. Eg a mint
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
664 EFD ENRICHMENT PRUNING FD CATEGORY TREE
Alexiev V. Using DBPedia in Europeana Food and Drink. DBpedia meeting, February 2016
665 EFD ENRICHMENT FRENCH
Selected French as second enrichment language after English, considering category overlap (work by L.Tolosi; x-axis is cat level), available content, NLP capabilities
666 EFD PLACE ENRICHMENT
We used the standard Ontotext Concept Enrichment Service, which is a mix of DBpedia+Wikidata. But we also had to add Geonames to leverage the place hierarchy
667 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames
668 EFD GEOGRAPHIC MAPPING CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library
669 EFD GEOGRAPHIC MAPPING JITTERING
There are 9k objects marked Bulgaria. We don't want all flags in the center of Bulgaria, so we jitter them up
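Jittering can be sketched as adding a small random offset to each marker that shares the same coordinates. The centre point (42.7, 25.5), the 0.5 degree maximum offset and the function name are assumptions for illustration, not the values or code used in the project.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def jitter_points(center: Point, count: int, max_offset_deg: float = 0.5,
                  rng: Optional[random.Random] = None) -> List[Point]:
    """Return `count` points scattered uniformly in a square around `center`."""
    rng = rng or random.Random()
    lat, lon = center
    return [
        (lat + rng.uniform(-max_offset_deg, max_offset_deg),
         lon + rng.uniform(-max_offset_deg, max_offset_deg))
        for _ in range(count)
    ]


BULGARIA_CENTER = (42.7, 25.5)  # approximate, assumed for the example
markers = jitter_points(BULGARIA_CENTER, 9000, rng=random.Random(42))
```

Passing a seeded `random.Random` keeps marker positions stable between page loads. A uniform square is the simplest choice, and a country-shaped mask would keep markers inside the borders.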
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
672 ONTOTEXT SCOPE OF WORK
Semanticontology development Contributed to (latest standard on thesauri) Providedimplementation experience suggestions and xesComplete mapping speci cationHelp implement R2RML scripts working off Gettys Oracle database contribution toPerl implementation (RDB2RDF) R2RML extension (rrxlanguageColumn)Work with a wide External Reviewers group (people from OCLC Europeana ISO25964 working group etc)GraphDB semantic repo clustered for high-availabilitySemantic application development (customized Forest user interface) and techconsultingSPARQL 11 compliant endpoint Comprehensive documentation (100 pages) Sample queries (100) including charts geographic queries etcPer-entity export les explicittotal data dumps Many formats RDF Turtle NTriplesJSON JSON-LDHelp desk support on twitter and google group (see home page)Presentations papers
Alexiev V Lindenthal J and Isaac A International Journal on DigitalLibraries August 2015 Springer
httpvocabgettyeduontologyISO 25964 ontology
httpvocabgettyedusparqlhttpvocabgettyedudoc
On the composition of ISO 25964 hierarchical relations (BTGBTP BTI)
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE 14 US art museums committed to establishing a critical
mass of LOD on the semantic web Consulting on CRM mappingAmerican Art Collaborative
Work ongoing at eg see Eg possible mapping of (sculpture) Cast after
httpsgithubcomamerican-art NPG mapping issues
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTUREEHRI is a large-scale EU project that involves 23 Holocaust archives (Europe Israel andthe US) DH and IT organizations
In its rst phase (2011-2015) it aggregated archival descriptions and materials on alarge scale and built a Virtual Research Environment (portal) for Holocaustresearchers based on a graph databaseIn its second phase (2015-2019) EHRI2 seeks to enhance the gathered materialsusing semantic approaches enrichment coreferencing interlinking Semanticintegration involves Four of the 14 EHRI2 work packages and helps integratedatabases free text and metadata to interconnect historical entities (peopleorganizations places historic events) and create networks
Semantic Archive Integration for Holocaust Research the EHRI ResearchInfrastructure VAlexiev LBrazzo CIDOC Congress 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
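The idea can be sketched in a few lines of Python. This is only an illustration, not the EHRI implementation: the `Person` class, the `additionalNames` grouping and the two inferences for the spouse (maiden name, marriage date) are assumptions chosen to mirror the sample record.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    relation: str      # relation to the person described by the source record
    source_rec: int    # id of the record in which the name was found
    inferred: dict = field(default_factory=dict)  # likely, unverified facts


def derive_persons(rec):
    """Create one Person per additional name mentioned in a record."""
    extra = rec.get("additionalNames", {})
    persons = []
    if "nameSpouse" in extra:
        inferred = {}
        # Assumed inferences: the spouse carries the maiden name and
        # shares the marriage date of the source person.
        if "nameSpouseMaiden" in extra:
            inferred["maidenName"] = extra["nameSpouseMaiden"]
        if "dateMarriage" in rec:
            inferred["dateMarriage"] = rec["dateMarriage"]
        persons.append(Person(extra["nameSpouse"], "spouse", rec["id"], inferred))
    if "nameChild" in extra:
        persons.append(Person(extra["nameChild"], "child", rec["id"]))
    if "nameSibling" in extra:
        persons.append(Person(extra["nameSibling"], "sibling", rec["id"]))
    return persons


record = {
    "id": 123456,
    "firstName": "John",
    "lastName": "Smith",
    "gender": "Male",
    "dateMarriage": "1921-01-05",
    "additionalNames": {
        "nameSpouseMaiden": "Matienzo",
        "nameSpouse": "Maria Smith",
        "nameChild": "Mike Smith",
        "nameSibling": "Jack Jones",
    },
}
persons = derive_persons(record)
```

The derived records could then be fed to whatever matching step compares them with existing Person records; that step is out of scope for this sketch.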
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
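A minimal sketch of how matching can also deduplicate: once several source spellings resolve to the same gazetteer id, they collapse into one place. The gazetteer below is a toy dictionary with made-up ids (not real Geonames ids), and the normalization and fuzzy cutoff are assumptions, not the pipeline EHRI actually used.

```python
import difflib
import unicodedata
from collections import defaultdict

# Toy gazetteer: normalized name -> place id. The ids are invented.
GAZETTEER = {"krakow": 1001, "warszawa": 1002, "wien": 1003}


def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_marks.lower().split())


def match_place(name, cutoff=0.85):
    """Return the gazetteer id for a place name, or None if nothing is close."""
    key = normalize(name)
    if key in GAZETTEER:
        return GAZETTEER[key]
    close = difflib.get_close_matches(key, list(GAZETTEER), n=1, cutoff=cutoff)
    return GAZETTEER[close[0]] if close else None


def match_places(names):
    """Group source names by matched id; each group is one deduplicated place."""
    groups = defaultdict(list)
    for name in names:
        groups[match_place(name)].append(name)
    return dict(groups)


groups = match_places(["Kraków", "Krakow", "Wien"])
```

Names that match nothing end up under the key `None`, which makes the unmatched remainder easy to review by hand.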
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments
guard      Cos dist   punishment    Cos dist
guarding   0.593507   punishments   0.668144
sentry     0.512083   punish        0.601212
hlinka     0.496201   punishing     0.543213
gate       0.490032   beatings      0.527033
watching   0.484647   penalty       0.497262
rifle      0.484379   deserved      0.490157
lookout    0.482025   beaten        0.473870
patrol     0.477233   straf         0.473338
soldier    0.475982   offense       0.461230
guarded    0.474689   executing     0.459965
police     0.474291   merciless     0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
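For readers unfamiliar with word2vec, the two operations behind these results (cosine similarity between word vectors, and "differencing" by vector arithmetic) can be sketched as follows. The 3-dimensional vectors are invented for illustration and use the classic king/queen example; real word2vec vectors are learned from the corpus and have hundreds of dimensions.

```python
import math

# Invented toy vectors; a trained model would supply these.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy("king", "man", "woman")
```

Excluding the three input words from the candidates is the usual convention, since the nearest vector to the target is otherwise often one of the inputs.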
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: data platform for coins/medals, 100k coin types
Nomisma: shared authorities for numismatics
Kerameikos: pottery LOD
EADitor: EAD editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation: Medallic Art of the Great War.
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius... rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
6.6.7 EFD PLACE ENRICHMENT
Hierarchical semantic facet based on Geonames.
6.6.8 EFD GEOGRAPHIC MAPPING: CLUSTERING
Once we have places, it's relatively easy to map them. We used the Cluster Mapper library.
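To show what marker clustering does in principle, here is a simple grid-based stand-in. It is not the Cluster Mapper library's API or algorithm; the function name `grid_cluster` and the one-degree cell size are assumptions made for this sketch.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_deg=1.0):
    """Group (lat, lon) points into square grid cells and summarize each cell.

    Returns one dict per non-empty cell with the mean position and point count.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


# Two nearby points fall into one cell, the third into another.
clusters = grid_cluster([(42.1, 25.1), (42.2, 25.3), (48.2, 16.4)])
```

A map library would normally recompute such clusters at each zoom level, shrinking the cell size as the user zooms in.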
6.6.9 EFD GEOGRAPHIC MAPPING: JITTERING
There are 9k objects marked "Bulgaria". We don't want all flags in the center of Bulgaria, so we jitter them up.
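Jittering can be as simple as adding a small random offset to each marker. The sketch below assumes a uniform offset in degrees and an approximate center point for Bulgaria; the function name, the offset size and the coordinates are illustrative choices, not the values used in the project.

```python
import random


def jitter(lat, lon, max_offset_deg=0.5, rng=random):
    """Return the point moved by a random offset of at most max_offset_deg per axis."""
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


# Approximate center of Bulgaria (assumed for illustration).
CENTER = (42.7, 25.5)

# A seeded generator keeps the layout stable between page loads.
rng = random.Random(42)
markers = [jitter(CENTER[0], CENTER[1], 0.5, rng) for _ in range(100)]
```

Seeding the generator (or deriving the offset from the object id) is worth considering, so that a given object does not jump to a new position every time the map is redrawn.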
6.6.10 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata? Because it gives an excellent way to connect and expose your collection data to a multilingual audience.
Europeana Wikimedia Taskforce report:
Recommendation 1: For every Europeana project, considering the possible benefits of a Wikimedia component should be default behavior
Recommendation 7: Make Wikidata a central element of Europeana's portal-to-platform strategy
Recommendation 8: Europeana should continue to invest in technology that improves the interoperability between GLAMs and Wikimedia platforms
GLAMs Working with Wikidata: easily add content about a colorful tradition, blessing of the baskets (swiecenie koszyczek or just Święconka in Polish). With proper categories, when we merge them across languages (pl, en, de) we discover the content is about Food and Drink, Easter, and a Polish tradition.
6.7 GETTY VOCABULARY PROGRAM LOD
GVP: well-known and respected in GLAM. See GVP Training Materials
Dependencies: AAT-TGN-ULAN-CONA
Center of LODLAM cloud (diagram by J. Cobb, 2014)
6.7.1 GVP LOD RELEASES
Publicized in blog posts by J. Cuno, head of the Getty Trust
AAT 2014-02, TGN 2014-08, ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc.)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc.
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J. and Isaac A. International Journal on Digital Libraries, August 2015, Springer
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD: Ontologies and Semantic Representation, V. Alexiev, CIDOC 2014
External Ontologies:
Prefix   Ontology                               Used for
bibo     Bibliography Ontology                  Sources
dc       Dublin Core Elements                   common
dct      Dublin Core Terms                      common
foaf     Friend of a Friend ontology            Contributors
iso      ISO 25964 (latest on thesauri)         iso:ThesaurusArray, BTG/BTP/BTI
owl      Web Ontology Language                  Basic RDF representation
prov     Provenance Ontology                    Revision history
rdf      Resource Description Framework         Basic RDF representation
rdfs     RDF Schema                             Basic RDF representation
schema   Schema.org                             common, geo (TGN), bio (ULAN)
skos     Simple Knowledge Organization System   Basic vocabulary representation
skosxl   SKOS Extension for Labels              Rich labels
wgs      W3C World Geodetic Survey geo          Geo (TGN)
xsd      XML Schema Datatypes                   Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™
Key values can be mapped to: custom sub-class, custom (sub-)prop, ontology value (e.g. <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty, self-inverse)
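What inverse and symmetric declarations buy you can be shown with a tiny materialization step: for every asserted relation, the reverse direction is added automatically. The property names under the `ex:` prefix are hypothetical, not actual GVP relations, and a real triple store would do this through OWL reasoning rather than hand-written code.

```python
# Hypothetical relation table: each property maps to its inverse.
INVERSE_OF = {
    "ex:teacherOf": "ex:studentOf",
    "ex:studentOf": "ex:teacherOf",
}
# Hypothetical symmetric (self-inverse) properties.
SYMMETRIC = {"ex:friendOf"}


def with_inverses(triples):
    """Return the triples plus the reverse statement implied by each one."""
    result = set(triples)
    for s, p, o in triples:
        if p in SYMMETRIC:
            result.add((o, p, s))
        elif p in INVERSE_OF:
            result.add((o, INVERSE_OF[p], s))
    return result


asserted = {
    ("ex:A", "ex:teacherOf", "ex:B"),
    ("ex:A", "ex:friendOf", "ex:C"),
}
closed = with_inverses(asserted)
```

With the reverse statements in place, a query can follow the relation from either end without knowing which direction was originally recorded.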
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies: Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, e.g. Year Joined UN (TGN), Pope Reign Durations (ULAN)
6610 GLAMS WORKING WITH WIKIDATA
Why should GLAMs bother about Wikidata Because it gives an excellent way toconnect and expose your collection data to a multilingual audience
Recommendation 1 For every Europeana project considering the possiblebene ts of a Wikimedia component should be default behaviorRecommendation 7 Make Wikidata a central element of Europeanas portal toplatform strategyRecommendation 8 Europeana should continue to invest in technology thatimproves the interoperability between GLAMs and Wikimedia platforms
easily add content about a colorful traditionblessing of the baskets (swiecenie koszyczek or just Święconka in Polish) Withproper cats when we merge them across languages (pl en de) we discover thecontent is about Food and Drink Easter and a Polish tradition
Europeana Wikimedia Taskforce report
GLAMs Working with Wikidata
67 GETTY VOCABULARY PROGRAM LODGVP well-known and respected in GLAM Dependencies AAT-TGN-ULAN-CONACenter of LODLAM cloud (Diagram by JCobb 2014)GVP Training Materials
671 GVP LOD RELEASES
Publicized in blog posts by JCuno head ofthe Getty TrustAAT 2014-02 TGN 2014-08 ULAN 2015-03
6.7.2 ONTOTEXT SCOPE OF WORK
Semantic/ontology development: http://vocab.getty.edu/ontology
Contributed to the ISO 25964 ontology (latest standard on thesauri). Provided implementation experience, suggestions and fixes
Complete mapping specification
Help implement R2RML scripts working off Getty's Oracle database; contribution to Perl implementation (RDB2RDF); R2RML extension (rrx:languageColumn)
Work with a wide External Reviewers group (people from OCLC, Europeana, ISO 25964 working group, etc)
GraphDB semantic repo, clustered for high availability
Semantic application development (customized Forest user interface) and tech consulting
SPARQL 1.1 compliant endpoint: http://vocab.getty.edu/sparql
Comprehensive documentation (100 pages): http://vocab.getty.edu/doc
Sample queries (100), including charts, geographic queries, etc
Per-entity export files, explicit/total data dumps. Many formats: RDF, Turtle, NTriples, JSON, JSON-LD
Help desk / support on twitter and google group (see home page)
Presentations, papers:
On the composition of ISO 25964 hierarchical relations (BTG, BTP, BTI). Alexiev V., Lindenthal J., and Isaac A. International Journal on Digital Libraries, August 2015, Springer
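As a rough illustration of how a client might talk to a SPARQL 1.1 endpoint like the one listed above, here is a minimal Python sketch. It only builds the HTTP request (query passed in the `query` parameter, JSON results requested via the Accept header) and does not send it. The query itself is an assumption for illustration: it uses only generic SKOS properties and assumes that `http://vocab.getty.edu/aat/` is the AAT concept scheme URI; it is not one of the 100 sample queries.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://vocab.getty.edu/sparql"

# Illustrative query (assumption): a few AAT concepts with English labels,
# using only generic SKOS properties.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept skos:inScheme <http://vocab.getty.edu/aat/> ;
           skos:prefLabel ?label .
  FILTER(langMatches(lang(?label), "en"))
} LIMIT 5
"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_request(ENDPOINT, QUERY)
print(req.full_url[:60] + "...")
```

To actually run the query one would pass `req` to `urllib.request.urlopen` and parse the JSON body; that step is left out so the sketch works offline.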
6.7.3 COMPLETE REPRESENTATION OF ALL GVP INFO
See GVP LOD Ontologies and Semantic Representation, V.Alexiev, CIDOC 2014
External Ontologies

Prefix | Ontology | Used for
bibo | Bibliography Ontology | Sources
dc | Dublin Core Elements | common
dct | Dublin Core Terms | common
foaf | Friend of a Friend ontology | Contributors
iso | ISO 25964 (latest on thesauri) | iso:ThesaurusArray, BTG/BTP/BTI
owl | Web Ontology Language | Basic RDF representation
prov | Provenance Ontology | Revision history
rdf | Resource Description Framework | Basic RDF representation
rdfs | RDF Schema | Basic RDF representation
schema | Schema.org | common, geo (TGN), bio (ULAN)
skos | Simple Knowledge Organization System | Basic vocabulary representation
skosxl | SKOS Extension for Labels | Rich labels
wgs | W3C World Geodetic Survey geo | Geo (TGN)
xsd | XML Schema Datatypes | Basic RDF representation
6.7.4 GVP SEMANTIC REPRESENTATION (1)
6.7.5 GVP SEMANTIC REPRESENTATION (2)
6.7.6 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generation™. Key values can be mapped to:
Custom sub-class
Custom (sub-)prop
Ontology Value (eg <term/kind/Abbreviation>)
6.7.7 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology Generation™
Relations come in owl:inverseOf pairs (or owl:SymmetricProperty: self-inverse)
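The idea of generating relation definitions from a spreadsheet can be sketched roughly as follows. This is a minimal Python illustration, not the actual GVP generator: the CSV columns, the `ex:` prefix and the relation names are all hypothetical. Each row names a relation and its inverse; a row whose two names are equal is treated as self-inverse.

```python
import csv
import io

# Hypothetical spreadsheet export: relation name, inverse relation name.
SHEET = """relation,inverse
distinguished_from,distinguished_from
teacher_of,student_of
"""


def relation_turtle(sheet_text, prefix="ex"):
    """Emit Turtle lines declaring inverse pairs or symmetric properties."""
    lines = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        rel, inv = row["relation"], row["inverse"]
        if rel == inv:
            # Self-inverse relation: one symmetric property
            lines.append(f"{prefix}:{rel} a owl:ObjectProperty, owl:SymmetricProperty .")
        else:
            # Ordinary relation: declare both directions
            lines.append(f"{prefix}:{rel} a owl:ObjectProperty ; owl:inverseOf {prefix}:{inv} .")
            lines.append(f"{prefix}:{inv} a owl:ObjectProperty ; owl:inverseOf {prefix}:{rel} .")
    return lines


turtle_lines = relation_turtle(SHEET)
print("\n".join(turtle_lines))
```

Keeping the relation list in a spreadsheet means domain experts can review it without reading RDF, and both directions of every pair are declared consistently.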
6.7.8 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
6.7.9 COMPREHENSIVE DOCUMENTATION
Getty Vocabularies Linked Open Data: Semantic Representation. Alexiev V., Cobb J., Garcia G., Harpring P. Getty Research Institute, 3.2 edition, March 2015
6.7.10 SAMPLE QUERIES (100), INTEGRATED UI
Some charts, eg Year Joined UN (TGN), Pope Reign Durations (ULAN)
6.7.11 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search
Many described in Getty Vocabs: Why LOD? Why Now?, J.Cobb, 2014. Eg:
AAT used in Cataloging Calculator, which finds bibliographic and authority data: language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc
6.7.12 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels
6.8 J.P.GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples like:
6.8.1 J.P.GETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query, shown as image grid:
6.9 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping
Work ongoing at https://github.com/american-art, eg see NPG mapping issues. Eg possible mapping of (sculpture) "Cast after":
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V.Alexiev, L.Brazzo. CIDOC Congress, 2016
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database
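The expansion step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names come from the sample record above, while the output structure, the `role` labels and the two inferences (spouse of a male is likely female; spouse's first name plus maiden name is a likely alternative name) are hypothetical choices, not EHRI's actual rules. Matching against other Person records is not shown.

```python
REC = {
    "id": 123456,
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}

# Source field -> role of the mentioned person relative to the record subject
ROLE_FIELDS = {"nameSpouse": "spouse", "nameChild": "child", "nameSibling": "sibling"}


def expand_record(rec):
    """Create one person dict for the subject and one per person mentioned."""
    subject_name = f'{rec["firstName"]} {rec["lastName"]}'
    people = [{"name": subject_name, "role": "subject",
               "gender": rec.get("gender"), "altNames": []}]
    for field, role in ROLE_FIELDS.items():
        if field in rec:
            people.append({"name": rec[field], "role": role,
                           "gender": None, "altNames": [],
                           "relatedTo": subject_name})
    # Likely (not certain) inferences about the spouse
    for person in people:
        if person["role"] != "spouse":
            continue
        if rec.get("gender") == "Male":
            person["gender"] = "Female"
        maiden = rec.get("nameSpouseMaiden")
        if maiden:
            first = person["name"].split()[0]
            person["altNames"].append(f"{first} {maiden}")
    return people


people = expand_record(REC)
for p in people:
    print(p["role"], p["name"], p["gender"], p["altNames"])
```

The alternative names matter for the later matching step: the spouse may appear elsewhere in the database under her maiden name.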
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed
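Why matching to a shared gazetteer also deduplicates can be shown with a toy Python sketch. It assumes simple exact matching on normalized names (lower-cased, diacritics stripped), which is far cruder than a real pipeline; the gazetteer ids and source place ids below are made up for illustration and are not real Geonames or USHMM identifiers.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Lower-case, strip combining diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def match_places(source_places, gazetteer):
    """Group source place ids by the gazetteer id their name matches."""
    index = {normalize(name): gid for name, gid in gazetteer.items()}
    matched = defaultdict(list)
    unmatched = []
    for source_id, name in source_places:
        gid = index.get(normalize(name))
        if gid is None:
            unmatched.append(source_id)
        else:
            matched[gid].append(source_id)
    return dict(matched), unmatched


GAZETTEER = {"Kraków": 1001, "Warszawa": 1002}  # made-up ids
SOURCE = [("p1", "Krakow"), ("p2", "KRAKÓW "), ("p3", "Warszawa"),
          ("p4", "Unknown Village")]

matched, unmatched = match_places(SOURCE, GAZETTEER)
print(matched, unmatched)
```

Source places `p1` and `p2` end up under the same gazetteer id, so they are recognized as duplicates of each other as a side effect of matching.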
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments

guard | Cos dist | punishment | Cos dist
guarding | 0.593507 | punishments | 0.668144
sentry | 0.512083 | punish | 0.601212
hlinka | 0.496201 | punishing | 0.543213
gate | 0.490032 | beatings | 0.527033
watching | 0.484647 | penalty | 0.497262
rifle | 0.484379 | deserved | 0.490157
lookout | 0.482025 | beaten | 0.473870
patrol | 0.477233 | straf | 0.473338
soldier | 0.475982 | offense | 0.461230
guarded | 0.474689 | executing | 0.459965
police | 0.474291 | merciless | 0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
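The two operations behind these results, nearest neighbours by cosine score and vector arithmetic on word embeddings, can be sketched in plain Python. The 4-dimensional vectors below are invented toy values chosen so that the arithmetic works out; they are not the embeddings trained on the interviews. Note that the "Cos dist" scores above appear to be cosine similarities (higher means closer), which is how word2vec tooling usually prints them.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy vectors (assumption); dimensions: soviet, nazi, organization, person
TOY = {
    "Stalin": [1.0, 0.0, 0.0, 1.0],
    "Hitler": [0.0, 1.0, 0.0, 1.0],
    "KGB":    [1.0, 0.0, 1.0, 0.0],
    "SS":     [0.0, 1.0, 1.0, 0.0],
    "guard":  [0.2, 0.2, 0.5, 0.5],
}

result = analogy(TOY, "KGB", "Stalin", "Hitler")
print(result)
```

With real embeddings the same arithmetic is done over a few hundred dimensions, and the input words are excluded from the candidates in the same way.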
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK Portable Antiquities Scheme)
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, eg Art of Devastation: Medallic Art of the Great War
6.13.4 NOMISMA
Shared authorities for numismatics. Eg a mint:
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA)
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed
Blog, Wiki
673 COMPLETE REPRESENTATION OF ALL GVP INFO
See VAlexiev CIDOC 2014External Ontologies
GVP LOD Ontologies and Semantic Representation
Pre x Ontology Used forbibo Bibliography Ontology Sourcesdc Dublin Core Elements commondct Dublin Core Terms commonfoaf Friend of a Friend ontology Contributorsiso ISO 25946 (latest on thesauri) isoThesaurusArray BTGBTPBTIowl Web Ontology Language Basic RDF representationprov Provenance Ontology Revision historyrdf Resource Description Framework Basic RDF representationrdfs RDF Schema Basic RDF representationschema Schemaorg common geo (TGN) bio (ULAN)skos Simple Knowledge Organization System Basis vocabulary representationskosxl SKOS Extension for Labels Rich labelswgs W3C World Geodetic Survey geo Geo (TGN)xsd XML Schema Datatypes Basic RDF representation
674 GVP SEMANTIC REPRESENTATION (1)
675 GVP SEMANTIC REPRESENTATION (2)
676 KEY VALUES (FLAGS) ARE IMPORTANT
Excel-driven Ontology Generationtrade Key val can be mapped to Custom sub-classCustom (sub-)prop Ontology Value (eg lttermkindAbbreviationgt)
677 ASSOCIATIVE RELATIONS ARE VALUABLE
More Excel-driven Ontology GenerationtradeRelations come in owlinverseOf pairs (or owlSymmetricProperty self-inverse)
678 INVOLVED INFERENCE OF HIERARCHICAL RELATIONS
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs many in Collection Management and SearchMany described in JCobb 2014 EgGetty Vocabs Why LOD Why Now
AAT used in nds bibliographic and authority data languagecodes geographic area codes publication country codes AACR2 abbreviations LCmain entry Cutter numbers AAT concepts etc
Cataloging Calculator
6712 AAT IN EUROPEANA
typesubjectmaterial eldsPartagePlus matched Art Nuveau candidate concepts to AAT enriched labelsEuropeana uses AAT to enrich
68 JPGETTY MUSEUMWorking with JPGM on publishing LOD Considering CIDOC CRM maybe also simplerontologies Hoping to generate R2RML from instance examples like
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata WD has 480 Getty paintings but the Museum has180k artworks WD query shown as image grid
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName “John”, lastName “Smith”, gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden “Matienzo”, nameSpouse “Maria Smith”, nameChild “Mike Smith”, nameSibling “Jack Jones”.
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
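A minimal sketch of that idea follows, using the sample record above. The `Person` structure, the relation names (`spouseOf`, `childOf`, `siblingOf`) and the two inference rules are assumptions made for illustration, not the EHRI data model; the inferences are only likely, never certain.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    gender: str = "Unknown"
    alt_names: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation, related person's name)


# Hypothetical mapping: source field -> relation of the new person to the subject.
RELATION_FIELDS = {
    "nameSpouse": "spouseOf",
    "nameChild": "childOf",
    "nameSibling": "siblingOf",
}


def derive_persons(rec):
    """Create a Person for the record's subject and for each person it mentions."""
    subject = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender", "Unknown"))
    persons = [subject]
    for field_name, relation in RELATION_FIELDS.items():
        name = rec.get(field_name)
        if not name:
            continue
        person = Person(name, relations=[(relation, subject.name)])
        if field_name == "nameSpouse":
            # Likely inference 1: in a historical marriage record the spouse
            # of a male subject is probably female.
            if subject.gender == "Male":
                person.gender = "Female"
            # Likely inference 2: first name plus maiden name is an alternative name.
            maiden = rec.get("nameSpouseMaiden")
            if maiden:
                person.alt_names.append(f"{name.split()[0]} {maiden}")
        persons.append(person)
    return persons


REC = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05", "nameSpouseMaiden": "Matienzo",
    "nameSpouse": "Maria Smith", "nameChild": "Mike Smith",
    "nameSibling": "Jack Jones",
}

if __name__ == "__main__":
    for p in derive_persons(REC):
        print(p)
```

The derived records (with alternative names and relations) then become extra evidence when matching against other Person records.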
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A pipeline for matching Geonames places in free text was also developed.
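To show how matching against a gazetteer also yields deduplication, here is a small sketch based on string similarity from the Python standard library. It is not the EHRI pipeline: the gazetteer entries use made-up identifiers (not real Geonames ids), and the 0.85 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher

# Tiny stand-in gazetteer: id -> known name variants.
# The ids are hypothetical placeholders, NOT real Geonames identifiers.
GAZETTEER = {
    "gn:0001": ["Warsaw", "Warszawa", "Warschau"],
    "gn:0002": ["Krakow", "Kraków", "Cracow"],
    "gn:0003": ["Lviv", "Lwów", "Lemberg"],
}


def normalize(name):
    """Case-fold and trim a place name before comparison."""
    return name.casefold().strip()


def best_match(place, gazetteer, threshold=0.85):
    """Return (id, score) of the most similar gazetteer name, or (None, score)."""
    best_id, best_score = None, 0.0
    for gid, names in gazetteer.items():
        for candidate in names:
            score = SequenceMatcher(None, normalize(place), normalize(candidate)).ratio()
            if score > best_score:
                best_id, best_score = gid, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


def deduplicate(places, gazetteer):
    """Group source place names by the gazetteer entry they match."""
    clusters = {}
    for place in places:
        gid, _ = best_match(place, gazetteer)
        clusters.setdefault(gid, []).append(place)
    return clusters


if __name__ == "__main__":
    print(deduplicate(["Warschau", "warszawa", "Cracow ", "Atlantis"], GAZETTEER))
```

Source names that land on the same gazetteer id are duplicates of each other; names under the `None` key found no match and need manual review.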
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews:
ONTO: Place enrichment, Person name recognition
INRIA: word2vec experiments
guard     Cos dist   punishment   Cos dist
guarding  0.593507   punishments  0.668144
sentry    0.512083   punish       0.601212
hlinka    0.496201   punishing    0.543213
gate      0.490032   beatings     0.527033
watching  0.484647   penalty      0.497262
rifle     0.484379   deserved     0.490157
lookout   0.482025   beaten       0.473870
patrol    0.477233   straf        0.473338
soldier   0.475982   offense      0.461230
guarded   0.474689   executing    0.459965
police    0.474291   merciless    0.455123
Semantic differencing (interesting): KGB - Stalin + Hitler = SS
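For readers unfamiliar with word2vec, the two operations behind these results (nearest neighbours by cosine score, and vector arithmetic of the form a - b + c) can be sketched in a few lines. The 3-dimensional vectors below are toy values invented for illustration, and the well-known king/man/woman example is swapped in for the slide's own; real models are trained on the corpus and use hundreds of dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (higher means more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def nearest(vec, vectors, exclude=(), k=3):
    """Return the k words whose vectors score highest against vec."""
    scored = [(cosine(vec, v), w) for w, v in vectors.items() if w not in exclude]
    return [(w, round(s, 3)) for s, w in sorted(scored, reverse=True)[:k]]


def analogy(a, b, c, vectors):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c}, k=1)[0][0]


# Toy vectors (invented): axes roughly mean royalty, maleness, femaleness.
TOY = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "child": [0.1, 0.5, 0.5],
}

if __name__ == "__main__":
    print(nearest(TOY["king"], TOY, exclude={"king"}))
    print(analogy("king", "man", "woman", TOY))
```

The input words are excluded from the candidates, as is usual for analogy queries, since the result vector otherwise tends to stay closest to one of them.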
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing them to Geonames, so we can get coordinates.
611 OTHERS' PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia through VIAF id. Time and nationality from ULAN.
612 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
613 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6133 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6134 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6135 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6136 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius…, rendered in a chart using d3.js.
6137 KERAMEIKOS POTTERY LOD
Kerameikos Project editor. Based on XForms, leverages Getty and BM LOD.
6138 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.
679 COMPREHENSIVE DOCUMENTATION
Alexiev V Cobb JGarcia G Harpring P Getty Research Institute 32 edition March 2015Getty Vocabularies Linked Open Data Semantic Representation
6710 SAMPLE QUERIES (100) INTEGRATED UI
Some charts eg Year Joined UN (TGN) Pope Reign Durations (ULAN)
6711 GVP VOCABS USAGE
Collected about 100 usages of the vocabs, many in Collection Management and Search.
Many described in Getty Vocabs: Why LOD? Why Now? (J. Cobb, 2014).
E.g. AAT used in Cataloging Calculator: finds bibliographic and authority data, language codes, geographic area codes, publication country codes, AACR2 abbreviations, LC main entry Cutter numbers, AAT concepts, etc.
6712 AAT IN EUROPEANA
Europeana uses AAT to enrich type/subject/material fields.
PartagePlus matched Art Nouveau candidate concepts to AAT, enriched labels.
68 J.P. GETTY MUSEUM
Working with JPGM on publishing LOD. Considering CIDOC CRM, maybe also simpler ontologies. Hoping to generate R2RML from instance examples.
681 JPGETTY MUSEUM AND WIKIDATA
Discussing making data for Wikidata. WD has 480 Getty paintings, but the Museum has 180k artworks. WD query shown as image grid.
69 AMERICAN ART COLLABORATIVE
American Art Collaborative: 14 US art museums committed to establishing a critical mass of LOD on the semantic web. Consulting on CRM mapping.
Work ongoing at https://github.com/american-art, e.g. see NPG mapping issues. E.g. possible mapping of (sculpture) Cast after.
610 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019) EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo, CIDOC Congress 2016.
6101 EHRI PERSON NETWORKS
Research question: how did person networks influence the chance of survival? Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05; additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
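The idea above can be sketched in a few lines of Python. This is only an illustration, not the EHRI implementation: the `Person` class, the `RELATION_FIELDS` mapping and the specific inferences (spouse gender, maiden name, marriage date) are assumptions made for the example, and the matching step against the database is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mapping from record fields to the relation they express.
RELATION_FIELDS = {
    "nameSpouse": "spouse",
    "nameChild": "child",
    "nameSibling": "sibling",
}

@dataclass
class Person:
    name: str
    source_rec: str                          # record the person was mentioned in
    relation_to_main: Optional[str] = None   # None for the subject of the record
    gender: Optional[str] = None
    inferred: dict = field(default_factory=dict)  # likely, not certain, facts

def derive_persons(rec: dict) -> list:
    """Create a Person for the record's subject and one per relative mentioned."""
    main = Person(
        name=f'{rec["firstName"]} {rec["lastName"]}',
        source_rec=rec["id"],
        gender=rec.get("gender"),
    )
    persons = [main]
    for field_name, relation in RELATION_FIELDS.items():
        if field_name not in rec:
            continue
        p = Person(name=rec[field_name], source_rec=rec["id"],
                   relation_to_main=relation)
        if relation == "spouse":
            # Likely inferences only; kept apart from stated facts.
            if rec.get("gender") == "Male":
                p.inferred["gender"] = "Female"
            if "nameSpouseMaiden" in rec:
                p.inferred["maidenName"] = rec["nameSpouseMaiden"]
            if "dateMarriage" in rec:
                p.inferred["dateMarriage"] = rec["dateMarriage"]
        persons.append(p)
    return persons

rec = {
    "id": "123456", "firstName": "John", "lastName": "Smith",
    "gender": "Male", "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
for person in derive_persons(rec):
    print(person)
```

Keeping inferred values in a separate `inferred` dictionary means a later matching step can weigh them lower than facts stated in the source record.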
6102 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
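A minimal sketch of how matching to a gazetteer also gives deduplication: source place strings that resolve to the same gazetteer entry collapse into one group. Everything here is an assumption for illustration: the toy gazetteer, its placeholder ids (`toy:1`, `toy:2`, not real Geonames ids) and the exact-match-after-normalization rule. A real pipeline would likely also need fuzzy matching and place hierarchy.

```python
import unicodedata
from collections import defaultdict

# Toy gazetteer: placeholder id -> known names (primary and alternate).
TOY_GAZETTEER = {
    "toy:1": ["Krakow", "Cracow"],
    "toy:2": ["Terezin", "Theresienstadt"],
}

def normalize(name: str) -> str:
    """Strip diacritics, case and extra whitespace so variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())

def build_index(gazetteer: dict) -> dict:
    """Map every normalized name to its gazetteer id."""
    index = {}
    for gid, names in gazetteer.items():
        for n in names:
            index[normalize(n)] = gid
    return index

def match_places(source_names: list, gazetteer: dict):
    """Return (id -> source strings matched to it, list of unmatched strings)."""
    index = build_index(gazetteer)
    matched = defaultdict(list)
    unmatched = []
    for name in source_names:
        gid = index.get(normalize(name))
        if gid is None:
            unmatched.append(name)
        else:
            matched[gid].append(name)
    return dict(matched), unmatched

matched, unmatched = match_places(
    ["Kraków", "CRACOW ", "Theresienstadt", "Atlantis"], TOY_GAZETTEER
)
print(matched)    # variants of the same place are grouped under one id
print(unmatched)  # strings left for manual review or fuzzy matching
```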
6103 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment    Cos dist
guarding   0.593507    punishments   0.668144
sentry     0.512083    punish        0.601212
hlinka     0.496201    punishing     0.543213
gate       0.490032    beatings      0.527033
watching   0.484647    penalty       0.497262
rifle      0.484379    deserved      0.490157
lookout    0.482025    beaten        0.473870
patrol     0.477233    straf         0.473338
soldier    0.475982    offense       0.461230
guarded    0.474689    executing     0.459965
police     0.474291    merciless     0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
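To make the two operations concrete, here is a small pure-Python sketch of cosine similarity and of the "a - b + c" vector arithmetic behind semantic differencing. The three-dimensional vectors are made up for the example and are not the INRIA embeddings. The values in the table appear to be cosine similarities (higher means closer), which is an assumption about how the column labelled "Cos dist" was produced.

```python
import math

# Toy vectors, invented for illustration: (organization, Soviet, German).
TOY_VECTORS = {
    "KGB":    [1.0, 1.0, 0.0],
    "Stalin": [0.0, 1.0, 0.0],
    "Hitler": [0.0, 0.0, 1.0],
    "SS":     [1.0, 0.0, 1.0],
    "guard":  [0.5, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(vectors, query, exclude=(), k=3):
    """Words ranked by cosine similarity to the query vector."""
    scored = [(cosine(vec, query), word)
              for word, vec in vectors.items() if word not in exclude]
    return [word for _, word in sorted(scored, reverse=True)[:k]]

def analogy(vectors, a, b, c):
    """Word closest to a - b + c, excluding the three input words."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(vectors, query, exclude={a, b, c}, k=1)[0]

print(analogy(TOY_VECTORS, "KGB", "Stalin", "Hitler"))
```

With real embeddings the same two functions would produce neighbour lists like the table above, but the result of an analogy depends heavily on the corpus and training settings.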
6104 EHRI DISCOVERING CAMPS GHETTOS STALAGS
And referencing to Geonames so we can get coordinates
611 OTHERS PROJECTS WIKIARTHISTORYVienna University of Technology ( )site paper
Art History networks from Wikipedia through VIAF idTime and nationality from ULAN
612 CHARTEXNLP analysis of medieval Charters and Deeds Funded by Digging Into Data cross-country SSH funding initiative Visualized with BRAT
613 NUMISMATICSMy good friend at the American Numismatic Society has developed a hostof amazing software that uses and produces LOD
Ethan Gruber
Numishare Data platform for coinsmedals 100k coin typesNomisma Shared authorities for numismaticsKerameikos Pottery LODEADitor EAD Editor based on XML amp XForms usesproduces LODxEAC EACCPF Editor based on XML amp XForms usesproduces LOD
6131 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican cointype Below examples of this type in partner collections
6132 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius blue dots for mints heatmap of nds (a lot in the UKPortable Antiquities Scheme)
6133 NUMISHARE
Data platform with over 100k coin types Powers custom collections eg Medalic Art of the Great War
Art ofDevastation
6134 NOMISMA
Shared authorities for numismatics Eg a mint
6135 COINHOARDS
Greek coin data provided by Geo mapping data provided by Below reference to the coin in an archival notebook (linked via OA)
CoinHoardsorgnomismaorg
6136 STATISTICAL CHARTS
Denominations issued by Augustus Tiberiushellip rendered in a chart using d3js
6137 KERAMEIKOS POTTERY LOD
editor Based on XForms leverages Getty and BM LODKerameikos Project
6138 EADITOR AND XEAC
Based on XForms Leverages the Getty thesauri and VIAF imports data asneededBlog Wiki
6.10 EUROPEAN HOLOCAUST RESEARCH INFRASTRUCTURE
EHRI is a large-scale EU project that involves 23 Holocaust archives (Europe, Israel and the US), DH and IT organizations.
In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers, based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks.
Semantic Archive Integration for Holocaust Research: the EHRI Research Infrastructure. V. Alexiev, L. Brazzo. CIDOC Congress, 2016.
6.10.1 EHRI PERSON NETWORKS
Research question: how person networks influenced chance of survival. Idea:
Rec 123456: firstName "John", lastName "Smith", gender Male, dateMarriage 1921-01-05
Additional names: nameSpouseMaiden "Matienzo", nameSpouse "Maria Smith", nameChild "Mike Smith", nameSibling "Jack Jones"
We can create Person records for the people mentioned, make some likely inferences, then try to match to other Person records in the database.
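The idea can be sketched in a few lines of Python. This is only an illustration, not the EHRI implementation: the `Person` class, the relation names (`spouseOf`, `parentOf` etc.) and the gender inference rule are assumptions made for the example. Only the field names and values come from the record above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    name: str
    gender: Optional[str] = None
    maiden_name: Optional[str] = None
    # (relation, name of the related person); relation names are hypothetical
    relations: list = field(default_factory=list)


# Source field -> (relation of the main person to the mentioned one, inverse)
RELATIONS = {
    "nameSpouse": ("spouseOf", "spouseOf"),
    "nameChild": ("parentOf", "childOf"),
    "nameSibling": ("siblingOf", "siblingOf"),
}


def derive_persons(rec):
    """Create a Person for the record holder and for each person mentioned."""
    main = Person(f"{rec['firstName']} {rec['lastName']}", rec.get("gender"))
    persons = [main]
    for fld, (rel, inverse) in RELATIONS.items():
        if fld not in rec:
            continue
        other = Person(rec[fld])
        main.relations.append((rel, other.name))
        other.relations.append((inverse, main.name))
        if fld == "nameSpouse":
            other.maiden_name = rec.get("nameSpouseMaiden")
            # Assumed "likely inference": the spouse has the opposite gender
            if main.gender == "Male":
                other.gender = "Female"
            elif main.gender == "Female":
                other.gender = "Male"
        persons.append(other)
    return persons


record = {
    "firstName": "John", "lastName": "Smith", "gender": "Male",
    "dateMarriage": "1921-01-05",
    "nameSpouseMaiden": "Matienzo", "nameSpouse": "Maria Smith",
    "nameChild": "Mike Smith", "nameSibling": "Jack Jones",
}
persons = derive_persons(record)
for p in persons:
    print(p)
```

Each derived Person can then be compared against existing Person records; the extra attributes (maiden name, inferred gender, relations) give the matcher more to work with than a bare name.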
6.10.2 EHRI LARGE-SCALE PLACE MATCHING
Match USHMM places to Geonames, also achieving deduplication. A Geonames matching pipeline in free text was also developed.
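A toy sketch of how matching and deduplication go together, assuming a tiny in-memory gazetteer. The ids `G1` and `G2` are placeholders (not real Geonames ids), and the normalization and fuzzy cutoff are illustrative choices, not the method used in EHRI.

```python
import difflib
import unicodedata


def normalize(name):
    """Lowercase, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())


# Toy gazetteer: placeholder id -> known name variants
GAZETTEER = {
    "G1": ["Kraków", "Cracow", "Krakau"],
    "G2": ["Lviv", "Lwów", "Lemberg"],
}
INDEX = {normalize(n): gid for gid, names in GAZETTEER.items() for n in names}


def match_place(name, cutoff=0.85):
    """Return the gazetteer id for a place name, or None if nothing is close."""
    key = normalize(name)
    if key in INDEX:
        return INDEX[key]
    close = difflib.get_close_matches(key, list(INDEX), n=1, cutoff=cutoff)
    return INDEX[close[0]] if close else None


def deduplicate(names):
    """Group name variants by the gazetteer entry they match."""
    groups = {}
    for n in names:
        groups.setdefault(match_place(n), []).append(n)
    return groups


print(deduplicate(["Kraków", "Cracow", "Lemberg", "Atlantis"]))
```

Variants that resolve to the same gazetteer entry end up in one group, which is where the deduplication comes from; unmatched names land under `None` for manual review.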
6.10.3 EHRI ORAL HISTORY INTERVIEWS
Analyze 25k OH Interviews
ONTO: place enrichment, person name recognition
INRIA: word2vec experiments

guard      Cos dist    punishment   Cos dist
guarding   0.593507    punishments  0.668144
sentry     0.512083    punish       0.601212
hlinka     0.496201    punishing    0.543213
gate       0.490032    beatings     0.527033
watching   0.484647    penalty      0.497262
rifle      0.484379    deserved     0.490157
lookout    0.482025    beaten       0.473870
patrol     0.477233    straf        0.473338
soldier    0.475982    offense      0.461230
guarded    0.474689    executing    0.459965
police     0.474291    merciless    0.455123

Semantic differencing (interesting): KGB - Stalin + Hitler = SS
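For readers new to word2vec, here is a minimal sketch of the two operations behind the slide: ranking words by cosine score, and vector arithmetic ("semantic differencing"). The 3-dimensional vectors are hand-made for illustration and do not come from any trained model; real word2vec vectors have hundreds of dimensions and are learned from the corpus.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(target, vectors, exclude=()):
    """Rank words by cosine score against the target vector, best first."""
    scored = [(cosine(target, vec), word)
              for word, vec in vectors.items() if word not in exclude]
    return sorted(scored, reverse=True)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    return nearest(target, vectors, exclude={a, b, c})[0][1]


# Hand-made toy vectors; axes are [organization, leader, Soviet(+1)/German(-1)]
VECTORS = {
    "KGB": [1.0, 0.0, 1.0],
    "Stalin": [0.0, 1.0, 1.0],
    "Hitler": [0.0, 1.0, -1.0],
    "SS": [1.0, 0.0, -1.0],
    "guard": [0.6, 0.2, 0.0],
}

print(analogy("KGB", "Stalin", "Hitler", VECTORS))
```

With these toy vectors the arithmetic lands exactly on the SS vector. In a trained model the result is only the nearest neighbour of the computed point, which is why the slide lists scores well below 1.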
6.10.4 EHRI DISCOVERING CAMPS, GHETTOS, STALAGS
And referencing to Geonames, so we can get coordinates.
6.11 OTHER PROJECTS: WIKIARTHISTORY
Vienna University of Technology (site, paper)
Art History networks from Wikipedia, through VIAF id
Time and nationality from ULAN
6.12 CHARTEX
NLP analysis of medieval Charters and Deeds. Funded by Digging Into Data, a cross-country SSH funding initiative. Visualized with BRAT.
6.13 NUMISMATICS
My good friend Ethan Gruber at the American Numismatic Society has developed a host of amazing software that uses and produces LOD:
Numishare: Data platform for coins/medals, 100k coin types
Nomisma: Shared authorities for numismatics
Kerameikos: Pottery LOD
EADitor: EAD Editor based on XML & XForms, uses/produces LOD
xEAC: EAC-CPF Editor based on XML & XForms, uses/produces LOD
6.13.1 COINS IN TIME AND SPACE
Spatiotemporal distribution of hoards containing a particular Roman Republican coin type. Below: examples of this type in partner collections.
6.13.2 GEOGRAPHIC DISTRIBUTION
Distribution of the Roman denarius: blue dots for mints, heatmap of finds (a lot in the UK: Portable Antiquities Scheme).
6.13.3 NUMISHARE
Data platform with over 100k coin types. Powers custom collections, e.g. Art of Devastation (Medallic Art of the Great War).
6.13.4 NOMISMA
Shared authorities for numismatics. E.g. a mint.
6.13.5 COINHOARDS
Greek coin data provided by CoinHoards.org. Geo mapping data provided by nomisma.org. Below: reference to the coin in an archival notebook (linked via OA).
6.13.6 STATISTICAL CHARTS
Denominations issued by Augustus, Tiberius… rendered in a chart using d3.js.
6.13.7 KERAMEIKOS POTTERY LOD
Kerameikos Project editor: based on XForms, leverages Getty and BM LOD.
6.13.8 EADITOR AND XEAC
Based on XForms. Leverages the Getty thesauri and VIAF, imports data as needed. Blog, Wiki.