DIVE and DSS: Newspapers as Big Data (Kranten als Big Data)

Post on 15-Jul-2015


Dutch Ships and Sailors

Victor de Boer - v.de.boer@vu.nl

Digitale historische kranten als big data (Digital historical newspapers as big data), 24-3-2015

DIVE

Dutch Ships and Sailors

Victor de Boer, Matthias van Rossum, Jur Leinenga, Rik Hoekstra

With input from Andrea Bravo Balado and Robin Ponstein

Netherlands Institute for Sound and Vision / VU University Amsterdam v.de.boer@vu.nl

The Problem: ((Maritime) historical) data is not integrated

25+ Maritime datasets; Heterogeneous

The solution

Well, Linked Data obviously!

KB Delpher

Dutch-Asiatic Shipping (DAS) – Voyages (Huygens ING)

“VOC Opvarenden” – Mustering and payroll information (DANS Easy)

Dutch Ships and Sailors

[Figure: the DSS data cloud]

Datasets: DAS, GZMVOC, MDB, VOCOPV Begunstigden, VOCOPV Soldijboeken, VOCOPV Opvarenden

External vocabularies: PROV, AAT, foaf

Link types: owl:sameAs, dss:hasKBLink, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch

Links to original scans

Linking to Historical newspapers

• Use ML to detect links between ships and historical newspaper articles (delpher.nl)

– Features: ship name, time intervals, captain’s names, ship type, named entities, keywords, background knowledge

• 179,120 links

- Andrea Bravo Balado
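
The feature list above can be illustrated with a minimal sketch of how one candidate (ship, article) pair might be turned into features for a classifier. This is not the method used in the project: the class names, the four features chosen, and the activity dates of the ship are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Ship:
    name: str
    captain: str
    ship_type: str
    active_from: date   # hypothetical start of the ship's known activity
    active_until: date  # hypothetical end of the ship's known activity


@dataclass
class Article:
    text: str
    published: date


def pair_features(ship: Ship, article: Article) -> dict:
    """Turn one (ship, article) candidate pair into boolean features."""
    text = article.text.lower()
    return {
        "name_in_text": ship.name.lower() in text,
        "captain_in_text": ship.captain.lower() in text,
        "type_in_text": ship.ship_type.lower() in text,
        "date_in_interval": ship.active_from <= article.published <= ship.active_until,
    }


# The dates below are invented; the slides do not give them.
ship = Ship("Transit", "Schaap", "schoonerschip", date(1840, 1, 1), date(1860, 12, 31))
article = Article(
    "het hier behoorende schoonerschip Transit, kapitein Schaap, "
    "in de Noordzee is gezonken",
    date(1850, 10, 26),
)
features = pair_features(ship, article)
```

A trained model would consume many such feature dictionaries, one per candidate pair, and predict whether the pair is a true link. Exact substring matching is the simplest possible choice here and would miss OCR-damaged names.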

Example

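[HARLINGEN, 24 October.] . «et gestrande Zweedsche schip , waarvan wij ons vorig no. melding maakten , is door de 'eepboot van hier afgebragt en hier binnengede u BiJ die gelegenheid werd ons medegeeeid, dat nog vier vaartuigen op Terschelling aren gestrand. Tevens is het berigt ontvan°e > dat het hier behoorende schoonerschip Transit, kapitein Schaap, in de Noordzee is gezonken, nadat het achterschip was weggeslagen ; een ligtmatroos verloor daarbij het leven. Mede zijn hier drie vreemde schepen met meer en minder zware averij binnengeloopen.

English translation: The stranded Swedish ship that we reported on in our previous issue has been pulled off by the tugboat from here and brought in. On that occasion we were told that four more vessels had stranded on Terschelling. Word has also been received that the schooner Transit of this port, Captain Schaap, has sunk in the North Sea after its stern was knocked away; an ordinary seaman lost his life. In addition, three foreign ships have come in here with more or less heavy damage.

Spoiler alert! It sank in the North Sea.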

Provenance (PROV-O)

• Individual named graphs have provenance information

– Who made it (people/software?)

– Based on what source

– Content confidence

• Matches historical science requirements
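
As a rough illustration of the three provenance questions above, the sketch below attaches provenance statements to a named graph and serialises them as N-Triples. `prov:wasAttributedTo` and `prov:wasDerivedFrom` are real PROV-O properties; the `example.org` IRIs and the `confidence` property are hypothetical stand-ins, since the slides do not say how DSS records confidence.

```python
PROV = "http://www.w3.org/ns/prov#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/"  # hypothetical namespace, not the DSS one


def describe_graph(graph_iri, agent_iri, source_iri, confidence):
    """Return provenance triples about one named graph."""
    return [
        (graph_iri, PROV + "wasAttributedTo", agent_iri),  # who made it
        (graph_iri, PROV + "wasDerivedFrom", source_iri),  # based on what source
        (graph_iri, EX + "confidence", confidence),        # hypothetical property
    ]


def to_ntriples(triples):
    """Serialise triples; str objects become IRIs, floats typed literals."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if isinstance(o, str) else f'"{o}"^^<{XSD_DOUBLE}>'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


provenance = describe_graph(
    EX + "graph/ship-newspaper-links",
    EX + "agent/link-classifier",
    EX + "dataset/das",
    0.8,  # invented value for illustration
)
serialised = to_ntriples(provenance)
```

Because the statements are about the graph IRI rather than about individual triples, a consumer can decide per graph whether its origin and confidence are good enough for a given research question.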

Data analysis and visualisation

Take home

• Linked Data principles are a great fit to digital history requirements

– Heterogeneous models/datasets, light-weight reusable integration

– Multiple levels of normalisation, through separate named graphs

– SW Provenance matches Historical Provenance

• Watch out when you sail your Schooner into the North Sea

DIVE INTO THE EVENT-BASED BROWSING OF LINKED HISTORICAL MEDIA

VICTOR DE BOER, JOHAN OOMEN, OANA INEL, LORA AROYO, ELCO VAN STAVEREN, WERNER HELMICH AND DENNIS DE BEURS

DIGITAL HUMANITIES RESEARCHERS

Media researcher Lars Arve Røssland of the University of Bergen. (Photo: Andreas R. Graven) https://www.flickr.com/photos/drainrat/14779928998/

EXPLORATIVE SEARCH

Erp, M. van; Oomen, J.; Segers, R.; Akker, C. van de; Aroyo, L.; Jacobs, G.; Legêne, S.; Meij, L. van der; Ossenbruggen, J.R. van; Schreiber, G. Automatic Heritage Metadata Enrichment with Historic Events. Museums and the Web 2011. http://www.museumsandtheweb.com/mw2011/papers/automatic_heritage_metadata_enrichment_with_hi

DATA: OPENIMAGES.EU

Open videos Netherlands Institute for Sound and Vision

3000, mostly news broadcasts

DATA: DELPHER.NL

Scans of Radio bulletins (hand annotated)

• 1937 – 1984

• 1.5 million, OCR’ed and processed with named-entity recognition (NER)

ENTITY EXTRACTION

• CROWDTRUTH.ORG

ENTITY EXTRACTION

EVENTS CROWDSOURCING AND LINKING TO CONCEPTS THROUGH CROWDTRUTH.ORG

SEGMENTATION & KEYFRAMES

LINKING EVENTS AND CONCEPTS TO KEYFRAMES

SIMPLE EVENT MODEL (SEM), OPENANNOTATION (OA) AND SKOS

DIVE:MEDIA OBJECT

SEM:EVENT

SEM:PLACE

SEM:TIME

SEM:ACTOR

SKOS:CONCEPT

OA:ANNOTATION

• LINKS TO EUROPEANA (MULTILINGUAL)

• LINKS TO DBPEDIA
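
To make the model above more concrete, here is a minimal sketch of event-based browsing over a handful of triples. The SEM and OpenAnnotation namespaces and the properties `sem:hasActor`, `sem:hasPlace`, `oa:hasTarget` and `oa:hasBody` are real; the `example.org` IRIs are hypothetical, and attaching events to media objects through an annotation is an assumption about the modelling, not something the slides confirm.

```python
SEM = "http://semanticweb.cs.vu.nl/2009/11/sem/"
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/dive/"  # hypothetical namespace

# A tiny invented graph: two events that share an actor, one annotated video.
triples = {
    (EX + "event/1", RDF_TYPE, SEM + "Event"),
    (EX + "event/1", SEM + "hasActor", EX + "actor/a"),
    (EX + "event/1", SEM + "hasPlace", EX + "place/p"),
    (EX + "event/2", RDF_TYPE, SEM + "Event"),
    (EX + "event/2", SEM + "hasActor", EX + "actor/a"),
    (EX + "anno/1", RDF_TYPE, OA + "Annotation"),
    (EX + "anno/1", OA + "hasTarget", EX + "media/video-1"),
    (EX + "anno/1", OA + "hasBody", EX + "event/1"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def media_for_event(event):
    """Media objects targeted by annotations whose body is the event."""
    annotations = subjects(OA + "hasBody", event)
    return {m for a in annotations for m in objects(a, OA + "hasTarget")}


def related_events(event):
    """Other events sharing at least one actor or place with the event."""
    related = set()
    for predicate in (SEM + "hasActor", SEM + "hasPlace"):
        for value in objects(event, predicate):
            related |= subjects(predicate, value)
    related.discard(event)
    return related
```

Starting from one media object, a browser can hop to its events, from there to events with a shared actor or place, and back to other media objects, which is the kind of open-ended exploration the next slide refers to.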

INFINITY OF EXPLORATION

https://www.flickr.com/photos/mibuchat/2774251415

https://www.flickr.com/photos/benjcarson/2451718885

DIGITAL SUBMARINE UI

DEMO

• DIVE.BEELDENGELUID.NL

THANK YOU

https://www.flickr.com/photos/robysaltori/

DIVE.BEELDENGELUID.NL

v.de.boer@vu.nl