CrowdTruth for Digital Hermeneutics

CrowdTruth for Digital Hermeneutics Human-assisted computing for understanding of events in cultural heritage Lora Aroyo http://lora-aroyo.org http://slideshare.net/laroyo @laroyo


Transcript of CrowdTruth for Digital Hermeneutics

Page 1: CrowdTruth for Digital Hermeneutics

CrowdTruth for Digital Hermeneutics

Human-assisted computing for understanding of events in cultural heritage

Lora Aroyo http://lora-aroyo.org http://slideshare.net/laroyo @laroyo

Page 2: CrowdTruth for Digital Hermeneutics

The Gallery of Cornelis van der Geest

Page 3: CrowdTruth for Digital Hermeneutics
Page 4: CrowdTruth for Digital Hermeneutics
Page 5: CrowdTruth for Digital Hermeneutics

DIGITAL HERMENEUTICS

theory of interpretation: the relation of parts to wholes

events as context for the interpretation of online collections

intersection of hermeneutics & Web technology

Page 6: CrowdTruth for Digital Hermeneutics

Enrichment with Events


Page 7: CrowdTruth for Digital Hermeneutics

Events Narrative


Page 8: CrowdTruth for Digital Hermeneutics

Demo Datasets

Rijksmuseum
– 159,860 Artworks
– 71,851 Concepts
– 73,374 Persons

Sound & Vision
– 10,000 Videos
– 172,000 Concepts

Page 9: CrowdTruth for Digital Hermeneutics

Issues

•  How to get (extract) the events?
   – needs to be scalable and automated

•  How to collect ground truth for training?
   – experts don’t do as good a job as they think

Page 10: CrowdTruth for Digital Hermeneutics

What are Events?

events perdure = their parts exist at different time points

objects endure = they have all their parts at all points in time

objects are wholly present at any point in time, events unfold over time

Flickr: vanilllaph

Page 11: CrowdTruth for Digital Hermeneutics

Events are Vague

humans have no clear notion of what events are

Page 12: CrowdTruth for Digital Hermeneutics

Events have Perspectives

and people don’t always agree

Page 13: CrowdTruth for Digital Hermeneutics

“event is a significant happening or gathering of people. I would define a happening as an event if the group of people gathered were united in one common goal.”

We asked the crowd what an EVENT is ...


Page 14: CrowdTruth for Digital Hermeneutics

We asked the crowd what an EVENT is ...

“Event is a happening, which can be scheduled or unscheduled. An earthquake or fire happens (unscheduled). A wedding or birthday party (scheduled). It is an occasion that is unusual and tends to be memorable.”


Page 15: CrowdTruth for Digital Hermeneutics

“An event would be any occurrence where physical action has taken place. It may be a single, momentary instance (I sneezed), or it may span a period of time (the festival ran for four hours). An event may also be made up of a number of smaller events, such as a day at school is an event, but each individual class is also an event itself. Basically an event must have a physical action over any delimited time span.”

We asked the crowd what an EVENT is ...


Page 16: CrowdTruth for Digital Hermeneutics

“A planned public or social get together or occasion.”  

“an event is an incident that's very important or monumental”  

“An event is something occurring at a specific time and/or date to celebrate or recognize a particular occurrence.”  

“a location where something like a function is held. you could tell if something is an event if there people gathering for a purpose.”  

“Event can refer to many things such as: An observable occurrence, phenomenon or an extraordinary occurrence.”  

We asked the crowd what an EVENT is ...


Page 17: CrowdTruth for Digital Hermeneutics

“an event is the exemplification of a property by a substance at a given time”, Jaegwon Kim, 1966

“events are changes that physical objects undergo”, Lawrence Lombard, 1981

“events are properties of spatiotemporal regions”, David Lewis, 1986

under30ceo.com

Page 18: CrowdTruth for Digital Hermeneutics

Gold Standard Assumption

•  Systems need to be told what is right & what is wrong with a gold standard or ground truth

•  Performance is measured on test sets vetted by human experts → never perfect, always improving against test data

•  Historically, gold standards are created assuming that for each annotated instance there is a single right answer

•  Gold standard quality is measured in inter-annotator agreement → does not account for perspectives, for reasonable alternative interpretations

Page 19: CrowdTruth for Digital Hermeneutics

HOW DO WE SCALE & AUTOMATE SOMETHING FOR WHICH THERE IS SO MUCH DISAGREEMENT?

Page 20: CrowdTruth for Digital Hermeneutics

Position

Annotator disagreement is not noise, but signal. Not a problem to overcome but a source of information for machines.

Artificially restricting humans does not help machines to learn.

They will learn better from diversity

Page 21: CrowdTruth for Digital Hermeneutics

Crowd Truth

Annotator disagreement is indicative of the variation in human semantic interpretation of signs, and can indicate ambiguity, vagueness, over-generality, etc.

http://www.freefoto.com/preview/01-47-44/Flock-of-Birds

Page 22: CrowdTruth for Digital Hermeneutics

HOW DO WE COLLECT & REPRESENT DISAGREEMENT SO THAT IT CAN BE HARNESSED?

Page 23: CrowdTruth for Digital Hermeneutics

•  Use crowdsourcing to get multiple perspectives (in the collection)
•  Automatically generate examples (for scalability)
•  Multiple people annotate each example
•  Represent the annotations (the result) in a way that captures the disagreement
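One way to keep the disagreement in the representation is to store a vector per worker over the candidate labels and sum those vectors per example, instead of collapsing to a single majority label. The following is a minimal Python sketch under stated assumptions: the label set, the worker ids, and the helper names `worker_vector` and `sentence_vector` are illustrative and are not taken from the CrowdTruth tooling.

```python
# Hypothetical label set for one annotation task (assumed for illustration).
LABELS = ["ACTION", "MOTION", "ARRIVING_OR_DEPARTING", "PURPOSE"]

def worker_vector(chosen):
    """Binary vector: 1 for every label this worker selected, else 0."""
    return [1 if label in chosen else 0 for label in LABELS]

def sentence_vector(annotations):
    """Sum the worker vectors, keeping every vote instead of one 'winner'."""
    vectors = [worker_vector(chosen) for chosen in annotations.values()]
    return [sum(column) for column in zip(*vectors)]

# Made-up annotations: worker id -> set of labels chosen for one example.
annotations = {
    "w1": {"ARRIVING_OR_DEPARTING"},
    "w2": {"ARRIVING_OR_DEPARTING", "MOTION"},
    "w3": {"ACTION"},
}

vector = sentence_vector(annotations)
print(dict(zip(LABELS, vector)))
```

The summed vector keeps the minority votes visible, so later metrics can tell a unanimous example apart from a contested one.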


Page 24: CrowdTruth for Digital Hermeneutics

CrowdTruth Framework for Medical Relation Extraction

Aroyo, L., Welty, C.: Crowd Truth: Harnessing disagreement in crowdsourcing a relation extraction gold standard. WebSci2013. ACM, 2013

Aroyo, L., Welty, C.: Truth is a Lie: 7 Myths about Human Annotation, AI Magazine, 2014 (in print)

Page 25: CrowdTruth for Digital Hermeneutics

CrowdTruth Framework for News Event Extraction


Page 26: CrowdTruth for Digital Hermeneutics

The police came to Apple’s glass cube on Fifth Avenue on Tuesday to enforce order after activists released black balloons inside the cube to [protest] the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce] order after activists released black balloons inside the cube to protest the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce order] after activists released black balloons inside the cube to protest the company’s environmental policies.

The police came to Apple’s glass cube on Fifth Avenue on Tuesday to enforce order after activists [released] black [balloons] inside the cube to protest the company’s environmental policies.

The police [came] to Apple’s glass cube on Fifth Avenue on Tuesday [to enforce] order after activists released black balloons inside the cube to protest the company’s environmental policies.

Page 27: CrowdTruth for Digital Hermeneutics

Event Semantics are Hard

event type: Semafor
event location type: GeoNames
event time type: Allen’s time theory, KSL time ontology
event participant type: based on proper noun classes

Inel, O., Aroyo, L., et al (2013). Domain-independent Quality Measures for Crowd Truth Disagreement, DeRIVE2013

Page 28: CrowdTruth for Digital Hermeneutics

Events have multiple DIMENSIONS

[Figure: Micro-task Template]

Page 29: CrowdTruth for Digital Hermeneutics

Each DIMENSION has different GRANULARITY

[Figure: Micro-task Template]

Page 30: CrowdTruth for Digital Hermeneutics

People have different POINTS OF VIEW

[Figure: Micro-task Template]

Page 31: CrowdTruth for Digital Hermeneutics

Why do people disagree?

[Figure: triangle with corners Sign, Reference, Observer]

Page 32: CrowdTruth for Digital Hermeneutics

Why do people disagree?

[Figure: triangle with corners Sentence, Annotation Task, Worker]

Page 33: CrowdTruth for Digital Hermeneutics

Disagreement Analytics

•  sentence metrics: sentence clarity, sentence-relation score
•  annotation task metrics: event clarity, event type similarity, relation ambiguity
•  worker metrics:
   o  worker-sentence disagreement
   o  worker-worker disagreement
   o  avg number of annotations per sentence
   o  valid words in explanation text
   o  same explanation across contributions
   o  “[OTHER]” + different type
   o  time to complete, number of sentences, etc.
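The two disagreement-based worker metrics can be illustrated with cosine similarity over annotation vectors. This is a hedged sketch, not the definition from the cited papers: it assumes each worker contributes one binary vector per sentence, scores disagreement as 1 minus cosine, and uses made-up data and hypothetical function names.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def worker_sentence_disagreement(worker, sentences):
    """Mean of 1 - cosine(worker vector, sum of the other workers' vectors)."""
    scores = []
    for votes in sentences.values():
        if worker not in votes:
            continue
        others = [vec for w, vec in votes.items() if w != worker]
        if not others:
            continue
        rest = [sum(column) for column in zip(*others)]
        scores.append(1.0 - cosine(votes[worker], rest))
    return sum(scores) / len(scores) if scores else 0.0

def worker_worker_disagreement(worker, sentences):
    """Mean of 1 - cosine against each co-worker on each shared sentence."""
    scores = []
    for votes in sentences.values():
        if worker not in votes:
            continue
        for other, vec in votes.items():
            if other != worker:
                scores.append(1.0 - cosine(votes[worker], vec))
    return sum(scores) / len(scores) if scores else 0.0

# Made-up data: sentence id -> {worker id -> binary vector over 3 labels}.
sentences = {
    "s1": {"w1": [1, 0, 0], "w2": [1, 0, 0], "w3": [0, 0, 1]},
    "s2": {"w1": [0, 1, 0], "w2": [0, 1, 0], "w3": [1, 0, 0]},
}
for w in ("w1", "w2", "w3"):
    print(w,
          round(worker_sentence_disagreement(w, sentences), 3),
          round(worker_worker_disagreement(w, sentences), 3))
```

In this toy data w3 never overlaps with the other two workers, so both of its scores come out at the maximum, while w1 and w2 score lower because they agree with each other.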

Soberón, G., Aroyo, L., et al. (2013): Crowd truth metrics. CrowdSem2013 Workshop

Aroyo, L., Welty, C. (2013): Measuring crowd truth for medical relation extraction. AAAI Fall Symposium on Semantics for Big Data

http://www.americanprogress.org/wp-content/uploads/2012/12/multiple_measures_onpage.jpg

Page 34: CrowdTruth for Digital Hermeneutics

Experimental Setting

•  70 putative events
•  8 experiments: 2 for each event role filler
•  annotations: 15 workers per putative event
•  max annot./worker: 10
•  workers: native English speakers on CF

Page 35: CrowdTruth for Digital Hermeneutics

Annotation Example

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

Overall annotation & granularity distribution:

Page 36: CrowdTruth for Digital Hermeneutics

Event Type Disagreement

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

[ENTERED] type:
– ACTION (18.2%)
– MOTION (9.1%)
– ARRIVING_OR_DEPARTING (54.5%)
– PURPOSE (18.2%)

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 37: CrowdTruth for Digital Hermeneutics

Event Location Disagreement

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

[ENTERED] location:
– the cube (38.5%)
– cube (38.5%)
– none (23%)

Location type distributions:
– NOT APPLICABLE (100%)
– OTHER (100%)
– COMMERCIAL (40%), OTHER (40%), INDUSTRIAL (20%)

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 38: CrowdTruth for Digital Hermeneutics

Event Time Disagreement

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

[ENTERED] time:
– Around (9.1%): type TIMESTAMP (100%)
– Around 2:30 p.m. (45.45%): type TIMESTAMP (100%)
– 2:30 p.m. (45.45%): type TIMESTAMP (100%)

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 39: CrowdTruth for Digital Hermeneutics

Event Participant Disagreement

Around 2:30 p.m., as if delivering birthday greetings, several Greenpeace demonstrators [ENTERED] the cube clutching helium-filled balloons, which were the shape and color of charcoal briquettes.

[ENTERED] participant:
– Greenpeace (15.39%)
– demonstrators (15.39%)
– Greenpeace demonstrators (69.23%)

Participant type distributions:
– PERSON (100%)
– ORGANIZATION (100%)
– ORGANIZATION (77.77%), PERSON (22.22%)

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 40: CrowdTruth for Digital Hermeneutics

Comparative Annotation Distribution

[Figure panels: Event Type Distribution, Time Type Distribution]

The high disagreement for event type across all sentences likely indicates problems with the ontology. These event types are difficult to distinguish between. The event classes may overlap, be confusable, too vague, etc.
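A rough way to compare how contested two type distributions are is normalized Shannon entropy. This measure is swapped in here for illustration and is not one of the metrics named on the slides. The sketch below applies it to the event type and time type percentages reported for the single [ENTERED] example; the function name is hypothetical.

```python
import math

def normalized_entropy(distribution):
    """Shannon entropy of a label distribution, scaled to the range [0, 1]."""
    total = sum(distribution.values())
    if total == 0 or len(distribution) < 2:
        return 0.0  # nothing to disagree about
    probs = [v / total for v in distribution.values() if v > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return entropy / math.log2(len(distribution))

# Percentages taken from the [ENTERED] example in this deck.
event_type = {
    "ACTION": 18.2,
    "MOTION": 9.1,
    "ARRIVING_OR_DEPARTING": 54.5,
    "PURPOSE": 18.2,
}
time_type = {"TIMESTAMP": 100.0}

print("event type:", round(normalized_entropy(event_type), 3))
print("time type:", round(normalized_entropy(time_type), 3))
```

A value near 0 means workers converged on one type, as they do for the time type here; values closer to 1 mean the votes are spread across types, which is the pattern the slide attributes to overlapping or vague event classes.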

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 41: CrowdTruth for Digital Hermeneutics

Comparative Annotation Distribution

[Figure panels: Location Type Distribution, Participant Type Distribution]

[Figure: triangle with corners Sentence, Ontology, Worker]

Page 42: CrowdTruth for Digital Hermeneutics

Sentence Clarity

Identifies sentences that are unclear or ambiguous based on the distribution of types
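As a sketch of how a clarity score could be derived from a type distribution: take the cosine between the sentence's vote vector and each type's unit vector, then use the highest of those scores as the clarity. This follows the general idea of the sentence-relation score in the cited Crowd Truth papers, but the exact formula and the function names here are assumptions.

```python
import math

def sentence_label_scores(sentence_vector):
    """Cosine between the sentence vector and each label's unit vector."""
    norm = math.sqrt(sum(v * v for v in sentence_vector.values()))
    if norm == 0:
        return {label: 0.0 for label in sentence_vector}
    return {label: v / norm for label, v in sentence_vector.items()}

def sentence_clarity(sentence_vector):
    """Clarity = the best score any single label reaches for this sentence."""
    scores = sentence_label_scores(sentence_vector)
    return max(scores.values()) if scores else 0.0

# Percentages taken from the [ENTERED] example in this deck; cosine is
# scale-invariant, so percentages work as well as raw vote counts.
event_type = {
    "ACTION": 18.2,
    "MOTION": 9.1,
    "ARRIVING_OR_DEPARTING": 54.5,
    "PURPOSE": 18.2,
}
time_type = {"TIMESTAMP": 100.0}

print("event type clarity:", round(sentence_clarity(event_type), 3))
print("time type clarity:", round(sentence_clarity(time_type), 3))
```

A sentence where every worker picks the same type scores 1.0; the more the votes spread over several types, the lower the score falls.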


Page 43: CrowdTruth for Digital Hermeneutics
Page 44: CrowdTruth for Digital Hermeneutics

Spam Detection

From the TRIANGLE we aim to efficiently remove the spam & low-quality contributors:

o  we first filter sentences by their clarity score, to avoid penalizing workers for contributing on difficult or ambiguous sentences. When bad sentences are identified, we remove them and see a significant increase in spam detection accuracy

o  we then apply the worker metrics to identify workers who systematically disagree (1) with the opinion of the majority (worker-sentence disagreement), or (2) with the rest of their co-workers (worker-worker disagreement). When spammers are identified, we remove their annotations and the accuracy of the sentence metrics improves
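The two-step filtering described above might look roughly like this in Python. It is only a sketch: the thresholds, the input dictionaries, and the function names are assumptions, and the per-sentence disagreement scores could come from either worker metric.

```python
def clear_sentences(clarity, threshold):
    """Step 1: keep only sentences whose clarity reaches the threshold."""
    return {sid for sid, score in clarity.items() if score >= threshold}

def flag_low_quality_workers(clarity, disagreement,
                             clarity_threshold=0.6,
                             disagreement_threshold=0.7):
    """Step 2: flag workers whose mean disagreement on clear sentences is high.

    Both thresholds are illustrative defaults, not values from the slides.
    """
    kept = clear_sentences(clarity, clarity_threshold)
    flagged = set()
    for worker, per_sentence in disagreement.items():
        scores = [s for sid, s in per_sentence.items() if sid in kept]
        if scores and sum(scores) / len(scores) > disagreement_threshold:
            flagged.add(worker)
    return flagged

# Made-up scores: s3 is an ambiguous sentence with low clarity.
clarity = {"s1": 0.9, "s2": 0.85, "s3": 0.3}
disagreement = {
    "w1": {"s1": 0.6, "s2": 0.7, "s3": 1.0},  # only "bad" on the unclear sentence
    "w2": {"s1": 0.9, "s2": 0.8, "s3": 0.9},  # disagrees everywhere
}

print(flag_low_quality_workers(clarity, disagreement))
print(flag_low_quality_workers(clarity, disagreement, clarity_threshold=0.0))
```

With the clarity filter in place only w2 is flagged; with the filter disabled (threshold 0.0), w1 is flagged as well, because its score on the ambiguous sentence pulls its average up.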


Page 45: CrowdTruth for Digital Hermeneutics

What more …

Understand human disagreement on event extraction with focus on ambiguity:

○  Would a different classification (ontology) of putative events perform better?
○  Does the overlapping of the types (ontology) influence the results?
○  Identify the right role fillers (per event) for multiple putative events.
○  Would event clustering help with determining the most appropriate structure of the event and its role fillers?

Page 46: CrowdTruth for Digital Hermeneutics

Conclusions

●  Capturing events is important for digital hermeneutics

●  Understanding disagreement helps understand event semantics

●  Considering the interdependence of the different aspects of the annotations improves their quality

●  Disagreement metrics are adaptable across domains; they helped us to understand the vagueness and the clarity of a sentence/putative event

[Figure: triangle with corners Sentence, Annotation task, Worker]

Page 47: CrowdTruth for Digital Hermeneutics

The Crowd Truth Crew

•  Lora Aroyo, PI (VU)
•  Chris Welty, PI (IBM)
•  Robert-Jan Sips (IBM)
•  Anca Dumitrache, PhD candidate (VU-IBM)
•  Oana Inel, Lukasz Romasko, researchers (VU)
•  Students: Khalid, Rens, Benjamin, Tatiana, Harriette
•  Engineers: Jelle, Arne (IBM)

Page 48: CrowdTruth for Digital Hermeneutics

http://crowd-watson.nl

Page 49: CrowdTruth for Digital Hermeneutics

AGORA
Eventing History

VU History department: Susan Legêne, Chiel van den Akker

VU Computer Science: Guus Schreiber, Lora Aroyo

Geertje Jacobs

Johan Oomen

http://agora.cs.vu.nl

Page 50: CrowdTruth for Digital Hermeneutics

Events @

•  Agora: Historical Events in Cultural Heritage Collections
   –  http://agora.cs.vu.nl/
•  Extractivism: Activist Events in Newspapers
   –  http://mona-project.org/
•  Semantics of History
   –  http://www2.let.vu.nl/oz/cltl/semhis/
•  BiographyNet: Events Change in Perspective over Time
•  NewsReader: Multilingual Events & Storylines in Newspapers
   –  http://www.newsreader-project.eu/