http://lora-aroyo.org @laroyo
Bulgaria NYC
The Netherlands
2
ABOUT ME ... Personal Data Science
Sofia
http://lora-aroyo.org @laroyo 3
E-LEARNING & AI
To understand the value of semantic technologies for e-learning
we need to understand the people, specifically how they
interact and consume information
⌂ http://lora-aroyo.org @laroyo
MY RESEARCH FAMILY
4
… and many many many many more
⌂ http://lora-aroyo.org @laroyo
MY RESEARCH FAMILY
5
… and many many many many more
http://lora-aroyo.org @laroyo
EVOLUTION OF SEMANTIC WEB
7
Great moments from 1980s till now
http://lora-aroyo.org @laroyo
EMPIRE OF THE EXPERTS
8
80’s
Advances in hardware and SDEsPCs, workstations, Symbolics, SunNew architectures like Hypercube LISP, Prolog, OPSAI can now BUILD SYSTEMS
Primary focus on experts and rules
What is the knowledge of expertsGraphs, logic, rules, frames
How do experts reason?Deduction, induction
Work on form & process academicinside the system, to make the
reasoning inside the system proper and as good as possible
Industry forged ahead with ad-hoc & proprietary systems and actually tried to build expert systems
Originals of uncertain KRFuzzy, probabilistic
http://lora-aroyo.org @laroyo
EMPIRE OF THE EXPERTS
9
80’s
Piero Bonissone and the DELTA/CATS expert system for locomotive repair with David Smith, a locomotive repair expert
http://lora-aroyo.org @laroyo
EMPIRE OF THE EXPERTS
10
80’s
Buchanan and Shortliff’s MYCIN project at Stanford built a huge rule base for medical diagnosis working with an extensive team of medical experts.
http://lora-aroyo.org @laroyo
KNOWLEDGE ACQUISITION FROM EXPERTS
11
90’s
Common KADS
founded by Bob Wielinga as a methodology for expert knowledge acquisition. It was deeply psychology based - it was about people, about their knowledge and especially about their expertise. How do people know what they know, and how can you acquire that knowledge?
http://lora-aroyo.org @laroyo
17
BIG CROWDS10’s
Human Annotation Central in Machine Learning Training & Evaluation
http://lora-aroyo.org @laroyo
ONE TRUTH
19
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every example
7 MYTHS ABOUT HUMAN ANNOTATION
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 20
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or false
7 MYTHS ABOUT HUMAN ANNOTATION
ALL EXAMPLES EQUAL
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 21
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problem
7 MYTHS ABOUT HUMAN ANNOTATION
DISAGREEMENT BAD
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 22
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain experts
7 MYTHS ABOUT HUMAN ANNOTATION
EXPERTS RULE
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 23
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficient
7 MYTHS ABOUT HUMAN ANNOTATION
ONE IS ENOUGH
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 24
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old7 MYTHS ABOUT HUMAN ANNOTATION
BINARY WORLD
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 25
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
DOES THIS SENTENCE EXPRESS TREATS RELATION?
http://lora-aroyo.org @laroyo 26
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
DOES THIS SENTENCE EXPRESS TREATS RELATION?
http://lora-aroyo.org @laroyo 27
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
DOES THIS SENTENCE EXPRESS TREATS RELATION?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
http://lora-aroyo.org @laroyo 28
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
WHAT DO EXPERTS SAY?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
✓
✓
✘
http://lora-aroyo.org @laroyo 29
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
WHAT DOES A LAY ANNOTATOR SAY?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
✓
✓
✘
http://lora-aroyo.org @laroyo 30
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
WHAT DOES ANOTHER LAY ANNOTATOR SAY?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
✓
✘
✘
http://lora-aroyo.org @laroyo 31
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
WHAT DOES A THIRD LAY ANNOTATOR SAY?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
✓
✓
✓
http://lora-aroyo.org @laroyo 32
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
WHAT DOES THE CROWD SAY?
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.Intuition: This is better
95%
75%
50%
http://lora-aroyo.org @laroyo 33
For prevention of malaria, use only in individuals traveling
to malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
95%
75%
50%
There’s a difference between these two
This one isn’t utterly wrong
BETTER
WORSE
WHAT DOES THE CROWD SAY?
http://lora-aroyo.org @laroyo 34
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
COMFORT ZONEDisrupted
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 35
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
COMFORT ZONEDisrupted
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 36
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
COMFORT ZONEDisrupted
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 37
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
COMFORT ZONEDisrupted
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 38
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
COMFORT ZONEDisrupted
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo 39
For prevention of malaria, use only in individuals traveling to
malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
ENCOURAGING DISAGREEMENT
Rheumatoid arthritis and MALARIA have been treated
with CHLOROQUINE for decades.
Treats: Chloroquine, Malaria
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.Intuition: This is better
95%
75%
50%
http://lora-aroyo.org @laroyo
WORKER VECTOR FOR A SENTENCE
treats associated _with othersymptom
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
http://lora-aroyo.org @laroyo
MANY WORKERS FOR THE SAME SENTENCE
treats otherassociated _withsymptom
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
http://lora-aroyo.org @laroyo
ALL WORKER VECTORS AGGREGATED IN A SENTENCE VECTOR
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
treats othernoneassociated _withsymptommanifestationside
effect
http://lora-aroyo.org @laroyo
SENTENCE VECTORS FOR THE 3 SENTENCES
treats othernoneassociated _withsymptommanifestationside
effect
treats othernoneassociated _withcontraindicatesmanifestation
treats other
http://lora-aroyo.org @laroyo 45
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every exampleAll examples are created equal: triples are triples, one is not more important than another, they are all either true or falseDisagreement bad: when people disagree, they don’t understand the problemExperts rule: knowledge is captured from domain expertsOne is enough: knowledge by a single expert is sufficientDetailed explanations help: if examples cause disagreement - add instructions Once done, forever valid: knowledge is not updated; new data not aligned with old
TIME TO DISRUPT THE COMFORT ZONE“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
http://lora-aroyo.org @laroyo
KNOWLEDGE
ACQUISITIONSTRUCTURED KNOWLEDGE ENGINEERING
http://lora-aroyo.org @laroyo
HYPER-DIMENSIONAL SPACE
R1R2R3R4R5R6R7R8R9R10R11R12
Sentence plane into a sentence vector
http://lora-aroyo.org @laroyo
DISAGREEMENT IS SIGNALVariety of sources for disagreement
http://lora-aroyo.org @laroyo
Source 1: People’s bias & perspective
DISAGREEMENT IS SIGNAL
http://lora-aroyo.org @laroyo
DISAGREEMENT IS SIGNALSource 1: Worker systematically give same answer
http://lora-aroyo.org @laroyo
DISAGREEMENT IS SIGNALSource 1: Worker systematically give same answer
http://lora-aroyo.org @laroyo
DISAGREEMENT IS SIGNALSource 1: Worker systematically give same answer
http://lora-aroyo.org @laroyo
Source 2: Target semantics
DISAGREEMENT IS SIGNAL
http://lora-aroyo.org @laroyo
SentencesSource 3: Sentences
DISAGREEMENT IS SIGNAL
http://lora-aroyo.org @laroyo
TRIANGLE OF MEANINGModel of semantic interpretation (Ogden & Richards, 1936)
http://lora-aroyo.org @laroyo
TRIANGLE OF MEANINGModel of semantic interpretation
http://lora-aroyo.org @laroyo
treats other
CrowdTruth metrics for quality assessment
TRIANGLE OF MEANING
http://lora-aroyo.org @laroyo
treats othernoneassociated _withsymptommanifestationside
effect
Among 56 subjects reporting to a clinic with symptoms
of MALARIA 53 (95%) had ordinarily effective levels
of CHLOROQUINE in blood.
Sentence ambiguity
QUALITY ASSESSMENT
http://lora-aroyo.org @laroyo
TREATS RELATIONYes or No?
treats othernoneassociated _withcontraindicatesmanifestation
treats othernoneassociated _withcontraindicatesmanifestation
treats other
http://lora-aroyo.org @laroyo
Agreement as percentage
75%25% 25% 12% 12%50%
CROWDTRUTH METRICS
For prevention of malaria, use only in individuals traveling to
malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
25% 25% 75% 12% 12% 50%
treats othernoneassociated _withcontraindicatesmanifestation
http://lora-aroyo.org @laroyo
For prevention of malaria, use only in individuals traveling to
malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
CROWDTRUTH METRICSApplying all sides of the triangle
treats other
http://lora-aroyo.org @laroyo
CROWDTRUTH METRICS
For prevention of malaria, use only in individuals traveling to
malarious areas where CHLOROQUINE resistant P.
falciparum MALARIA has not been reported.
treats other
Applying all sides of the triangle
99%
http://lora-aroyo.org @laroyo
CROWDTRUTH METRICSApplying all sides of the triangle
http://lora-aroyo.org @laroyo
For prevention of malaria, use only in individuals traveling to malarious areas where CHLOROQUINE resistant P. falciparum MALARIA has not been reported.
CROWDTRUTH KNOWLEDGE TENSOR
http://lora-aroyo.org @laroyo
CROWDTRUTH VS. EXPERTScrowd as good or better than from experts
http://lora-aroyo.org @laroyo
AMBIGUITY IMPACTS ACCURACYmore ambiguous sentences were harder to classify
http://lora-aroyo.org @laroyo
CROWDTRUTH.ORGa spatial representation of meaning that harnesses disagreement
http://lora-aroyo.org @laroyo
On the role of user-generated metadata in audio visual collections (2011). R. Gligorov, M. Hildebrand, J. van Ossenbruggen, G. Schreiber, L. Aroyo K-CAP2011
VIDEO METADATA ENRICHMENTThe Netherlands Institute for Sound and Vision
1
http://lora-aroyo.org @laroyo
DIVE+Explorative Search
2
DIVE into the event-based browsing of linked historical media (2015)
V De Boer, J Oomen, O Inel, L Aroyo, E Van Staveren, in Journal of Web Semantics:
http://lora-aroyo.org @laroyo
DEEP QA IN CULTURAL HERITAGEMauritshuis use case
3
http://lora-aroyo.org @laroyo
CROWDTRUTH IN DEPLOYMENTGoogle Maps
questions
Google Maps
reviewers
4
http://lora-aroyo.org @laroyo
CROWDTRUTH IN DEPLOYMENTGoogle Maps
emotions
mTURK crowd
5
http://lora-aroyo.org @laroyo
USER-CENTRIC DATA SCIENCEFormerly the Web & Media group
http://lora-aroyo.org @laroyo
H2020 ReTVTrans-Vector Platform (TVP)
Lora Aroyo, (coordinator) VU Amsterdam, Computer Science
Lyndon Nixon, MODUL, AT
Vasileios Mezaris, CERTH, GR
Arno Scharl, Weblyzard, AT
Bea Knecht, Zattoo, DE
Johan Oomen, Sound and Vision, NL
Nicolas Patz, Rundfunk Berlin-brandenburg, DE
http://lora-aroyo.org @laroyo
CAPTURING BIASStartimpuls for the Dutch National Science Agenda
Lora Aroyo, (coordinator) VU Amsterdam, Computer Science
Alessandro Bozzon, TU Delft CS & Delft Data Science
Alec Badenoch, Utrecht University, Media & Culture Studies
Antoaneta Dimitrova, Leiden University, Institute of Public Administration
Johan Oomen, Netherlands Institute for Sound and Vision
http://lora-aroyo.org @laroyo
CROWDTRUTH ROCKS!
Disagreement is signal
CrowdTruth is a spatial representation of meaning that harnesses disagreement
Crowds bring natural diversity
CrowdTruth defines hyper-dimensional space to represent ambiguity
Crowds help gathering real human semantics
http://lora-aroyo.org @laroyo
The world is full of shades of grey
Experts and crowds are complimentary
Capturing and understanding opinions, perspectives and contexts in the center of understanding people
TIME TO BREAK FREE
CrowdTruth defines multi-dimensional space to measure quality
Top Related