Exploiting disagreement through open-ended tasks for capturing interpretation spaces


Doctoral Consortium

By Benjamin Timmermans (@8w)

Outline
- Introduction
- State of the Art
- Problem Statement
- Methodology
- Preliminary Results
- Conclusions

Introduction

How many dogs were in the picture?

There is no universal "truth"

For the training, testing, and evaluation of machines we rely on a...

ground "truth"

State of the Art

Crowdsourcing Approach
- 1-3 annotators
- Evaluate workers
- Inter-annotator agreement
- Use test questions
- Predefined answer choices

The CrowdTruth Approach
- 10-15 annotators
- Evaluate the input, annotations, and workers
- Disagreement-based analytics
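To make the contrast concrete, here is a minimal sketch of disagreement-based analytics in the spirit of CrowdTruth (the data and scoring below are illustrative, not the project's actual implementation): each worker's annotations on a media unit become a vector, and the aggregated vectors preserve the full distribution of interpretations instead of a single majority label.

```python
# Illustrative annotations: worker -> tags chosen for one media unit.
# (Hypothetical data; the real CrowdTruth pipeline evaluates units,
# annotations, and workers jointly over many units.)
annotations = {
    "w1": {"dog", "barking"},
    "w2": {"dog"},
    "w3": {"dog", "growling"},
    "w4": {"wolf"},
}

vocab = sorted(set().union(*annotations.values()))

def vector(tags):
    # Binary annotation vector over the tag vocabulary.
    return [1 if t in tags else 0 for t in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5
    return dot / norm if norm else 0.0

# Unit vector: how many workers chose each tag.
unit = [sum(col) for col in zip(*(vector(t) for t in annotations.values()))]

# Worker-unit agreement: cosine between a worker's vector and the
# aggregate of all *other* workers (leave-one-out), a disagreement-aware
# alternative to a simple majority vote.
for w, tags in annotations.items():
    others = [u - v for u, v in zip(unit, vector(tags))]
    print(w, round(cosine(vector(tags), others), 2))

# The unit vector itself captures the interpretation space:
# a distribution over annotations, not a single forced label.
print(dict(zip(vocab, unit)))
```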

Problem Statement

Problems with multimedia annotations:
- They are sparse
- They are homogeneous
- They do not represent everything that can be heard or seen

Problems with crowdsourcing tasks:
- They are designed to stimulate agreement
- They assume answers are right or wrong

Closed task

How many beams do you see?
Predefined answer choices: 1, 2, 3, 4, 5

[Figure: distribution of workers' answers over the predefined choices]

Open-ended tasks

How many beams do you see?
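The difference between the two designs can be made concrete with a small sketch (the answers below are invented for illustration): a closed task can only ever return a distribution over the choices the designer anticipated, while an open-ended task surfaces interpretations outside that range.

```python
from collections import Counter

# Hypothetical answers to "How many beams do you see?".
closed_answers = [2, 3, 3, 3, 4, 3, 2, 3, 3, 4]  # forced choice 1-5
open_answers = ["3", "3 or 4", "2 (one is a shadow)", "3", "4",
                "3", "hard to tell", "3", "2", "3"]  # free text

# The closed task yields a tidy distribution, but only over the
# predefined choices:
print(Counter(closed_answers))

# The open-ended task needs normalization before aggregation, yet it
# preserves interpretations the closed design would have hidden:
print(Counter(open_answers))
```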

Gathering the interpretation space of multimedia through open-ended crowdsourcing tasks

Goal

- More efficient crowdsourcing
- Higher quality ground truth data
- Improved search and discovery of multimedia

Research Question

Are open-ended crowdsourcing tasks a feasible method for capturing the interpretation space of multimedia?

Methodology

1. Improving quality evaluation
- Comparing closed and open-ended tasks
- Measuring worker confidence

2. Improving open-ended task design
- Combining constraints with open-ended designs
- Showing known annotations
- Detecting the distribution of answers (see the sketch after this list)

3. Applying the ground "truth"
- Comparing different contexts
- Improving the indexing of multimedia
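As an illustration of the second step, detecting the distribution of answers could serve as a stopping rule for open-ended tasks. The sketch below is a hypothetical rule (not a published CrowdTruth metric): stop collecting once the latest answers no longer shift the observed distribution.

```python
from collections import Counter

def distribution(answers):
    # Normalize answer counts into a probability distribution.
    counts = Counter(answers)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def shift(p, q):
    # L1 distance between two answer distributions.
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

def saturated(answers, window=5, threshold=0.15):
    # Hypothetical stopping rule: the last `window` answers barely
    # changed the distribution, so the interpretation space is
    # likely covered.
    if len(answers) <= window:
        return False
    return shift(distribution(answers[:-window]),
                 distribution(answers)) < threshold

stream = ["dog", "dog", "puppy", "dog", "dog", "wolf",
          "dog", "dog", "puppy", "dog", "dog", "dog"]
print(saturated(stream))  # True: new answers add little information
```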

Preliminary Results

Gathering training data for IBM Watson

Range of tasks:
- Passage Justification
- Passage Alignment
- Distributional Disambiguation

Sound Interpretations

- 2,133 short sounds
- Top 5,000 search terms = 11 million searches

Sound tag overlap
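One way to quantify the tag overlap behind this slide is a set-based measure; below is a minimal sketch using Jaccard similarity on invented tag sets (the actual data and analysis are those of the study, not shown here).

```python
# Hypothetical tag sets collected for the same short sound in two
# separate open-ended runs (invented for illustration).
tags_run_a = {"engine", "car", "motor", "rumble"}
tags_run_b = {"engine", "truck", "motor", "idling"}

def jaccard(a, b):
    # Overlap between two tag sets: |intersection| / |union|.
    return len(a & b) / len(a | b)

print(round(jaccard(tags_run_a, tags_run_b), 2))  # 0.33
```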

Conclusions
- There is no ultimate "truth"
- Do not stimulate agreement
- Capture the interpretation space
- Use open-ended crowdsourcing tasks
- Evaluation is more difficult

Who we are

Lora Aroyo

Robert-Jan Sips

Chris Welty

Oana Inel

Anca Dumitrache

Benjamin Timmermans

Acknowledgements
- Supervisor: Dr. Lora Aroyo
- Mentor: Dr. Matteo Palmonari

CrowdTruth.org

Benjamin Timmermans

btimmermans.com
b.timmermans@vu.nl

 @8w