Truth discovery in crowdsourced detection of spatial...

24
Truth discovery in crowdsourced detection of spatial events Robin Wentao Ouyang Mani Srivastava Alice Toniolo Timothy J. Norman

Transcript of Truth discovery in crowdsourced detection of spatial...

Page 1: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Truth discovery in crowdsourced

detection of spatial events

Robin Wentao Ouyang

Mani Srivastava

Alice Toniolo

Timothy J. Norman

Page 2: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Mobile crowdsourced event detection

• Potholes, graffiti, bike racks, flora, …

2

Page 3: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Truth discovery

• Given crowdsourced detection reports with time and

loc tags, find which reported events are true and

which are false

3

Page 4: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Challenges

• Detection reports are non-conflicting

• Uncertainty in both participants’ reliability and

mobility

▫ Missing reports are ambiguous

• Supervision is difficult

4

Page 5: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Possible solutions

5

Severe privacy

and energy issues Trivial conclusion

Performance

degradation

Page 6: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Problem

6

Can we design an algorithm that can

reliably discover true events in mobile

crowdsourced event detection

but without location tracking and

supervision?

Page 7: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Graphical model

• A participant’s likelihood of reporting an event

depends on

▫ 1) whether the participant visited the event location

▫ 2) whether the event at that location is true or false

▫ 3) how reliable the participant is

7

Location visit

indicator

Location

popularity

Participant

reliability

Event label Report

Page 8: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Location popularity

▫ For each event at location

Draw the location’s popularity

8

Page 9: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Participants Location visit indicators

▫ For participant and event at location

Draw a location visit indicator

• A participant has a higher chance to visit more

popular locations

9

Location visit

indicator

Location

popularity

Page 10: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Event label

▫ For each event at location

Draw the event’s prior truth probability

Draw the event’s label

10

Location visit

indicator

Location

popularity

Event label

Page 11: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Three-way participant reliability

▫ For each participant

Draw her true positive rate while present (TPR)

Draw her false positive rate while present (FPR)

Draw her reporting rate while absent (RRA)

• Concerns ▫ A participant’s reliability depends on: whether she visited the

event location and whether the event there is true or false

▫ A participant’s TPR and FPR may be asymmetric (reliable vs.

conservative participants)

▫ A participant must conform to physical constraints (RRA)

11

Page 12: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Proposed model

• Reports (detection = 1, missing = 0)

▫ For participant and event at location

12

TPR

FPR

RRA

Location visit

indicator

Location

popularity

Participant

reliability

Event label Report

Page 13: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Analysis

• 1) Missing reports are well explained

• When location popularity , we have

• When location popularity , we have

13

Event label &

participants’

TPR/FPR

Limited mobility &

participants’ RRA

Page 14: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Analysis

• 2) Location tracking is avoided.

▫ Location popularity is a collective rather than a

personal measure.

▫ Its prior counts need to be estimated only once.

▫ It can be jointly learned with other parameters

from data.

• 3) Different aspects of participant reliability are

handled.

• 4) Prior belief can be easily incorporated.

14

Page 15: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Methods in comparison ▫ MV (majority voting)

▫ TF (truth finder [1])

▫ GLAD (generative model of labels, abilities, and difficulties [2])

▫ LTM (latent truth model [3])

▫ EM (expectation maximization [4])

▫ TSE (truth finder for spatial events) – proposed

• [1] X. Yin, J. Han, and P. S. Yu. Truth discovery with multiple conflicting information providers on

the web. IEEE TKDE, 20(6):796–808, 2008.

• [2] J. Whitehill et al. Whose vote should count more: Optimal integration of labels from labelers of unknown expertise. In NIPS, pages 2035–2043, 2009.

• [3] B. Zhao et al. A bayesian approach to discovering truth from conflicting sources for data integration. VLDB Endowment, 5(6):550–561, 2012.

• [4] D. Wang et al. On truth discovery in social sensing: a maximum likelihood estimation approach. In IPSN, pages 233–244. ACM, 2012.

15

Page 16: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Traffic light detection • A mobility dataset

containing time-stamped

GPS location traces for

536 taxicabs in SF ▫ Spatial area of interest

3.5km x 4.4km

– further divided into two

subareas

▫ Temporal span 25 days

• Detection reports ▫ A participant waits for

15-120 seconds

16

Page 17: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Traffic light detection

17

Page 18: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Traffic light detection (Area 2)

18

Page 19: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Image-based event detection

19

Page 20: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Simulation (F1 score on event labels)

20

Page 21: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Experiments

• Simulation (MAE on TPRs a and FPRs b)

21

Page 22: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Discussion

• Sequential mobility modeling

• Dependent sources

• Cross-domain truth discovery

22

Page 23: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Conclusion

• Our proposed model integrates location popularity,

location visit indicators, truth of events and three-

way participant reliability in a unified framework.

• It can efficiently handling both unknown participants’

reliability and mobility.

• It can efficiently discover true events in mobile

crowdsourced event detection without any

supervision and location tracking.

23

Page 24: Truth discovery in crowdsourced detection of spatial eventswouyang/slides/Truth_discovery_slides.pdf · location visit indicators, truth of events and three-way participant reliability

Q & A

24