Combining Crowdsourcing and Google Street View to Identify … · 2016-03-08 · Combining...

Post on 26-Jun-2020

Combining Crowdsourcing and Google Street View to Identify Street-level Accessibility Problems

Kotaro Hara, Vicki Le, Jon E. Froehlich

Human-Computer Interaction Lab

Computer Science Department

University of Maryland, College Park

makeability lab

I want to start with a story…

You Your Friend

The problem is not just that there are inaccessible areas of cities, but also that there are currently few methods for us to identify them a priori

Can we use Google Street View to find sidewalk accessibility problems?

Could crowdworkers perform tasks to find, label, and assess the severity of accessibility problems?

ProjectSidewalk

Labeling Interface

Validation Interface

Background and Related Work

Physical Street Audits

Street audits are conducted by governments and/or community organizations

They are time-consuming and expensive

Video Recording: Sampson et al., 1999

Top-Down Satellite Imagery: Taylor et al., 2011

Omnidirectional Streetscape Imagery: Clarke et al., 2010; Rundle et al., 2011; Taylor et al., 2011; Guy & Truong, 2011

Key Point: High level of concordance between physical audits and Street View based audits

Mobile Crowdsourcing SeeClickFix.com

Mobile Crowdsourcing NYC 311

These mobile tools can be used as complementary techniques to our GSV approach

Problem Categories

Missing Curb Ramp

Object in Path

Surface Problem

Prematurely Ending Sidewalk

Other (example: two curb ramps positioned too close to each other)

Research Questions

Can motivated workers identify sidewalk accessibility problems using Street View?

Can crowd workers perform this task?

STUDY ONE: RESEARCH TEAM LABELERS

STUDY TWO: WHEELCHAIR USER LABELERS

STUDY THREE: MECHANICAL TURK LABELERS

STUDY ONE: RESEARCH TEAM LABELERS

What accessibility problems exist in this image?

Researcher Label Table

Columns: R1, R2, R3 (Researcher 1, Researcher 2, Researcher 3)

Rows: Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem, Other

There are multiple ways to examine the labels.

Image Level Labels

The Researcher Label Table tells us what accessibility problems exist in the image

Pixel Level Labels

Labeled pixels tell us where the accessibility problems exist in the image

Why do we care about image level vs. pixel level?

From coarse to precise: Block Level (Image Level), Sub-block Level, Point Location Level (Pixel Level)

Pixel level labels could be used for training machine learning algorithms for detection and recognition tasks

Two Accessibility Problem Spectrums

Different ways of thinking about accessibility problem labels in GSV

Localization Spectrum (coarse to precise): Block Level (Image Level), Sub-block Level, Point Location Level (Pixel Level)

Specification Spectrum (coarse to precise): Binary (Problem, No Problem), Multiclass (Object in Path, Curb Ramp Missing, Prematurely Ending Sidewalk, Surface Problem)

In the Researcher Label Table, the multiclass label keeps one row per category (Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem, Other); the binary label collapses them into a single Problem row

Dataset

Manually curated 229 static Street View images of sidewalks from metropolitan areas (Baltimore, DC, LA, and New York)

The dataset consists of 47 Curb Ramp Missing, 66 Object in Path, 67 Surface Problem, and 50 Prematurely Ending Sidewalk images, plus 50 images with no problems

Average image age was 3.1 years (SD=0.8)

Primary Study 1 Question

Can motivated workers provide consistent labels?

Study Method

3 researchers individually labeled the 229 static images

We used Fleiss’ kappa to measure image-level binary and multiclass label agreement between researchers
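As a rough illustration of the agreement measure named above, here is a minimal sketch of Fleiss' kappa in plain Python. The function name `fleiss_kappa`, the count-matrix input layout, and the example counts are assumptions made for illustration; they are not taken from the study's own analysis code or data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a count matrix.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must have been rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item, then averaged over items.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        # All ratings fell into one category; kappa is undefined there.
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical binary image-level labels from 3 researchers on 4 images.
# Columns are [Problem, No Problem].
example = [[3, 0], [0, 3], [2, 1], [3, 0]]
print(round(fleiss_kappa(example), 3))  # 0.625
```

For multiclass agreement, one option is to build one such matrix per problem category (present vs. absent) and report kappa for each.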

These are the researcher labels, recorded in the Researcher Label Table (columns R1, R2, R3; rows Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem, Other)

Result

We observed moderate to substantial agreement between researchers

Researchers have a consistent perspective on what constitutes a sidewalk accessibility problem

STUDY TWO: WHEELCHAIR USER LABELERS

Study Method

3 wheelchair users

Independently labeled a 75-image subset of the 229 Street View images

Participants thought aloud, and sessions were video recorded

30-minute post-study interview

We used Fleiss’ kappa to measure agreement between wheelchair users and researchers

Here is the recording from the study session

Result

Strong agreement between wheelchair users’ labels and researchers’ labels

Wheelchair users and motivated workers share a similar perspective on what constitutes a sidewalk accessibility problem

STUDY THREE: MECHANICAL TURK LABELERS

We need ground truth to evaluate turkers’ tasks

Majority Vote

Researcher Label Table with an added Maj. Vote column (columns R1, R2, R3, Maj. Vote; rows Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem, Other)

We took the majority vote of researcher labels across all 229 images to produce the ground truth dataset
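The majority-vote step can be sketched as follows for a single image. This is only an illustration under stated assumptions: the set-of-categories representation, the `majority_vote` name, and the example labels are hypothetical, and the Other category is left out here for brevity.

```python
# Category names as they appear in the Researcher Label Table.
CATEGORIES = (
    "Object in Path",
    "Curb Ramp Missing",
    "Sidewalk Ending",
    "Surface Problem",
)


def majority_vote(label_sets):
    """Keep a category when more than half of the labelers marked it.

    label_sets holds one set of category names per labeler, all for the
    same image.
    """
    threshold = len(label_sets) / 2
    return {
        category
        for category in CATEGORIES
        if sum(category in labels for labels in label_sets) > threshold
    }


# Hypothetical labels from R1, R2, and R3 for one image.
r1 = {"Object in Path", "Curb Ramp Missing"}
r2 = {"Object in Path"}
r3 = {"Object in Path", "Surface Problem"}
print(majority_vote([r1, r2, r3]))  # {'Object in Path'}
```

With three labelers the threshold works out to two votes, so a category marked by a single researcher is dropped from the ground truth.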

Per Image Accuracy

Each turker label is compared with the ground truth:

Object in Path: Correct

Curb Ramp Missing: Correct

Sidewalk Ending: Correct

Surface Problem: Wrong

This turker scored 3 out of 4
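The per-image scoring can be sketched like this, assuming each labeler's output for an image is stored as a set of category names. The helper names and the example labels are hypothetical. The binary variant reflects the Problem / No Problem reading of the specification spectrum and is an interpretation rather than the authors' code.

```python
CATEGORIES = (
    "Object in Path",
    "Curb Ramp Missing",
    "Sidewalk Ending",
    "Surface Problem",
)


def multiclass_accuracy(turker: set, truth: set) -> float:
    """Fraction of the four categories on which the turker matches the truth."""
    correct = sum((c in turker) == (c in truth) for c in CATEGORIES)
    return correct / len(CATEGORIES)


def binary_accuracy(turker: set, truth: set) -> float:
    """1.0 if both agree on whether any of the four problems exists, else 0.0."""
    turker_has_problem = any(c in turker for c in CATEGORIES)
    truth_has_problem = any(c in truth for c in CATEGORIES)
    return 1.0 if turker_has_problem == truth_has_problem else 0.0


# Hypothetical image: the turker adds a Surface Problem that is not in the truth.
truth = {"Object in Path", "Curb Ramp Missing"}
turker = {"Object in Path", "Curb Ramp Missing", "Surface Problem"}
print(multiclass_accuracy(turker, truth))  # 0.75, i.e. 3 out of 4
print(binary_accuracy(turker, truth))      # 1.0
```

Note that a category counts as correct both when the turker and the ground truth mark it and when both leave it unmarked.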

Primary Study 3.1 Question

How accurately can turkers label sidewalk accessibility problems?

Study Method

We hired 185 turkers in total from AMT

We batched 1-10 image labeling tasks into each HIT and paid $0.01-0.05 per HIT

We asked turkers to watch a 3-minute tutorial video; the task appeared after they finished watching the first half

We excluded the Other category because it accounted for less than 0.6% of all labels

[Screenshot of the AMT HIT: "University of Maryland: Help make our sidewalks more accessible for wheelchair users with Google Maps", requester Kotaro Hara, timer 00:07:00 of 3 hours]

High-level Results

81% accuracy without quality control

93% accuracy with quality control

I want to show some positive turker labeling examples

TURKER LABELING EXAMPLES

Curb Ramp Missing

Object in Path

Prematurely Ending Sidewalk

Surface Problems, Object in Path

And now some negative examples…

TURKER LABELING ISSUES

Overlabeling: some turkers are prone to high false positives

No Curb Ramp

Incorrect Object in Path label: the stop sign is in the grass

No problems in this image

3 Turker Majority Vote Label

Label table with columns T1, T2, T3, Maj. Vote and rows Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem, Other

T3 provides a label of low quality

To look into the effect of turker majority vote on accuracy, we had 28 turkers label each image and split them into groups:

28 groups of 1

9 groups of 3

5 groups of 5

4 groups of 7

3 groups of 9
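The group counts above follow from floor division of 28 by the group size. The sketch below shows one way such groups could be formed and voted. Shuffling with a fixed seed, dropping leftover turkers, and the function names are assumptions; the slides do not say how turkers were assigned to groups.

```python
import random


def make_groups(worker_ids, group_size, seed=0):
    """Split workers into as many full groups of group_size as possible.

    Workers are shuffled first (an assumption of this sketch); leftovers
    that do not fill a complete group are dropped.
    """
    shuffled = list(worker_ids)
    random.Random(seed).shuffle(shuffled)
    n_groups = len(shuffled) // group_size
    return [
        shuffled[g * group_size:(g + 1) * group_size] for g in range(n_groups)
    ]


def group_says_problem(votes_by_worker, group):
    """Majority vote of one group on a single yes/no label for one image."""
    yes_votes = sum(votes_by_worker[w] for w in group)
    return yes_votes > len(group) / 2


for size in (1, 3, 5, 7, 9):
    print(size, len(make_groups(range(28), size)))
# 1 28
# 3 9
# 5 5
# 7 4
# 9 3
```

Odd group sizes avoid ties in the vote, which is presumably why only sizes 1, 3, 5, 7, and 9 appear.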

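Multiclass Classification

The turker's labels are compared with the researcher majority vote for each of Object in Path, Curb Ramp Missing, Sidewalk Ending, and Surface Problem (in the example: Correct, Correct, Correct, Wrong)

Binary Classification

The four categories are collapsed into a single Problem label before comparison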

Image Level Accuracy

Average image-level accuracy (%); error bars: standard error

Multiclass accuracy uses 4 labels (Object in Path, Curb Ramp Missing, Sidewalk Ending, Surface Problem); binary accuracy uses 1 label (Problem)

Group size         Multiclass accuracy    Binary accuracy
1 turker (N=28)    78.3%                  80.6%
3 turkers (N=9)    83.8%                  86.9%
5 turkers (N=5)    86.8%                  89.7%
7 turkers (N=4)    86.6%                  90.6%
9 turkers (N=3)    87.9%                  90.2%

Accuracy saturates after 5 turkers

Two bars carry a standard error callout of 0.2%

Primary Study 3.2 Question

Can turkers validate other turkers’ labels to filter out mistakes?

Validators Labelers

After quality control, accuracy increased from 81% to 93%
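The slides do not spell out how validator responses were turned into a keep-or-drop decision. Purely as an illustration, the sketch below keeps a label only when a strict majority of its validators agreed with it. The `filter_labels` name, the `min_agree` threshold, and the data layout are all hypothetical.

```python
def filter_labels(label_ids, validator_votes, min_agree=0.5):
    """Keep labels whose share of agreeing validators exceeds min_agree.

    validator_votes maps a label id to a list of booleans, one per
    validator (True means the validator agreed with the label).
    Labels with no votes are dropped in this sketch.
    """
    kept = []
    for label_id in label_ids:
        votes = validator_votes.get(label_id, [])
        if votes and sum(votes) / len(votes) > min_agree:
            kept.append(label_id)
    return kept


# Hypothetical labels: "a" is confirmed, "b" is rejected, "c" was never validated.
votes = {"a": [True, True, False], "b": [False, False, True]}
print(filter_labels(["a", "b", "c"], votes))  # ['a']
```

Raising `min_agree` would trade recall for precision, since fewer borderline labels survive the filter.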

Limitations and Future Work

Increase Scalability

Build an automated crawler to collect images from Street View

Allow turkers to “walk” and control the camera angle in Street View and label sidewalk accessibility problems

Volunteer website

Computer Vision to Automate Accessibility Attribute Detection

SVM based Sliding Window Approach

Accessibility Aware Navigation System

Summary

81% labeling accuracy with no quality control

93% labeling accuracy with quality control

A Google Faculty Research Award kindly sponsored this work.

Questions?

Kotaro Hara | @kotarohara_en

Victoria Le | vnle@umd.edu

Jon E. Froehlich | @jonfroehlich
