"Assistive Technology for the Visually Impaired," a Presentation from UC Santa Cruz

(Computer) Vision Without Sight

Roberto Manduchi

Computer Engineering

UC Santa Cruz

Research at JPL (circa 2000)

Sensors

Can we go from here …

… to here?

Services that “provide equipment or systems, standardized or individualized, whose aim is to improve or maintain the functional capabilities of individuals with disabilities”.

M.J. Fuhrer, NIH

Assistive Technology:

1. You cannot drive your car

2. You cannot read the paper

3. You may trip over an obstacle

4. You may miss a sign far away

5. You may not be able to cross a street safely

6. You may not find what you are looking for at the supermarket

7. You may get lost in a new place

8. You may not receive a proper education

9. You may have problems finding a job

10. You may not recognize friends from a distance

11. You may lose objects in your home

12. You may have problems surfing the Web

13. You may not know who is in the room

14. You may not be able to read this line

If you cannot see well...

Success stories

Screen readers

Screen magnifiers

Braille interfaces

Enlargers/telescopes

Success stories

Accessible GPS

Money reader

Object recognition

OCR

The promise of crowdsourcing

Volunteer remote helpers using FaceTime from an iPhone

Trained agents using video stream and ancillary data from Google Glass

View from the agent’s dashboard

The process of navigating through an environment and traveling to places by relatively direct paths.

R.G. Long, E.W. Hill

Finding the way is not a gift or an innate ability… it is a precondition for life itself. Knowing where I am, my location, is the precondition for knowing where I have to go, wherever it may be.

Otl Aicher

Wayfinding:

• Prior information

–Maps

–Verbal directions

• Path integration

–Continuous update of egocentric coordinates of starting location

–Remembering the path traversed, turns, etc.

• Piloting

–Sensing one’s positional information to determine one’s location

–Reading signs

–Noticing landmarks (acoustic, tactile, smells, heat…)

Wayfinding for sighted people

• Prior information

–Maps (tactile)

–Verbal directions

• Path integration

–Continuous update of egocentric coordinates of starting location

–Counting steps, turns, etc.

• Piloting

–Sensing one’s positional information to determine one’s location

–Reading signs (Braille)

–Noticing landmarks (acoustic, tactile, smells, heat…)

Wayfinding for blind people
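
The path integration item above (counting steps and turns to keep track of where the starting location is) can be illustrated with a small dead-reckoning sketch. This is not from the presentation: the function names, the 0.7 m default step length, and the convention that positive turns are to the left are all assumptions made for illustration.

```python
import math


def integrate_path(moves, step_length_m=0.7):
    """Dead-reckon a pose from a list of (turn_degrees, step_count) moves.

    For each move the walker first turns by turn_degrees (positive = left,
    an assumed convention) and then walks step_count steps straight ahead.
    The start is at (0, 0), facing along +x.
    Returns (x, y, heading_degrees) relative to the start.
    """
    x = y = 0.0
    heading = 0.0
    for turn_degrees, step_count in moves:
        heading = (heading + turn_degrees) % 360.0
        distance = step_count * step_length_m  # assumed constant step length
        x += distance * math.cos(math.radians(heading))
        y += distance * math.sin(math.radians(heading))
    return x, y, heading


def homing_vector(x, y, heading):
    """Egocentric coordinates of the starting location.

    Returns (distance, relative_bearing_degrees). The bearing is measured
    from the current heading, in [-180, 180), positive = to the left.
    """
    distance = math.hypot(x, y)
    absolute_bearing = math.degrees(math.atan2(-y, -x))
    relative_bearing = (absolute_bearing - heading + 180.0) % 360.0 - 180.0
    return distance, relative_bearing


# Ten steps ahead, then a left turn and ten more steps (1 m steps).
pose = integrate_path([(0, 10), (90, 10)], step_length_m=1.0)
print(homing_vector(*pose))
```

In practice step length varies and turn estimates drift, so errors accumulate with distance walked; this is one reason path integration is combined with piloting (landmarks, signs).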

GPS is only a partial solution

• Works only outdoors

• ~10m resolution

Will take you to locations and won’t get you lost, but…

…where is the door?

• Tactile paving

• Accessible pedestrian signals

• Light beaconing (Talking Signs)

• RFID, iBeacons

Supporting Infrastructure

Computer vision to the rescue:

Geometric reconstruction

Line-based geometry reconstruction from 2 views

Line-based SLAM

Computer vision to the rescue:

Text spotting/reading

[Pipeline diagram. Labels: INPUT, MSER, Clustering, Pruning, Resizing, Binarization, CNN, Line grouping, Heat map, Compound patches, OUTPUT]

Text spotting pipeline

Results on ICDAR dataset

Open Problems

• How can we communicate spatial information using non-visual interfaces?

• How to operate a camera to find/recognize a target without sight?

Mobile OCR works well…

…when you can take a good picture of the document!

Example: Mobile OCR

E.g. OrCam: user indicates the region to process by pointing at it with their finger

Accessible Mobile OCR

• Accessible OCR apps often offer an opportunistic discovery feature

–The system analyzes the video stream from the camera

–When a “good” frame is detected, this is passed on to OCR

Prizmo, Text Detective
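
The opportunistic discovery loop described above can be sketched as follows. This is only an illustration of the control flow, not how any particular app works: `opportunistic_capture`, `is_good_frame`, and `run_ocr` are hypothetical names, and the toy frames with a `sharpness` score stand in for real camera images.

```python
def opportunistic_capture(frames, is_good_frame, run_ocr):
    """Scan a stream of frames and run OCR on the first "good" one.

    frames:        any iterable of frames (e.g. a video stream)
    is_good_frame: hypothetical predicate deciding whether a frame is usable
    run_ocr:       hypothetical function turning a frame into text
    Returns (frame_index, ocr_result), or None if no good frame was seen.
    """
    for index, frame in enumerate(frames):
        if is_good_frame(frame):
            return index, run_ocr(frame)
    return None


# Toy stand-ins for camera frames: a quality score plus the text in view.
stream = [
    {"sharpness": 0.2, "text": "EX"},
    {"sharpness": 0.4, "text": "EXI"},
    {"sharpness": 0.9, "text": "EXIT"},
]
result = opportunistic_capture(
    stream,
    is_good_frame=lambda frame: frame["sharpness"] > 0.8,
    run_ocr=lambda frame: frame["text"],
)
print(result)
```

Note that the user gets no feedback while frames are being rejected: the loop simply waits until a usable frame happens to appear.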

Guided Mobile OCR

• Real-time text spotting and line detection

• Computes whether current frame is OCR-readable (enough resolution, enough margin)

• If not, produces guidance instructions (‘up’, ‘left’,…)

• Captures a high-resolution image for OCR processing
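
The readability check and guidance step listed above might look roughly like the sketch below. It is an assumption-laden illustration, not the system presented: the `Box` type, the pixel thresholds, and the extra ‘closer’, ‘farther’, and ‘ready’ outputs are hypothetical, and an instruction is taken to mean the direction in which the user should move the camera.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Bounding box of the detected text, in pixels (origin at top-left)."""
    left: int
    top: int
    right: int
    bottom: int


def guidance(text_box, frame_width, frame_height,
             min_margin=20, min_text_height=40):
    """Return 'ready' if the frame looks OCR-readable, else an instruction.

    Thresholds are illustrative assumptions. 'left'/'right'/'up'/'down'
    name the direction to move the camera so that the text gains margin
    on that side of the frame.
    """
    margins = {
        "left": text_box.left,
        "right": frame_width - text_box.right,
        "up": text_box.top,
        "down": frame_height - text_box.bottom,
    }
    tight = [side for side, margin in margins.items() if margin < min_margin]

    # Text touches two opposite edges: it does not fit, so back away.
    if {"left", "right"} <= set(tight) or {"up", "down"} <= set(tight):
        return "farther"
    if tight:
        return tight[0]
    # Margins are fine; check that the text is large enough to resolve.
    if text_box.bottom - text_box.top < min_text_height:
        return "closer"
    return "ready"


print(guidance(Box(5, 100, 300, 200), 640, 480))
print(guidance(Box(100, 100, 500, 300), 640, 480))
```

Once `guidance` returns `'ready'`, the app would capture the high-resolution image and hand it to OCR.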

Guided OCR - results

• Without system assistance or prior training, it can be extremely difficult to acquire readable images

• Guidance is more efficient than opportunistic discovery

• By using our guidance app, our participants learnt to take better pictures – even without assistance!

Thank you!
