
Image Management Using Automatic Recognition Systems

Bongwon Suh

Computer Science Department

Human Computer Interaction Laboratory

University of Maryland at College Park

[email protected]


Dec. 6th, 2004 Palo Alto Research Center - Bongwon Suh 2

Overview

Image Management Problems
  Thumbnail Presentation
  Lack of Metadata
Zoomable User Interfaces
Automatic Thumbnail Cropping
Semi-Automatic Photo Annotation


Motivation

Managing personal documents
  An ever-increasing amount of personal documents
  Sometimes it is easier to find information on the Web than on your local hard disk.
Tools
  Microsoft Longhorn, Apple OS X Tiger
  Google Desktop Search

But what about photos?


Image Management Problems

Document management
  Indexing, organizing, searching, browsing, sharing, and so on
  Extend conventional information management principles for image management
  E.g., using image captions and annotated keywords

Two additional challenges for image management
  Thumbnail presentation
  Metadata acquisition


Thumbnails

Bigger thumbnails use more screen space; smaller thumbnails are hard to recognize.

Two views of the same directory: Detail View mode and Preview mode


Lack of Metadata

Metadata
  Pieces of information associated with photos
Digital image
  A stream of color pixels
  Hard to extract useful metadata

When a cat sits quietly, staring off at something, and the tail twitches slowly, the cat is concentrating on something. If a cat is lashing his tail back and forth quickly, it means he is annoyed and angry. This is when a cat is likely to bite or scratch.

Extracting “cat”?


Overview

Image Management Problems
  Thumbnail Presentation
  Lack of Metadata
Zoomable User Interfaces
Automatic Thumbnail Cropping
Semi-Automatic Photo Annotation


Zoomable User Interface

2.5-dimensional environment
  2D + depth
  Zooming and panning for navigation
  ZUIs depend on humans’ ability to remember where things are in space.

PhotoMesa
  Zoomable image browser
  Uses a treemap algorithm to lay out photos
  Capable of showing thousands of images on the screen
  Commercialized: http://www.photomesa.com
  [Bederson UIST 2001]
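To make the idea of a treemap layout concrete, here is a minimal sketch of the simplest variant, slice-and-dice, which splits a rectangle among groups in proportion to their photo counts and alternates the split direction at each level. It is a stand-in for illustration only, not PhotoMesa's own layout algorithm; the function names, the nested-dict input format, and the sample groups are assumptions.

```python
def weight(node):
    """Number of photos under a node (a leaf is an int count, a group is a dict)."""
    return node if isinstance(node, int) else sum(weight(c) for c in node.values())

def slice_and_dice(node, rect, vertical=True, path="", out=None):
    """Assign a rectangle (x, y, width, height) to every leaf group.

    Assumes every group holds at least one photo.
    """
    out = {} if out is None else out
    if isinstance(node, int):
        out[path] = rect
        return out
    x, y, w, h = rect
    total = weight(node)
    offset = 0.0
    for name, child in node.items():
        share = weight(child) / total
        if vertical:   # side-by-side columns
            child_rect = (x + offset * w, y, share * w, h)
        else:          # stacked rows
            child_rect = (x, y + offset * h, w, share * h)
        # Alternate the split direction one level down.
        slice_and_dice(child, child_rect, not vertical, f"{path}/{name}", out)
        offset += share
    return out

# Hypothetical collection: group name -> photo count (or nested groups).
photos = {"2004": {"camping": 60, "birthday": 20}, "2003": 20}
layout = slice_and_dice(photos, (0, 0, 800, 600))
for group, rect in layout.items():
    print(group, rect)
```

Each group's screen area is proportional to its share of the photos, which is what lets a single screen hold the whole collection.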


Overview

Image Management Problems
  Thumbnail Presentation
  Lack of Metadata
Zoomable User Interfaces
Automatic Thumbnail Cropping
Semi-Automatic Photo Annotation


Thumbnail Cropping

Fit images into limited screen space
  Image shrinking (plain thumbnail): we lose detailed information
  Image cropping: we lose part of the information

Remove the periphery and show the core objects bigger on the screen
  Select the portion of maximal informativeness
  Preserve the recognizability of important objects in thumbnails


Thumbnail Cropping Example

[Figure: an original image and its cropped version (periphery removed), with the thumbnails generated from each.]

Plain thumbnail: shrinking (subsampling)
Cropped thumbnail: crop first, then shrink the cropped image
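The difference between the two thumbnail types can be sketched on a toy image. The snippet below is an illustration under simple assumptions: the image is a list of pixel rows, shrinking is plain subsampling, and the crop box is chosen by hand. The helper names are hypothetical.

```python
def crop(image, left, top, width, height):
    """Keep only the pixels inside the given box."""
    return [row[left:left + width] for row in image[top:top + height]]

def shrink(image, factor):
    """Subsample: keep every factor-th pixel in both directions."""
    return [row[::factor] for row in image[::factor]]

# Toy 8x8 "image" whose pixel value encodes its position (row * 8 + column).
image = [[r * 8 + c for c in range(8)] for r in range(8)]

plain = shrink(image, 4)                        # whole image, shrunk 4x
cropped = shrink(crop(image, 2, 2, 4, 4), 2)    # central 4x4 region, shrunk 2x

print(plain)    # [[0, 4], [32, 36]]
print(cropped)  # [[18, 20], [34, 36]]
```

Both thumbnails are 2x2, so they use the same screen space, but the cropped one shows the central region at twice the scale.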


More Examples

[Figure: a set of plain thumbnails and a set of cropped thumbnails.]

Both sets use the same amount of screen space


Automatic Thumbnail Cropping

Which part is more informative?
  Need to measure informativeness

Saliency-based thumbnail cropping
  Improve cropping by using a dynamic threshold

Face-detection-based thumbnail cropping
  Applying existing techniques as an example of detecting semantic information in images


Saliency Based Thumbnail Cropping

Saliency
  Visual attention model (color, intensity, and orientation)
  Itti and Koch (1998, 1999)
  Does not need prior knowledge about images
Assumption
  Higher saliency means more informativeness

[Figure: original image and its computed saliency map.]


Saliency Based Cropping

Find a minimum-size rectangle that contains a certain portion (threshold) of the total saliency

Static threshold algorithm
  Brute-force algorithm: requires exhaustive search
  Greedy algorithm: keeps increasing the cropping bounds
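One possible reading of the greedy approach is sketched below: start at the most salient cell and repeatedly extend the rectangle by one row or column on whichever side adds the most saliency, until the rectangle holds the requested fraction of the total. This growth rule, the function name, and the toy saliency map are assumptions made for illustration; the slide only states that the cropping bounds keep increasing.

```python
def greedy_crop(saliency, threshold):
    """Grow a rectangle until it contains `threshold` (0..1) of the total saliency.

    saliency: rows of non-negative numbers.
    Returns (top, left, bottom, right) as inclusive indices.
    """
    rows, cols = len(saliency), len(saliency[0])
    total = sum(map(sum, saliency))
    # Start from the single most salient cell.
    top, left = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: saliency[rc[0]][rc[1]],
    )
    bottom, right = top, left
    inside = saliency[top][left]
    while inside < threshold * total:
        # Saliency gained by extending each side by one row or column.
        gains = {}
        if top > 0:
            gains["top"] = sum(saliency[top - 1][left:right + 1])
        if bottom < rows - 1:
            gains["bottom"] = sum(saliency[bottom + 1][left:right + 1])
        if left > 0:
            gains["left"] = sum(saliency[r][left - 1] for r in range(top, bottom + 1))
        if right < cols - 1:
            gains["right"] = sum(saliency[r][right + 1] for r in range(top, bottom + 1))
        if not gains:
            break  # the rectangle already covers the whole map
        side = max(gains, key=gains.get)
        inside += gains[side]
        if side == "top":
            top -= 1
        elif side == "bottom":
            bottom += 1
        elif side == "left":
            left -= 1
        else:
            right += 1
    return top, left, bottom, right

saliency = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 8, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
]
print(greedy_crop(saliency, 0.8))  # (1, 1, 2, 2)
```

The greedy result is not guaranteed to be the minimum-size rectangle; that is the price paid for avoiding the exhaustive search of the brute-force version.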


Dynamic Saliency Threshold

The most effective threshold varies from image to image
  Scattered saliency: only a little can be cut out
  Gathered saliency: close cropping is possible

[Figure: example images with scattered saliency and with gathered saliency.]


Area–Saliency Sum Graph

Compute an optimal cropping rectangle for each saliency threshold

[Figure: Area–Saliency Sum Graph for an original image. Axes: sum of saliency values inside the area, and cropping area. Marked points: 70% of the saliency is contained inside 30% of the area; 80% of the saliency is contained inside 50% of the area.]


Cropping With Dynamic Threshold

Find a point of diminishing returns
  Adding small amounts of saliency requires a large increase of the cropping bounds
  Binary search for the maximum gradient point

[Figure: Area-Saliency Sum Graph with the maximum gradient point marked. Axes: sum of saliency values inside the area, and cropping area.]
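The point of diminishing returns can be illustrated on a sampled area-saliency sum graph. The sketch below takes a list of (saliency fraction, area fraction) points and returns the saliency fraction just before the steepest rise in area. It uses a plain linear scan instead of the binary search mentioned on the slide, and the two sample curves are made-up numbers, so treat it as one possible interpretation of "maximum gradient point" rather than the method used in the talk.

```python
def dynamic_threshold(curve):
    """Pick the threshold just before the area starts growing fastest.

    curve: list of (saliency_fraction, area_fraction) pairs,
    sorted by saliency_fraction, with at least two points.
    """
    best_slope, best_threshold = float("-inf"), curve[0][0]
    for (s0, a0), (s1, a1) in zip(curve, curve[1:]):
        slope = (a1 - a0) / (s1 - s0)   # extra area per unit of extra saliency
        if slope > best_slope:
            best_slope, best_threshold = slope, s0
    return best_threshold

# Hypothetical curves: saliency concentrated in one spot vs. spread out.
gathered = [(0.5, 0.05), (0.6, 0.08), (0.7, 0.12), (0.8, 0.18), (0.9, 0.60), (1.0, 1.0)]
scattered = [(0.5, 0.40), (0.6, 0.50), (0.7, 0.60), (0.8, 0.70), (0.9, 0.82), (1.0, 1.0)]

print(dynamic_threshold(gathered))   # 0.8
print(dynamic_threshold(scattered))  # 0.9
```

With gathered saliency the chosen threshold is lower, so the crop is tighter; with scattered saliency the threshold stays high and little is cut out.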


Static And Dynamic Threshold

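[Figure: original images cropped with a static 90% threshold and with the dynamic threshold. The static threshold cuts out too little in one example and too much in another. Each Area-Saliency Sum Graph marks the 0.9 threshold and the maximum gradient point.]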


Face Detection Based Thumbnail Cropping

When semantic information of images can be detected, more efficient cropping is possible

Face Detection: Schneiderman and Kanade (2000)

[Figure: original image, face detection result, face-detection-based cropping, and the resulting face-detection-based thumbnail next to the plain thumbnail.]


User Study Design

Participants
  Twenty students recruited on campus
Tasks
  Recognition task
  Visual search task
Image sets
  Animal Set: common objects
  Corbis Set: professionally prepared photos
  Face Set: well-known figures (e.g., entertainers)
Thumbnail techniques
  No cropping
  Saliency-based cropping
  Face-detection-based cropping


Recognition Task

To measure the effect of thumbnail techniques on object recognition
  Target thumbnails were shown for two seconds.
  Participants were asked to click what they saw.
  Measured recognition accuracy: # of right answers / # of total tasks

[Figure: Face Set and Animal Set examples.]


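Recognition Task Hypothesis

[Figure: hypothesized recognition rate (up to 100%) versus thumbnail size for Thumbnail Technique A and Thumbnail Technique B. At the smallest sizes, thumbnails are too small anyway; at the largest sizes, they are big enough to be recognized in both cases. Effect on recognition rate? Meaningful in the range between.]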


Recognition Task Result

All curves differ from one another (p < 0.01).

[Figure: recognition results for the Animal Set and the Face Set. Curves: face-detection-based cropped, saliency-based cropped, and no cropping.]


Visual Search Task

Find an image that matches a given task description
  Verbal description (except faces)
  PhotoMesa interface
Measured browsing completion time
3x3 within-subject ANOVA
  Three thumbnail techniques
  Three image sets


Visual Search Task Result

No cropping vs. saliency-based cropping
  18%, 24%, and 23% faster, respectively
  F(1, 190) = 3.82, p = 0.05
Three thumbnail techniques on Face Set
  Visual search with face-detection-based thumbnails is 50% faster
  F(2, 87) = 4.56, p = 0.013

[Figure: browsing time (sec.).]


Overview

Image Management Problems
  Thumbnail Presentation
  Lack of Metadata
Zoomable User Interfaces
Automatic Thumbnail Cropping
Semi-Automatic Photo Annotation


Acquiring Metadata

From devices
  Basic information such as date, image size, and so on
  Adding a GPS unit to a digital camera
From context
  Image from a web page: use captions and surrounding text
Image analysis
  Color, texture, face, and so on
Manual annotation
  Most reliable, accurate, and relevant
  Slow, tedious

Automatic extraction: inaccurate, irrelevant


Semi-Automatic Annotation

Incremental and interactive annotation
Appropriate user interfaces are important

[Diagram: an Automatic Metadata Extraction Manager and a Semi-Automatic Annotation Interface (Browse, Search, Annotate). Labels: photos ready to be annotated; automatic suggestion with available knowledge; relevance feedback (fix errors); update knowledge.]


Relevant Metadata

Chronological order
  Last Halloween
Event information
  Birthday party, camping trip
  Often associated with location
People in photos

[Rodden, CHI2002]


Semi-Automatic PHoto Annotation and Recognition Interface (SAPHARI)

Browse, search, and annotate
Image clustering
  Facilitate bulk annotation
  Using available metadata for clustering
  Event grouping
  Face grouping
Treemap layout
Zoomable user interface


Event Based Annotation

Personal photo collection
  Bursty or episodic
Using pause as event boundary
  Event gap detection
  How large is the current temporal gap (compared to neighbors)?

$$\log(t_{i+1} - t_i) \ge K + \frac{1}{2d+1} \sum_{j=-d}^{d} \log(t_{i+j+1} - t_{i+j})$$

[Platt, 2001]
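The event gap test compares the log of the current time gap with the average log gap of its neighbors and declares a boundary when it is larger by at least K. The sketch below follows that rule. The default values of K and d, the truncation of the window at the ends of the collection, and the one-second floor on gaps are assumptions made for illustration and are not given on the slide.

```python
import math

def event_boundaries(times, K=math.log(17), d=10):
    """Return indices i such that a new event starts between photo i and photo i + 1.

    times: capture times in seconds, sorted in ascending order.
    K, d: sensitivity offset and half-window size (illustrative defaults).
    """
    # Log of each gap between consecutive photos; floor at 1 s to avoid log(0).
    log_gaps = [math.log(max(b - a, 1.0)) for a, b in zip(times, times[1:])]
    boundaries = []
    for i, log_gap in enumerate(log_gaps):
        # Neighboring gaps within d positions; the window is truncated at the edges.
        window = log_gaps[max(0, i - d): i + d + 1]
        local_average = sum(window) / len(window)
        if log_gap >= K + local_average:
            boundaries.append(i)
    return boundaries

# Five photos one minute apart, a one-day pause, then five more photos.
times = [i * 60 for i in range(5)] + [86400 + 240 + i * 60 for i in range(5)]
print(event_boundaries(times))  # [4]
```

Because the comparison is made in log space against the local average, the same K works for bursts shot seconds apart and for collections with photos days apart.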


Event Hierarchy

Require different levels of granularity

Coarse grouping
  Summer Camping Trip (June 13th – June 17th)
  William’s Birthday (June 23rd)
Fine grouping
  Summer Camping Trip: Hiking (June 14th), Canoeing (June 15th), Santa Cruz (June 16th)
  William’s Birthday: Party at Kinder (3pm – 4pm), Family Dinner (6pm – 8pm)


Event Group Example

Provide multiple views for the same photo collection
  Coarse grouping
  Fine grouping
  By month
  By directory
Fixing event group boundaries
  Automatically update event group boundaries of other levels in the event hierarchy


Clothing Based Annotation

Face recognition
  Not applicable for personal photos
  Less than 50% accuracy (even state-of-the-art systems)
Face detection
  Identify the location of a face
  Higher accuracy than face recognition
People usually don’t change clothing during a day
  Use clothing information instead of facial information


Human Model

Find the upper body part (torso) of a human by using a face detection technique
  Viola-Jones face detector
Convert the torso into a mathematical model
  4D samples (relative distance from the neck, red, green, blue)
  Gaussian sampling (more weight on the center line)
  Build a 4D histogram with the samples

[Figure: detected face, neck, and torso, sampled with more weight on the center axis.]
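A rough sketch of how such a clothing model could be built is shown below. Given a detected face box, it takes a region below the face as the torso, turns each pixel into a 4D sample (relative distance from the neck, red, green, blue), weights it with a Gaussian centered on the vertical center line, and accumulates a normalized 4D histogram. The torso geometry (height, width), the Gaussian spread, the number of bins, and the function name are all assumptions for illustration; the slide does not specify them, and the face box is assumed to come from a separate detector.

```python
import math
from collections import defaultdict

def torso_histogram(image, face, bins=4):
    """Build a normalized, sparse 4D histogram of the clothing below a face.

    image: rows of (r, g, b) tuples with values 0..255.
    face:  (x, y, width, height) of the detected face, in integer pixels.
    Returns {(distance_bin, r_bin, g_bin, b_bin): weight}, weights summing to 1.
    """
    fx, fy, fw, fh = face
    neck_y = fy + fh            # assumed: the neck sits at the bottom of the face box
    center_x = fx + fw / 2      # vertical center line of the body
    torso_h = 2 * fh            # assumed torso height
    half_w = fw                 # assumed torso half-width
    sigma = fw / 2              # assumed spread of the Gaussian weight
    x_start = max(0, int(center_x - half_w))
    x_end = min(len(image[0]), int(center_x + half_w) + 1)
    hist = defaultdict(float)
    for y in range(neck_y, min(neck_y + torso_h, len(image))):
        dist_bin = min(int((y - neck_y) / torso_h * bins), bins - 1)
        for x in range(x_start, x_end):
            r, g, b = image[y][x]
            # More weight for pixels close to the center line.
            weight = math.exp(-((x - center_x) ** 2) / (2 * sigma ** 2))
            hist[(dist_bin, r * bins // 256, g * bins // 256, b * bins // 256)] += weight
    total = sum(hist.values())
    return {k: v / total for k, v in hist.items()} if total else {}

# Toy 12x8 image: red in the top four rows, blue below; face box at the top.
image = [[(255, 0, 0) if y < 4 else (0, 0, 255) for x in range(8)] for y in range(12)]
model = torso_histogram(image, face=(3, 0, 2, 2))
for key in sorted(model):
    print(key, round(model[key], 3))
```

Including the distance from the neck as a histogram dimension keeps some spatial layout, so a red shirt over blue trousers is not confused with the reverse.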


Compute Visual Distance

[Diagram: color pixels are sampled from the identified clothing as (y-distance, r, g, b) and converted into a model f(X), a four-dimensional probability density function (pdf) of frequencies. f(X) is compared with the pre-identified models gM1(X), ..., gMn(X) using the Bhattacharyya distance.]
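For two discrete distributions p and q, the Bhattacharyya coefficient is the sum over bins of sqrt(p_i * q_i), and one common form of the Bhattacharyya distance is the negative log of that coefficient. The sketch below applies this form to sparse histograms and picks the closest pre-identified model. Which exact form the talk used is not stated, and the model names and bin values are made up for illustration.

```python
import math

def bhattacharyya_distance(p, q):
    """Distance between two normalized sparse histograms {bin: probability}.

    0 for identical histograms, infinity when they share no bins.
    """
    coefficient = sum(math.sqrt(p[k] * q[k]) for k in p.keys() & q.keys())
    if coefficient == 0:
        return float("inf")
    # Clamp to 1 so rounding error cannot produce a negative distance.
    return -math.log(min(coefficient, 1.0))

def closest_model(sample, models):
    """Name of the pre-identified model with the smallest distance to the sample."""
    return min(models, key=lambda name: bhattacharyya_distance(sample, models[name]))

# Hypothetical 4D bins: (distance_bin, r_bin, g_bin, b_bin).
red_top, white_top, blue_top = (0, 3, 0, 0), (0, 3, 3, 3), (0, 0, 0, 3)
sample = {red_top: 0.7, white_top: 0.3}
models = {
    "person_a": {red_top: 0.6, white_top: 0.4},
    "person_b": {blue_top: 0.9, white_top: 0.1},
}
print(closest_model(sample, models))  # person_a
```

A new torso would then be suggested as the person whose stored clothing model is nearest, with the user confirming or correcting the suggestion.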


User Study Results

Semi-controlled user study
  Seven participants, using their own photo collections
Event annotation
  Event-based group vs. user’s own directory
  55% faster, statistically significant
  Valid grouping + zoomable user interface
Face annotation
  Clothing-based group vs. manual
  15% faster (not significant)
  Unanimously preferred: F(2, 18) = 21.1, p < 0.01


Clothing Based Group

As the number of faces gets larger, the semi-automatic annotation becomes more efficient.

[Figure: two charts, "Semi-Automatic Face Annotation (Clothing Based)" and "Manual Face Annotation". Each plots time per annotation (sec., 0 to 7) against the number of annotated faces (0 to 100).]


Conclusion

PhotoMesa
  Scale up the size of image collections that users can comfortably browse
Automatic Thumbnail Cropping
  Create better thumbnails that can fit into limited screen space
Semi-Automatic Photo Annotation
  Help users make accurate annotations with less effort


Research On Other Topics

Popout Prism
  “Overview+Detail” Web browser
  Apply visual perception principles
  CHI 2002
OZONE
  Zoomable ontology browser
  DAML, the Semantic Web
  AVI 2002


Thank You

Benjamin B. Bederson
David W. Jacobs
Haibin Ling
Catherine Plaisant

http://www.cs.umd.edu/~sbw [email protected]