What makes an image memorable? - MITweb.mit.edu/phillipi/www/posters/ImageMemorability VSS... ·...

1
Phillip Isola, Jianxiong Xiao, Antonio Torralba, Aude Oliva, MIT What makes an image memorable? Questions Are some images consistently more memorable than others, even across different observers and contexts? Intrinsic memorability? Can we predict it? Can we automatically predict an image’s memorability from its features? Memory Game What image content is driving memorability? What image content matters? ... Vigilance repeat Memory repeat 100 1-7 back 91-109 back time + + + + + + 1 sec 1.4 sec What classes of information predict memorability? What content makes an image memorable? Inter-subject consistency The memorability we measured is a stable property of an image that is shared across many observers and contexts. Sizeable effect Some images are consistently much more memorable than others. Database: 10442 photographs sampled from SUN database (Xiao et al. 2010). 2222 target images scored. Images annotated with LabelMe (Russell et al. 2008). Memorability = probability of correctly detecting a repeat after a single view of an image in a long stream. ? ? 665 participants on Amazon’s Mechanical Turk. Intrinsic memorability? ï SHUVRQ VLWWLQJ person floor VHDWV car EXLOGLQJ ï FHLOLQJ ï WUHH ï sky ï PRXQWDLQ ï 7RS REMHFWV %RWWRP REMHFWV Prediction algorithm: SVM Regression with non-linear kernels on the following feature sets. Object score = (prediction when object included in image’s feature vector) - (prediction when object removed) Objects shaded according to object score (computed per image) Common objects ranked according to object score (averaged across images) Common scenes ranked according to their average memorability 200 600 1000 1400 1800 2200 40% 50% 60% 70% 80% 90% 100% Image rank N, according to specified group Average % memorability, according to Group 1, of 25 images centered about rank N Group 1 Group 2 Chance 0 20 40 60 80 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Mean number of scores per image rank corr (l) l = 0.75 Rich features necessary Simple features not enough. Multivariate global features do better. Semantic annotations do best. Predictable from image Predicted memorable Predicted average Predicted forgettable QDWXUDO ODNH ERWDQLFDO JDUGHQ EURDGOHDI IRUHVW DUW VWXGLR EDWKURRP EDNHU\ VKRS This work is in press as Isola, Xiao, Torralba, and Oliva, CVPR 2011, and online at web.mit.edu/phillipi/Public/WhatMakesAnImageMemorable 3) Object segmentation statistics? histograms of object segments counts, sizes, and rough positions l = 0.20 GIST SIFT HOG SSIM Pixels 2) Global image features? pixel histograms, GIST, SIFT, HOG, SSIM l = 0.46 “Aquarium” 4) Object and scene semantics? number, size, and rough position of each object class; scene category l = 0.50 1) Simple image stats? e.g. mean hue, brightness, number of objects l = 0.16 Predicted rank N Average memorability for top N ranked images (%) Global features and annotations Other humans Chance 100 300 500 700 900 1100 70 75 80 85 l = 0.75 l = 0.54 Memorability (hit rate) Mean: 67.5% SD: 13.6% False alarm rate Mean: 10.7% SD: 7.6% Memorable Hit rate: 67/70 False alarm rate: 4/80 Average Hit rate: 59/81 False alarm rate: 7/92 Forgettable Hit rate: 21/68 False alarm rate: 3/82 Conclusions People, close-ups, and human-scale objects predict memorability Predictable from rich image features Stable intrinsic memorability

Transcript of What makes an image memorable? - MITweb.mit.edu/phillipi/www/posters/ImageMemorability VSS... ·...

Page 1: What makes an image memorable? - MITweb.mit.edu/phillipi/www/posters/ImageMemorability VSS... · 2011-05-03 · Phillip Isola, Jianxiong Xiao, Antonio Torralba, Aude Oliva, MIT What

Phillip Isola, Jianxiong Xiao, Antonio Torralba, Aude Oliva, MIT

What makes an image memorable?

Questions

Are some images consistently more memorable than others, even across different observers and contexts?

Intrinsic memorability?

Can we predict it?Can we automatically predict an image’s memorability from its features?

Memory Game

What image content is driving memorability?

What image content matters?

...Vigilance repeat

Memory repeat

100

1-7 back

91-109 back time

+ + + + + +1 sec 1.4 sec

What classes of information predict memorability?

What content makes an image memorable?

Inter-subject consistencyThe memorability we measured is a stable property of an image that is shared across many observers and

contexts.

Sizeable effectSome images are consistently much

more memorable than others.

Database: 10442 photographs sampled from SUN database (Xiao et al. 2010). 2222 target images scored. Images annotated with LabelMe (Russell et al. 2008).

Memorability = probability of correctly detecting a repeat after a single view of an image in a long stream.

? ?

665 participants on Amazon’s Mechanical Turk.

Intrinsic memorability?

person floor car

sky

Prediction algorithm: SVM Regression with non-linear kernels on the following feature sets.

Object score = (prediction when object included in image’s feature vector) - (prediction when object removed)

Objects shaded according to object score (computed per

image)

Common objects ranked according

to object score (averaged

across images)

Common scenes ranked according to their average

memorability

200 600 1000 1400 1800 2200

40%

50%

60%

70%

80%

90%

100%

Image rank N, according to specified group

Average % memorability, according to

Group 1, of 25 images

centered about rank N

Group 1

Group 2

Chance

0 20 40 60 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Mean number of scores per image

rank corr( )

= 0.75

Rich features necessary

Simple features not enough. Multivariate global features do better.

Semantic annotations do best.

Predictable from image

Predicted memorable

Predicted average

Predicted forgettable

This work is in press as Isola, Xiao, Torralba, and Oliva, CVPR 2011, and online at web.mit.edu/phillipi/Public/WhatMakesAnImageMemorable

3) Object segmentation statistics?

histograms of object segments counts, sizes, and rough positions

= 0.20

GIST

SIFT HOG SSIM

Pixels

2) Global image features?pixel histograms, GIST, SIFT, HOG, SSIM

= 0.46

“Aquarium” 4) Object and scene semantics?

number, size, and rough position of each object class; scene category

= 0.50

1) Simple image stats?e.g. mean hue, brightness, number of objects

= 0.16

Predicted rank N

Average memorability

for top N ranked

images (%)

Global features and annotations

Other humans

Chance

100 300 500 700 900 1100

70

75

80

85 = 0.75

= 0.54

Memorability (hit rate) Mean: 67.5% SD: 13.6%

False alarm rate Mean: 10.7% SD: 7.6%

MemorableHit rate: 67/70False alarm rate: 4/80

AverageHit rate: 59/81False alarm rate: 7/92

ForgettableHit rate: 21/68False alarm rate: 3/82 Conclusions

People, close-ups, and human-scale objects predict memorability

Predictable from rich image features

Stable intrinsic memorability