Every Picture Tells a Story Amin Sadeghi, Mohsen Hejrati (IPM) Ali Farhadi, David Forsyth (UIUC)...

Every Picture Tells a Story… Amin Sadeghi, Mohsen Hejrati (IPM) Ali Farhadi, David Forsyth (UIUC) 1388/1/26




Page 1:

Every Picture Tells a Story…

Amin Sadeghi, Mohsen Hejrati (IPM)
Ali Farhadi, David Forsyth (UIUC)

1388/1/26

Page 2:

The Problem…

• Input: an image
• Output: a meaningful descriptive sentence

Available Resources:

• 1000 Pascal images, each with 5 sentences
• A large English corpus (not used yet)

Page 3:

Sample resources and outputs

• Training Data • Query • Output

A plane flying in sky.
A plane taking in airport.
A plane taking in sky.
A plane flying in airport.
A plant flying in sky.

•Bike resting on tree.

•A strange bicycle with four handlebars which has been locked to a tree trunk.

•There is some sort of odd bike with two sets of handlebars secured to a tree.

•Bluegreen bicycle locked to a tree on a cement sidewalk.

•A diffrent cycle is alone

Page 4:

Our proposed model

• Mapping from Image Space to Meaning Space

• Mapping from Sentence Space to Meaning Space

• Matching Sentences with Images
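The three components above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a meaning is a (subject, verb, scene) triple, that both mappings are already available, and that matching simply counts agreeing slots. The names `Meaning`, `match_score` and `best_sentence`, and the hand-written triples, are hypothetical.

```python
from typing import NamedTuple


class Meaning(NamedTuple):
    """A point in the (assumed) meaning space."""
    subject: str
    verb: str
    scene: str


def match_score(a: Meaning, b: Meaning) -> int:
    """Count how many of the three slots agree (0 to 3)."""
    return sum(x == y for x, y in zip(a, b))


def best_sentence(image_meaning, candidates):
    """Return the candidate sentence whose meaning best matches the image.

    candidates is a list of (sentence, Meaning) pairs.
    """
    return max(candidates, key=lambda c: match_score(image_meaning, c[1]))[0]


# Hand-written triples, for illustration only.
candidates = [
    ("A plane flying in sky.", Meaning("plane", "flying", "sky")),
    ("Bike resting on tree.", Meaning("bike", "resting", "tree")),
]
# Pretend the image-side mapping predicted this meaning for a query image.
predicted = Meaning("plane", "flying", "airport")
print(best_sentence(predicted, candidates))
```

Exact slot agreement is the simplest possible score; a softer score could give partial credit to related words.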

Page 5:

Sentence Space → Meaning Space

• Extract a sample subject, verb and scene from the sentences in the training data (to be used for training)

Sentences:
black cat over pink chair
A black color cat sitting on chair in a room.
cat sitting on a chair looking in a mirror.

Subject: Cat
Verb: Sitting
Scene: Room

• Use taxonomy trees

Object
  Vehicle: Car, Train, Bike
  Human
  Animal: Cat, Horse, Dog
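One way to read the extraction step is as keyword voting over the sentences of an image. The sketch below makes that assumption explicit; the slides do not say how the triple is actually extracted. The `VERBS` and `SCENES` vocabularies and all function names are hypothetical, and the taxonomy holds only the nodes shown on this slide.

```python
from collections import Counter

# Child -> parent links of the taxonomy tree shown on the slide.
PARENT = {
    "car": "vehicle", "train": "vehicle", "bike": "vehicle",
    "cat": "animal", "horse": "animal", "dog": "animal",
    "vehicle": "object", "human": "object", "animal": "object",
}
# Hypothetical vocabularies; the real lists are not given in the slides.
VERBS = {"sitting", "flying", "resting"}
SCENES = {"room", "sky", "airport"}


def tokens(sentence):
    """Lowercase words with trailing punctuation stripped."""
    return [w.strip(".,").lower() for w in sentence.split()]


def most_common(words, vocabulary):
    """Most frequent word that belongs to the vocabulary, or None."""
    counts = Counter(w for w in words if w in vocabulary)
    return counts.most_common(1)[0][0] if counts else None


def extract_triple(sentences):
    """Vote for one subject, verb and scene across all sentences."""
    words = [w for s in sentences for w in tokens(s)]
    leaves = set(PARENT) - set(PARENT.values())  # nodes with no children
    return (most_common(words, leaves),
            most_common(words, VERBS),
            most_common(words, SCENES))


def ancestors(word):
    """Walk up the taxonomy from a word to the root."""
    chain = []
    while word in PARENT:
        word = PARENT[word]
        chain.append(word)
    return chain


sentences = [
    "black cat over pink chair",
    "A black color cat sitting on chair in a room.",
    "cat sitting on a chair looking in a mirror.",
]
print(extract_triple(sentences))  # ('cat', 'sitting', 'room')
print(ancestors("cat"))           # ['animal', 'object']
```

The `ancestors` helper shows one possible use of the tree: a rare subject could fall back to a more general node such as "animal". Whether the authors use the taxonomy this way is not stated.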

Page 6:

Image Space → Meaning Space

• Build composite feature vectors
  – Use the output of a set of object/scene detectors to build a feature vector for each training image
  – Include Name, Position, rate, …

• Learn words given these vectors
  – Use SVM or AdaBoost to learn words given composite feature vectors
  – Learn subjects, verbs and scenes
  – Learn 100 words using the output of 40 object detectors and scene descriptors
  – Sometimes not enough examples to learn
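The first step above, building a composite feature vector, might look like the sketch below. It assumes each detector reports a name, a position and a score, and that the vector keeps the strongest detection per detector in a fixed order. The `Detection` type, the four-entry `DETECTORS` list (the slides mention 40) and the zero-filling for missing detectors are all illustrative assumptions. The classifier that consumes the vectors (SVM or AdaBoost in the slides) is not shown.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """One detector response (hypothetical layout)."""
    name: str     # which detector fired
    x: float      # normalized horizontal position in [0, 1]
    y: float      # normalized vertical position in [0, 1]
    score: float  # detector confidence


# Hypothetical detector list; a fixed order gives every image the same layout.
DETECTORS = ["plane", "bike", "cat", "sky"]


def composite_vector(detections):
    """Concatenate (x, y, score) of the best detection per detector.

    Detectors that did not fire contribute three zeros.
    """
    best = {}
    for d in detections:
        if d.name in DETECTORS and (d.name not in best or d.score > best[d.name].score):
            best[d.name] = d
    vector = []
    for name in DETECTORS:
        d = best.get(name)
        vector.extend([d.x, d.y, d.score] if d else [0.0, 0.0, 0.0])
    return vector


detections = [
    Detection("plane", 0.5, 0.4, 0.9),
    Detection("sky", 0.5, 0.2, 0.8),
    Detection("plane", 0.1, 0.1, 0.3),  # weaker duplicate, ignored
]
vec = composite_vector(detections)
print(len(vec))  # 12
print(vec[:3])   # [0.5, 0.4, 0.9]
```

One such vector per training image, paired with the words extracted from its sentences, would then form the training set for one binary classifier per word.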

Page 7:

Image Space → Meaning Space

2. Learn words given these vectors
  – Use SVM or AdaBoost to learn words given feature libraries
  – Learn subjects, verbs and scenes
  – Sometimes not enough examples to learn

SVM, AdaBoost

Page 8:

Image Space → Meaning Space

Page 9:

How to measure the Accuracy?

• Qualitatively or quantitatively?

• We need two quantitative measures:
  – One for the soundness of the sentence itself
  – One for the descriptive value of the words alone

Page 10:

Results…