Every Picture Tells a Story Amin Sadeghi, Mohsen Hejrati (IPM) Ali Farhadi, David Forsyth (UIUC)...
Every Picture Tells a Story…
Amin Sadeghi, Mohsen Hejrati (IPM)
Ali Farhadi, David Forsyth (UIUC)
1388/1/26
The Problem…
• Input: an image
• Output: a meaningful descriptive sentence
Available Resources:
• 1000 Pascal images, each with 5 sentences
• A large English corpus (not used yet)
Sample resources and outputs
• Training Data • Query • Output
A plane flying in sky.
A plane taking in airport.
A plane taking in sky.
A plane flying in airport.
A plant flying in sky.
• Bike resting on tree.
• A strange bicycle with four handlebars which has been locked to a tree trunk.
• There is some sort of odd bike with two sets of handlebars secured to a tree.
• Bluegreen bicycle locked to a tree on a cement sidewalk.
• A diffrent cycle is alone
Our proposed model
• Mapping from Image Space to Meaning Space
• Mapping from Sentence Space to Meaning Space
• Matching Sentences with Images
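The three steps above can be sketched as follows. This is only a minimal illustration, assuming a meaning is a (subject, verb, scene) triple and that a sentence matches an image when their triples agree on many slots. The names `Meaning`, `slot_overlap` and `rank_sentences` are hypothetical, and the slides do not say which matching score is actually used.

```python
from typing import NamedTuple

class Meaning(NamedTuple):
    """A point in the meaning space (assumed here to be a triple)."""
    subject: str
    verb: str
    scene: str

def slot_overlap(a: Meaning, b: Meaning) -> int:
    """Count the slots (subject, verb, scene) on which two meanings agree."""
    return sum(x == y for x, y in zip(a, b))

def rank_sentences(image_meaning, sentence_meanings):
    """Sort candidate sentences by agreement with the image's meaning, best first."""
    return sorted(sentence_meanings,
                  key=lambda s: slot_overlap(image_meaning, sentence_meanings[s]),
                  reverse=True)

# Meaning predicted for a query image (made-up value for illustration).
image_meaning = Meaning("plane", "flying", "sky")

# Candidate sentences already mapped to the meaning space.
candidates = {
    "A plane flying in sky.": Meaning("plane", "flying", "sky"),
    "A plane taking in airport.": Meaning("plane", "taking", "airport"),
    "Bike resting on tree.": Meaning("bike", "resting", "tree"),
}
ranked = rank_sentences(image_meaning, candidates)
```

Exact slot equality is the crudest possible score; a softer word similarity could replace `==` without changing the rest.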
Sentence Space → Meaning Space
• Extract a sample subject, verb and scene from the sentences in the training data (to be used for training)
Subject: Cat
Verb: Sitting
Scene: Room
black cat over pink chair
A black color cat sitting on chair in a room.
cat sitting on a chair looking in a mirror.
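One simple way to get such a triple from the sentences of an image is a vocabulary lookup followed by a majority vote. The sketch below is an assumption, not the method from the slides: the word lists `SUBJECTS`, `VERBS` and `SCENES` are hypothetical, and a real system would more likely use a parser.

```python
import re
from collections import Counter

# Hypothetical vocabularies; the real word lists are not given in the slides.
SUBJECTS = {"cat", "dog", "horse", "bike", "plane"}
VERBS = {"sitting", "flying", "resting", "looking"}
SCENES = {"room", "sky", "airport", "sidewalk"}

def tokens(sentence):
    """Lower-case the sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())

def most_common_in(vocab, sentences):
    """Return the vocabulary word found in the most sentences, or None."""
    counts = Counter(w for s in sentences for w in set(tokens(s)) if w in vocab)
    return counts.most_common(1)[0][0] if counts else None

def extract_triple(sentences):
    """Vote for one subject, one verb and one scene across all sentences."""
    return (most_common_in(SUBJECTS, sentences),
            most_common_in(VERBS, sentences),
            most_common_in(SCENES, sentences))

sentences = [
    "black cat over pink chair",
    "A black color cat sitting on chair in a room.",
    "cat sitting on a chair looking in a mirror.",
]
triple = extract_triple(sentences)
```

On the three cat sentences this gives the same subject, verb and scene as the slide, because "sitting" appears in two sentences and "looking" in only one.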
• Use taxonomy trees
Object
  Vehicle: Car, Train, Bike
  Human
  Animal: Cat, Horse, Dog
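A taxonomy tree lets related words share evidence, for example by scoring "cat" as closer to "dog" than to "car". The slides do not say how the tree is used, so the sketch below is only one possibility: a Wu-Palmer-style similarity over a hand-written parent map. The tree layout (Object as root, with Vehicle, Human and Animal below it) is read off the slide figure and is itself an assumption.

```python
# Parent links for the tree on the slide; "object" is assumed to be the root.
PARENT = {
    "vehicle": "object", "human": "object", "animal": "object",
    "car": "vehicle", "train": "vehicle", "bike": "vehicle",
    "cat": "animal", "horse": "animal", "dog": "animal",
}

def path_to_root(word):
    """List of nodes from `word` up to the root, inclusive."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(word):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(word))

def similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(lca) / (depth(a) + depth(b)).

    Both words must be nodes of the tree, so they share at least the root.
    """
    ancestors_a = set(path_to_root(a))
    # First ancestor of b (walking upwards) that is also an ancestor of a.
    lca = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lca) / (depth(a) + depth(b))

cat_dog = similarity("cat", "dog")   # common ancestor: animal
cat_car = similarity("cat", "car")   # common ancestor: object (the root)
```

Words that share a deeper ancestor get a higher score, so a classifier with too few "horse" examples could borrow from "cat" and "dog".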
Image Space → Meaning Space
• Build composite feature vectors
– Use the outputs of a set of object/scene detectors to build a feature vector for each training image
– Include name, position, rate, …
• Learn words given these vectors
– Use SVM or AdaBoost to learn words given composite feature vectors
– Learn subjects, verbs and scenes
– Learn 100 words using the output of 40 object detectors and scene descriptors
– Sometimes there are not enough examples to learn a word
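A composite feature vector can be sketched as a fixed-length list with one block per detector. The layout below (best score, then the x and y position of the best detection, zeros when the detector did not fire) is an assumption made for illustration; the `Detection` type and the three-entry `DETECTORS` list are hypothetical stand-ins for the 40 detectors mentioned above.

```python
from typing import NamedTuple

class Detection(NamedTuple):
    name: str      # detector name, e.g. "cat"
    x: float       # box centre, normalised to [0, 1]
    y: float
    score: float   # detector confidence

# Hypothetical detector list; the slides mention 40 detectors without naming them.
DETECTORS = ["cat", "chair", "plane"]

def composite_vector(detections):
    """Concatenate [score, x, y] of the best detection per detector (zeros if none)."""
    vector = []
    for name in DETECTORS:
        hits = [d for d in detections if d.name == name]
        if hits:
            best = max(hits, key=lambda d: d.score)
            vector += [best.score, best.x, best.y]
        else:
            vector += [0.0, 0.0, 0.0]
    return vector

# Made-up detector outputs for one image.
image = [
    Detection("cat", 0.5, 0.6, 0.9),
    Detection("cat", 0.1, 0.2, 0.3),
    Detection("chair", 0.5, 0.7, 0.8),
]
features = composite_vector(image)
```

Keeping the detector order fixed means every image gets a vector of the same length, which is what SVM or AdaBoost training expects.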
Image Space → Meaning Space
2. Learn words given these vectors
– Use SVM or AdaBoost to learn words given feature libraries
– Learn subjects, verbs and scenes
– Sometimes there are not enough examples to learn a word
[Figure panels: SVM, AdaBoost]
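To make the "learn words" step concrete, here is a small sketch of discrete AdaBoost over threshold stumps, with one binary classifier per word (does the word apply to this image or not). It is a textbook version written in plain Python, not the authors' implementation, and the two-feature training set is made up.

```python
import math

def train_adaboost(X, y, rounds=10):
    """Discrete AdaBoost over threshold stumps.

    X: list of feature lists, y: labels in {-1, +1}.
    Returns a list of (alpha, feature index, threshold, polarity).
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(X[0])):
            for threshold in sorted({row[j] for row in X}):
                for polarity in (1, -1):
                    preds = [polarity if row[j] >= threshold else -polarity
                             for row in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, j, threshold, polarity, preds)
        err, j, threshold, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, threshold, polarity))
        # Up-weight misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * t * p)
                   for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if row[j] >= thr else -pol)
                for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1

# Made-up data: [cat detector score, plane detector score]; label +1 means "cat".
X = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.7]]
y = [1, 1, -1, -1]
model = train_adaboost(X, y, rounds=3)
```

With only a handful of positive examples for a word, the stump search has little to separate, which is one way to read the "not enough examples" caveat.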
How to measure the Accuracy?
• Qualitatively or quantitatively?
• We need two quantitative measures:
– One for the soundness of the sentence itself
– One for the descriptive value of the words alone
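The slides leave both measures open. As a sketch of what they could look like, the code below scores soundness as the fraction of a sentence's bigrams that are attested in a reference corpus, and descriptive value as the fraction of its distinct words that appear in the human sentences for the image. Both definitions and the function names are assumptions chosen for illustration.

```python
import re

def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def bigram_soundness(sentence, corpus_sentences):
    """Fraction of the sentence's bigrams that also occur in the corpus."""
    seen = set()
    for s in corpus_sentences:
        t = tokens(s)
        seen.update(zip(t, t[1:]))
    t = tokens(sentence)
    bigrams = list(zip(t, t[1:]))
    if not bigrams:
        return 0.0
    return sum(b in seen for b in bigrams) / len(bigrams)

def word_precision(sentence, reference_sentences):
    """Fraction of distinct words in the sentence that appear in any reference."""
    reference_words = {w for s in reference_sentences for w in tokens(s)}
    words = set(tokens(sentence))
    if not words:
        return 0.0
    return sum(w in reference_words for w in words) / len(words)

corpus = ["A plane flying in sky.", "A plane taking in airport."]
soundness = bigram_soundness("A plant flying in sky.", corpus)
descriptive = word_precision("A plant flying in sky.", ["A plane flying in sky."])
```

On this toy input the wrong word "plant" breaks two of the four bigrams and one of the five words, so the two scores penalise the same mistake differently.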
Results…