Carnegie Mellon University Pittsburgh,...


Page 1: Carnegie Mellon University Pittsburgh, USAchenlab.ece.cornell.edu/Publication/Tsuhan/20041125EWIMT...Carnegie Mellon University Pittsburgh, USA TsuhanChen2004 Informational Retrieval


From Low-Level Features to High-Level Semantics: Are We Bridging the Gap?

Tsuhan Chen
Carnegie Mellon University
Pittsburgh, USA

Information Retrieval (IR)

Like looking for a needle in a haystack…


Content-Based Information Retrieval (CBIR)

Many Interesting Applications…

Trademark Retrieval

[Figure: a query trademark and the retrieved trademarks]

Trademark Retrieval

[Figure: a hand-drawn query and the retrieved trademarks]

Sketch Retrieval

User sketches a query

[Figure: a query sketch, a similar sketch, and the page stored in the database]


3D Object Retrieval


3D Protein Retrieval


Some Basics of CBIR…


CBIR – Data Types

Multimedia Database

Data types:
Text database
Audio database
Image/video database
Sketch/ink database
3D object database

CBIR – Low-Level Feature Extraction

[Figure: Multimedia Database → Feature Extraction → Low-Level Feature Space (axes f1, f2)]

Example low-level features:
Text: keyword frequency
Audio: pitch contour, frequency spectrum
Image: color histogram, wavelet coefficients
Video: audio feature + image feature, motion
Sketches/ink data: shape descriptors
3D objects: aspect ratios, moments
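As an illustration of one of the image features listed above, the following is a minimal sketch of a color histogram. The function name, the binning scheme, and the pixel format (a list of RGB tuples) are assumptions made for this example; the slides do not specify how the histogram is computed.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram as a flat list.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    The bin layout is an illustrative assumption, not the talk's method.
    """
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantize each channel, clamping to the last bin for safety.
        rb = min(int(r / step), bins_per_channel - 1)
        gb = min(int(g / step), bins_per_channel - 1)
        bb = min(int(b / step), bins_per_channel - 1)
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
        count += 1
    if count == 0:
        return hist
    # Normalize so the feature does not depend on image size.
    return [h / count for h in hist]


# A tiny "image": two red pixels, one blue, one green.
example_pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
example_feature = color_histogram(example_pixels)
```

The resulting vector is one point in the low-level feature space; two images with similar color distributions land close together regardless of what they depict, which is the root of the semantic gap discussed later.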

Low-Level Features: Sketches

[Figure: shape descriptors for a line, a circle, and a polygon, defined in terms of L, d, p, A, the convex hull area, and the number of sides]

Low-Level Features: 3D Objects

Volume-surface ratio
Aspect ratios: sqrt(Y2/X2), sqrt(Z2/X2)
Moment invariants Mxiyjzk: M200, M210, M102, M021, …
Fourier transform coefficients

[Figure: a mesh triangle with vertices A (x1, y1, z1), B (x2, y2, z2), C (x3, y3, z3), origin O, axes x, y, z, and normal N labeled ACB]


CBIR – Indexing and Query

[Figure: Multimedia Database → Feature Extraction → Low-Level Feature Space (f1, f2) → Indexing → Indexed Feature Database. Query → Feature Extraction → feature vector [f1, f2] → Similarity Measure against the indexed database → Retrieval Results]
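The indexing-and-query pipeline can be sketched as a nearest-neighbor search over stored feature vectors. This is a minimal sketch under stated assumptions: the Euclidean distance, the dictionary used as the "index", and the object names are all hypothetical, since the slides do not name a specific similarity measure or index structure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_features, index, top_k=3):
    """Rank indexed objects by distance to the query and return the closest names."""
    ranked = sorted(index.items(), key=lambda item: euclidean(query_features, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Hypothetical indexed feature database: object name -> [f1, f2].
index = {
    "logo_a": [0.10, 0.90],
    "logo_b": [0.80, 0.20],
    "logo_c": [0.15, 0.85],
}

results = retrieve([0.10, 0.80], index, top_k=2)
```

A linear scan like this is only practical for small databases; a real index would use a spatial data structure, but the ranking logic stays the same.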


Semantic Gap…

High-Level Semantics

e.g., hierarchical attribute tree

[Figure: attribute tree with nodes Architecture, Anatomy, Buildings, Building_Materials/Misc, Landmarks, Body-parts, Children, Female, Male, and Monsters/Androids/Aliens; each node carries an object count in parentheses]

BIG CHALLENGE: How to bridge the gap between low-level features and high-level semantics?

Possible Solutions

[Figure: the CBIR indexing-and-query pipeline (Multimedia Database, Feature Extraction, Low-Level Feature Space, Indexing, Indexed Feature Database, Query, Similarity Measure, Retrieval Results) with two added components: Hidden Annotation and Relevance Feedback]

Semantic Information

Hidden annotation: "Object #n has Attribute #k" (explicit)
Relevance feedback: "Objects #m and #n are (not) similar" (implicit)

Q: How to represent and propagate semantic information?

Q: How to use explicit and implicit semantic information to improve retrieval?

Hidden Annotation

Annotation
Add high-level semantic features manually
“Hidden”: transparent to the user

Complete annotation is impractical
Select only “some” objects to annotate
Which objects should be annotated first?

Semantic Information

            Attribute 1   Attribute 2   Attribute 3   …   Attribute K
Object 1    p11           p12           p13           …   p1K
Object 2    p21           p22           p23           …   p2K
…           …             …             …             …   …
Object N    pN1           pN2           pN3           …   pNK

pnk: attribute probabilities

Annotate one object…

When an object is annotated, pnk is set to 0/1

            Attribute 1   Attribute 2   Attribute 3   …   Attribute K
Object 1
Object 2    1             0             1             …   0
…
Object N

Q: How to propagate?
A: Connect to low-level features

Semantic Propagation

[Figure: probability plotted against a low-level feature, showing the prior and the annotated objects]

“Biased Kernel Regression”
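The slide names the propagation technique but the transcript does not preserve its formula. The following is a minimal sketch of one plausible reading: a kernel-weighted average of annotation labels that is pulled toward the prior when no annotated object is nearby. The Gaussian kernel, the `bandwidth` and `prior_weight` parameters, and the function names are all assumptions for illustration, not the author's exact estimator.

```python
import math


def gaussian_kernel(f, g, bandwidth):
    """Similarity between two feature vectors; 1.0 when identical, near 0 when far apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(f, g))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def propagate(f, annotated, p_prior, bandwidth=1.0, prior_weight=1.0):
    """Estimate an attribute probability at feature vector f.

    annotated: list of (features, label) pairs with label 0 or 1.
    The prior acts as one extra pseudo-observation of weight prior_weight,
    which biases the estimate toward p_prior far from annotated objects.
    """
    numerator = prior_weight * p_prior
    denominator = prior_weight
    for g, label in annotated:
        w = gaussian_kernel(f, g, bandwidth)
        numerator += w * label
        denominator += w
    return numerator / denominator


annotated = [([0.0, 0.0], 1)]
near = propagate([0.0, 0.0], annotated, p_prior=0.2)
far = propagate([100.0, 100.0], annotated, p_prior=0.2)
```

In this sketch the estimate at an annotated object's own location is not exactly 0 or 1, so annotated entries would still be set directly as the slides describe; the regression is only used for the non-annotated objects.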

Semantic Propagation (cont.)

[Figures: the attribute probability pik plotted over the feature plane (f1, f2), relative to the prior pprior]

Semantic Propagation (cont.)

            Attribute 1   Attribute 2   Attribute 3   …   Attribute K
Object 1    p11           p12           p13           …   p1K
Object 2    1             0             1             …   0
…           …             …             …             …   …
Object N    pN1           pN2           pN3           …   pNK

Q: Which to annotate next?

Active Learning

Choose the most uncertain object to annotate
Uncertainty determined by the entropy of attribute probabilities

“Selective sampling”
May want to consider density in feature space too

[Figure: passive learning versus active learning between a teacher (the annotator) and a student (the retrieval system)]

Uncertainty

Entropy
For object i having attribute k:

$$E_{ik} = -p_{ik}\log p_{ik} - (1 - p_{ik})\log(1 - p_{ik})$$

Entropy = 0 once annotated

Overall uncertainty

$$U_i = \sum_{k=1}^{K} w_k E_{ik}$$

More general attributes have higher weights
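The entropy-based selection can be sketched directly from the two formulas. This is a minimal sketch: the function names, the dictionary layout of the probability table, and the example weights are assumptions for illustration.

```python
import math


def binary_entropy(p):
    """E = -p log p - (1 - p) log(1 - p); defined as 0 when p is exactly 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def uncertainty(probs, weights):
    """U_i = sum over attributes k of w_k * E_ik."""
    return sum(w * binary_entropy(p) for w, p in zip(weights, probs))


def most_uncertain(table, weights):
    """Return the object whose attribute probabilities carry the most uncertainty."""
    return max(table, key=lambda name: uncertainty(table[name], weights))


# Hypothetical table: object name -> attribute probabilities.
table = {
    "obj1": [1.0, 0.0],   # already annotated, so entropy is 0
    "obj2": [0.5, 0.5],   # maximally uncertain
    "obj3": [0.9, 0.1],
}
weights = [1.0, 1.0]
next_to_annotate = most_uncertain(table, weights)
```

Because annotated objects have probabilities of exactly 0 or 1, their uncertainty is 0 and they are never selected again.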

Recap…

Maintain attribute probabilities of each model

Set an attribute probability to 1/0 when annotated

Estimate probabilities of non-annotated objects

Use probabilities to estimate uncertainty

Choose the most uncertain object in the database to annotate

Use probabilities to measure semantic distance…

To use semantic information…

High-level semantic distance
Probability that two objects disagree with each other for attribute k:

$$d_S^{12} = \sum_{k=1}^{K} w_{\text{level}} \left[ p_{1k}(1 - p_{2k}) + p_{2k}(1 - p_{1k}) \right]$$

Low-level feature distance

$$d_L^{12} = \sum_{j=1}^{J} w_{Lj} \left( f_{1j} - f_{2j} \right)^2$$

Overall distance

$$d_{\text{Overall}}^{12} = w_S\, d_S^{12} + w_L\, d_L^{12}$$
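The three distances combine as in the following minimal sketch. The function names and example values are hypothetical, and passing the semantic weights as one value per attribute is an assumption about how the slide's level-dependent weight is applied.

```python
def semantic_distance(p1, p2, w_attr):
    """Weighted probability that two objects disagree, summed over attributes."""
    return sum(w * (a * (1.0 - b) + b * (1.0 - a)) for w, a, b in zip(w_attr, p1, p2))


def feature_distance(f1, f2, w_feat):
    """Weighted squared difference of low-level features."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(w_feat, f1, f2))


def overall_distance(p1, p2, f1, f2, w_attr, w_feat, w_s=1.0, w_l=1.0):
    """d_Overall = w_S * d_S + w_L * d_L."""
    return w_s * semantic_distance(p1, p2, w_attr) + w_l * feature_distance(f1, f2, w_feat)


# Two annotated objects that disagree on both attributes.
d_s = semantic_distance([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
d_l = feature_distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0])
d = overall_distance([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [3.0, 4.0],
                     [1.0, 1.0], [1.0, 1.0], w_s=0.5, w_l=0.1)
```

For two fully annotated objects the semantic term reduces to a weighted count of attributes on which they differ; for non-annotated objects it degrades gracefully to an expected disagreement.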

Result

[Figure: average matching error (Err), from 0 to 0.5, versus number of annotated models, from 0 to 1800, comparing random sampling with our algorithm; 3D objects, 1750 total]

Relevance Feedback

Relevance feedback
Ask for user’s feedback during the retrieval
“Object #i is (not) similar to the query”
“Objects #m and #n are (not) similar”

Implicit semantic information

Use feedback to improve retrieval
Move the query point
Weigh the features
“Warp” the feature space

An Example

[Figure: feature space (f1, f2); the retrieved results lie inside a circle around the query]

User Feedback

[Figure: the same feature space and retrieved results, with objects marked as positive feedback or negative feedback]

Move the Query Point

[Figure: feature space (f1, f2) with positive and negative feedback; the new retrieved results lie inside a circle around the new query]

Feature Weighting

[Figure: feature space (f1, f2) with positive and negative feedback; the query is unchanged and the new retrieved results lie inside an ellipse]
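The slides show these two feedback strategies only as pictures. The following is a minimal sketch using two standard stand-ins: a Rocchio-style update for moving the query point and inverse-variance weighting for the features. Neither is confirmed as the talk's method, and the parameter values (`alpha`, `beta`, `gamma`, `eps`) are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def move_query(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: pull the query toward positives, push it away from negatives."""
    new_query = [alpha * x for x in query]
    if positives:
        new_query = [x + beta * m for x, m in zip(new_query, mean_vector(positives))]
    if negatives:
        new_query = [x - gamma * m for x, m in zip(new_query, mean_vector(negatives))]
    return new_query


def inverse_variance_weights(positives, eps=1e-6):
    """Weight each feature by 1 / variance across the positive examples.

    A feature on which the positives agree gets a large weight, which
    stretches the retrieval region from a circle into an ellipse.
    """
    m = mean_vector(positives)
    n = len(positives)
    weights = []
    for j in range(len(m)):
        var = sum((v[j] - m[j]) ** 2 for v in positives) / n
        weights.append(1.0 / (var + eps))
    return weights


positives = [[1.0, 1.0], [3.0, 1.0]]
negatives = [[0.0, 4.0]]
new_q = move_query([0.0, 0.0], positives, negatives, alpha=1.0, beta=0.5, gamma=0.25)
w = inverse_variance_weights(positives)
```

In this example the positives agree on f2 but not on f1, so f2 receives the larger weight.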

Feature Space Warping

[Figure: before warping and after warping. Each panel shows the query, the positive and negative feedback, and the retrieved results (inside circle), in the original space (f1, f2) and in the warped space (f1', f2')]

[Bang and Chen, 2002]

Feature Space Warping

[Figure: feature space with the query, positive feedback, and negative feedback]

[Equation: warping update for the feature vectors v, built from a sum over j = 1, …, M and an exponential term, with parameters γ and c]

This is also semantic propagation!!!

Experiment Result

[Figure: "Database Performance Increase with Gamma = 0.3, c = 6*pi"; % performance increase (0 to 100) versus number of feedback iterations (0 to 4), with curves for T = 3, T = 7, and T = 12]

Performance Improvement

What if no feature space?

Metric Model

[Figure: a multimedia database of objects O1, O2, O3, …, ON; the query is compared with each object through similarities S1, S2, S3, …, SN to give the retrieval results, alongside Relevance Feedback and Hidden Annotation blocks]

No specific feature space…

Metric Model

[Figure: four objects O1 to O4 linked by pairwise similarities s12, s13, s14, s23, s24, s34. After feedback, s12 and s34 go up (↑), s14 and s23 go down (↓), and s13 and s24 are unknown (?)]

Representation in Matrix Form

[Figure: the four-object similarity graph, with s12 ↑, s34 ↑, s14 ↓, s23 ↓, s13 ?, s24 ?]

“Feedback Matrix” (rows and columns ordered O1, O2, O3, O4):

$$F^{(0)} = \begin{bmatrix} 1 & 1 & 0 & -1 \\ 1 & 1 & -1 & 0 \\ 0 & -1 & 1 & 1 \\ -1 & 0 & 1 & 1 \end{bmatrix}$$

Indirect Semantic Links

O1 and O2 relevant
O1 and O4 irrelevant

O2 and O4 are irrelevant

[Figure: the four-object similarity graph, with s12 ↑, s34 ↑, s14 ↓, s23 ↓, s13 ?, s24 ?]

$$\Gamma^{(0)} = F^{(0)} \otimes F^{(0)} = \begin{bmatrix} 2 & 2 & -2 & -2 \\ 2 & 2 & -2 & -2 \\ -2 & -2 & 2 & 2 \\ -2 & -2 & 2 & 2 \end{bmatrix}$$

Semantic Propagation!!!
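The indirect links can be checked numerically. In this minimal sketch the feedback matrix is read off the arrows in the figure (s12 and s34 up give +1, s14 and s23 down give -1, the unknown s13 and s24 give 0, and the diagonal is 1). The transcript does not define the ⊗ operator, so an ordinary matrix product is used here as an assumption; its off-diagonal entries reproduce the ±2 pattern, while its diagonal comes out as 3 rather than 2.

```python
# Feedback matrix F(0), rows and columns ordered O1, O2, O3, O4.
F0 = [
    [1, 1, 0, -1],
    [1, 1, -1, 0],
    [0, -1, 1, 1],
    [-1, 0, 1, 1],
]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Stand-in for the slide's propagation step (assumed to behave like F0 times F0).
G0 = matmul(F0, F0)

# The previously unknown pairs now carry a sign:
s24 = G0[1][3]  # O2 vs O4
s13 = G0[0][2]  # O1 vs O3
```

The negative value for the O2, O4 pair matches the slide's conclusion: O1 is relevant to O2 and irrelevant to O4, so O2 and O4 are inferred to be irrelevant to each other even though no user ever compared them.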

Experiment Result

[Figure: performance improvement (0% to 40%) after 1, 2, and 3 rounds of feedback, without propagation versus with propagation; logo database, 50 objects, 5 categories]

Semantic Propagation

Both hidden annotation and relevance feedback can propagate semantics

Without semantic propagation, hidden annotation and relevance feedback are trivial and not very useful

With enough relevance feedback, can we accomplish information retrieval without low-level features at all?

Content-Free Information Retrieval (CFIR)

With enough relevance feedback, retrieval is based more and more on feedback, less and less on features
In the extreme case, retrieval based on feedback only

Retrieval based on user history

e.g., Amazon.com

Example -- How CFIR Works

[Figure: user history shown as a binary table with rows 1001, ?110, and 0110; the unknown entry ? is then filled in, giving rows 1001, 0110, and 0110]

[Figure caption, images missing: one pair of images is more similar than another pair]

CBIR vs. CFIR

Will user U like image X?

Two different approaches:

Look at what U likes
Characterize images → Content-based IR

Look at which users like X
Characterize users → Content-free IR

One CFIR Method

Retrieve based on

$$P(x_i = 1 \mid x_{j_1} = 1, \ldots, x_{j_F} = 1) \propto \frac{\prod_{k=1}^{F} P(x_i = 1 \mid x_{j_k} = 1)}{P(x_i = 1)^{F-1}}$$

Pair-wise conditional probability matrix (from User History)

$$\tilde{P}(x_i = 1 \mid x_j = 1) \quad \forall\, i, j$$
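The product rule can be sketched with plain counting over a user-history table. This is a minimal sketch under stated assumptions: the layout (one row per user, one column per item), the counting estimates of the probabilities, and the example table are all hypothetical.

```python
def conditional_matrix(history):
    """cond[i][j] estimates P(x_i = 1 | x_j = 1) by counting over users.

    history: list of binary rows; history[u][i] == 1 if user u liked item i.
    """
    n_items = len(history[0])
    cond = [[0.0] * n_items for _ in range(n_items)]
    for j in range(n_items):
        liked_j = [row for row in history if row[j] == 1]
        for i in range(n_items):
            if liked_j:
                cond[i][j] = sum(row[i] for row in liked_j) / len(liked_j)
    return cond


def prior(history):
    """Estimate P(x_i = 1) as the fraction of users who liked item i."""
    n_users = len(history)
    return [sum(row[i] for row in history) / n_users for i in range(len(history[0]))]


def product_rule_scores(history, liked):
    """Score each unseen item by prod_k P(x_i | x_jk) / P(x_i)^(F - 1)."""
    cond = conditional_matrix(history)
    p = prior(history)
    scores = {}
    for i in range(len(p)):
        if i in liked or p[i] == 0.0:
            continue
        score = 1.0
        for j in liked:
            score *= cond[i][j]
        scores[i] = score / (p[i] ** (len(liked) - 1))
    return scores


# Hypothetical history: three users, four items.
history = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
scores = product_rule_scores(history, liked={1})
best = max(scores, key=scores.get)
```

No image content is inspected at any point: item 2 is recommended to a user who liked item 1 purely because other users liked both.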

Experiment Results

[Figure: two precision (%) versus recall (%) plots on sample images (1000 total), with panel titles "CBIR Methods" and "CFIR Methods". The first plot has a precision axis from 0 to 3 and curves for inverse variance, one-class SVM, Bayesian product rule, and random. The second plot has a precision axis from 10 to 80 and curves for product rule, max entropy, and sum rule]

Summary

Need to bridge the gap between low-level features and high-level semantics

Hidden annotation and relevance feedback can help

Semantic propagation is important

Relevance feedback can be done with or without feature space

Content-free information retrieval is possible

Afterthoughts…

Feng-Shui (風水)
Ancient Chinese room arrangement technique

Way 1 (low-level):
Write down all the rules
Too many and do not generalize

Way 2 (high-level):
Imagine how a dragon would move through the room to arrange it in a livable manner
Intuitive and creative
Done by some Feng-Shui masters


Advanced Multimedia Processing Lab

Please visit us at:

http://amp.ece.cmu.edu