Content-based Image Retrieval (CBIR)


Transcript of Content-based Image Retrieval (CBIR)

Page 1: Content-based Image Retrieval (CBIR)

1

Content-based Image Retrieval (CBIR)

Searching a large database for images that match a query:

What kinds of databases? What kinds of queries? What constitutes a match? How do we make such searches efficient?

Page 2: Content-based Image Retrieval (CBIR)

2

Applications

Art Collections, e.g. museums, galleries, archives
Medical Image Databases: CT, MRI, ultrasound, The Visible Human
Scientific Databases, e.g. earth sciences, Hubble
General Image Collections for Licensing: Corbis, Getty Images, photo gallery of Gulhasan
The World Wide Web

Page 3: Content-based Image Retrieval (CBIR)

3

What is a query?

an image you already have
a rough sketch you draw
a symbolic description of what you want, e.g. an image of a man and a woman on a beach

Page 4: Content-based Image Retrieval (CBIR)

4

Some Systems You Can Try

Corbis Stock Photography and Pictures

http://www.corbis.com/

• Corbis sells high-quality images for use in advertising, marketing, illustrating, etc.

• Search is entirely by keywords.

• Human indexers look at each new image and enter keywords.

• A thesaurus constructed from user queries is used.

Page 5: Content-based Image Retrieval (CBIR)

5

AVAILABLE CBIR SYSTEMS

Page 6: Content-based Image Retrieval (CBIR)

6

QBIC: IBM’s QBIC (Query by Image Content)

http://wwwqbic.almaden.ibm.com

• The first commercial system.

• Uses or has-used color percentages, color layout, texture, shape, location, and keywords.

Page 7: Content-based Image Retrieval (CBIR)

7

Blobworld

UC Berkeley’s Blobworld

http://elib.cs.berkeley.edu/photos/blobworld

• Images are segmented on color plus texture

• User selects a region of the query image

• System returns images with similar regions

• Works really well for tigers and zebras

Page 8: Content-based Image Retrieval (CBIR)

8

Ditto: See the Web

http://www.ditto.com

• Small company

• Allows you to search for pictures from web pages

Page 9: Content-based Image Retrieval (CBIR)

National Archives, USA

9

Declaration of Independence

Page 10: Content-based Image Retrieval (CBIR)

Ottoman Archives ???

10

Page 11: Content-based Image Retrieval (CBIR)

QUERY BY EXAMPLE

11

Page 12: Content-based Image Retrieval (CBIR)

12

Image Features / Distance Measures

[diagram: Images → Image Feature Extraction → Feature Space, stored in the Image Database; the User submits a Query Image, its features are extracted, a Distance Measure compares them against the database, and the Retrieved Images are returned]

Page 13: Content-based Image Retrieval (CBIR)

13

Features

• Color (histograms, gridded layout, wavelets)

• Texture (Laws, Gabor filters, local binary pattern)

• Shape (first segment the image, then use statistical or structural shape similarity measures)

• Objects and their Relationships

This is the most powerful, but you have to be able to recognize the objects!

Page 14: Content-based Image Retrieval (CBIR)

14

Color Histograms

Page 15: Content-based Image Retrieval (CBIR)

15

QBIC’s Histogram Similarity

The QBIC color histogram distance is:

dhist(I,Q) = (h(I) − h(Q)) A (h(I) − h(Q))^T

• h(I) is a K-bin histogram of a database image

• h(Q) is a K-bin histogram of the query image

• A is a K x K similarity matrix
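As a sketch, the quadratic-form distance above can be computed directly with NumPy; the 3-bin histograms and identity similarity matrix below are made-up toy values:

```python
import numpy as np

def qbic_histogram_distance(h_i, h_q, a):
    """Quadratic-form distance (h(I) - h(Q)) A (h(I) - h(Q))^T.
    The off-diagonal entries of A credit partial matches between
    similar colors that fall into different bins."""
    d = np.asarray(h_i, dtype=float) - np.asarray(h_q, dtype=float)
    return float(d @ a @ d)

# Toy 3-bin histograms; with A = identity this reduces to the
# squared Euclidean distance between the histograms (~0.02 here).
print(qbic_histogram_distance([0.2, 0.5, 0.3], [0.3, 0.4, 0.3], np.eye(3)))
```

With a non-identity A, two histograms that differ only in closely related bins (e.g. blue vs. cyan) come out closer than with plain Euclidean distance.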

Page 16: Content-based Image Retrieval (CBIR)

16

Similarity Matrix

Rows R, G, B; columns R, G, B, Y, C, V:

      R    G    B    Y    C    V
R     1    0    0   .5    0   .5
G     0    1    0   .5   .5    0
B     0    0    1    1    1    1

How similar is blue to cyan?

Page 17: Content-based Image Retrieval (CBIR)

17

Gridded Color

Gridded color distance is the sum of the color distances in each of the corresponding grid squares.

What color distance would you use for a pair of grid squares?

[figure: two images, each divided into corresponding grid squares numbered 1-4]
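One plausible realization of the gridded distance (the 2 × 2 grid, the mean-color summary per square, and the squared-Euclidean per-square distance are illustrative choices, not prescribed by the slide):

```python
import numpy as np

def gridded_color_distance(img1, img2, n=2):
    """Sum of per-square color distances over corresponding squares
    of an n x n grid. Each square is summarized by its mean color,
    and squared Euclidean distance compares the two means."""
    h, w = img1.shape[:2]
    total = 0.0
    for r in range(n):
        for c in range(n):
            sq1 = img1[r*h//n:(r+1)*h//n, c*w//n:(c+1)*w//n]
            sq2 = img2[r*h//n:(r+1)*h//n, c*w//n:(c+1)*w//n]
            a = sq1.reshape(-1, 3).mean(axis=0)
            b = sq2.reshape(-1, 3).mean(axis=0)
            total += float(np.sum((a - b) ** 2))
    return total
```

Unlike a global histogram, this distance is sensitive to where colors appear, so a blue-top/green-bottom image no longer matches its upside-down counterpart.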

Page 18: Content-based Image Retrieval (CBIR)

18

Texture Distances

• Pick and Click (the user clicks on a pixel, and the system retrieves images containing a region whose texture is similar to that of the region surrounding the clicked pixel).

• Gridded (just like gridded color, but use texture).

• Histogram-based (e.g. compare the LBP histograms).

Page 19: Content-based Image Retrieval (CBIR)

19

Local Binary Pattern Measure

Consider a 3 × 3 neighborhood:

100 101 103
 40  50  80
 50  60  90

• For each pixel p, create an 8-bit number b1 b2 b3 b4 b5 b6 b7 b8, where bi = 0 if neighbor i has value less than or equal to p’s value and 1 otherwise.

• Represent the texture in the image (or a region) by the histogram of these numbers.

• Compute the L1 distance between two histograms.

For the center pixel above (value 50), with neighbors indexed 1-8 clockwise from the upper left, the pattern is 1 1 1 1 1 1 0 0.
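The three steps can be sketched directly; the clockwise neighbor ordering below matches the slide's example:

```python
import numpy as np

# neighbor offsets b1..b8, clockwise from the upper-left corner
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(img):
    """Normalized histogram of 8-bit local binary patterns over the
    interior pixels of a gray-level image or region."""
    h, w = img.shape
    hist = np.zeros(256, dtype=float)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # b_i = 1 iff neighbor i is strictly greater than the center
                if img[r + dr, c + dc] > img[r, c]:
                    code |= 1 << (7 - bit)
            hist[code] += 1
    return hist / hist.sum()

def l1_distance(h1, h2):
    """L1 distance between two LBP histograms."""
    return float(np.abs(np.asarray(h1) - np.asarray(h2)).sum())

# The 3x3 example from the slide: its one interior pixel (value 50)
# yields the pattern 1 1 1 1 1 1 0 0.
patch = np.array([[100, 101, 103],
                  [ 40,  50,  80],
                  [ 50,  60,  90]])
```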

Page 20: Content-based Image Retrieval (CBIR)

20

Laws Texture

Page 21: Content-based Image Retrieval (CBIR)

21

Laws’ texture masks (1)

Page 22: Content-based Image Retrieval (CBIR)

22

Shape Distances

• Shape goes one step further than color and texture.

• It requires identification of regions to compare.

• There have been many shape similarity measures suggested for pattern recognition that can be used to construct shape distance measures.

Page 23: Content-based Image Retrieval (CBIR)

23

Global Shape Properties: Projection Matching

In projection matching, the horizontal and vertical projections form a histogram. For the example shape, the row projection is (0, 4, 1, 3, 2, 0) and the column projection is (0, 4, 3, 2, 1, 0), giving the feature vector (0, 4, 1, 3, 2, 0, 0, 4, 3, 2, 1, 0).

What are the weaknesses of this method? Strengths?
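A minimal sketch of the projection feature (the example mask is made up, not the slide's shape):

```python
import numpy as np

def projection_feature(mask):
    """Concatenated row and column projections of a binary shape mask."""
    mask = np.asarray(mask)
    rows = mask.sum(axis=1)  # one foreground count per row
    cols = mask.sum(axis=0)  # one foreground count per column
    return np.concatenate([rows, cols])

shape = np.array([[0, 0, 0, 0],
                  [1, 1, 1, 1],
                  [0, 1, 0, 0],
                  [0, 1, 1, 1]])
print(projection_feature(shape))  # [0 4 1 3 1 3 2 2]
```

The feature is very cheap, but — one answer to the slide's question — it is not rotation invariant, and it is only translation invariant if the mask is first cropped to its bounding box.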

Page 24: Content-based Image Retrieval (CBIR)

24

Global Shape Properties: Tangent-Angle Histograms

[figure: a shape boundary and the histogram of its tangent angles, with bins near 0, 30, 45, and 135 degrees]

Is this feature invariant to starting point?

Page 25: Content-based Image Retrieval (CBIR)

Boundary Matching: Fourier Descriptors

Given a sequence of points V_k along a boundary, compute its Fourier descriptors.

25
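A common construction (not necessarily the exact one on the slide) treats each boundary point as a complex number and takes its DFT; normalized coefficient magnitudes give a descriptor invariant to translation, scale, rotation, and starting point:

```python
import numpy as np

def fourier_descriptor(boundary, n_coeffs=8):
    """Truncated Fourier descriptor of a closed boundary V_k.
    Each point (x, y) becomes x + iy; the DC term (position) is
    dropped for translation invariance, magnitudes are normalized
    by the first harmonic for scale invariance, and discarding the
    phase removes rotation and starting-point dependence."""
    v = np.array([complex(x, y) for x, y in boundary])
    coeffs = np.fft.fft(v)
    mags = np.abs(coeffs[1:n_coeffs + 1])
    return mags / mags[0]
```

Two boundaries of the same shape at different positions and scales then yield (nearly) identical descriptors.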

Page 26: Content-based Image Retrieval (CBIR)

26

Boundary Matching

• Elastic Matching

The distance between the query shape and the image shape has two components:

1. energy required to deform the query shape into one that best matches the image shape

2. a measure of how well the deformed query matches the image

Page 27: Content-based Image Retrieval (CBIR)

27

Del Bimbo Elastic Shape Matching

query retrieved images

Page 28: Content-based Image Retrieval (CBIR)

28

Our work on CBIR at METU

1. M. Uysal, F.T. Yarman-Vural, “A Content-Based Fuzzy Image Database Based on The Fuzzy ARTMAP Architecture”, Turkish Journal of Electrical Engineering & Computer Sciences, vol. 13, pp. 333-342, 2005.

2. Ö.Ö. Tola, N. Arıca, F.T. Yarman-Vural, “Shape Recognition with Generalized Beam Angle Statistics”, Lecture Notes on Computer Science, vol. 2, 2004.

3. Ö.C. Özcanli, F.T. Yarman-Vural, “An Image Retrieval System Based on Region Classification”, Lecture Notes on Computer Science, vol. 1, 2004.

4. Y.Xu, P.Duygulu, E.Saber, A.M. Tekalp, F.T. Yarman-Vural, “Object-based Image Labeling Through Learning by Example and Multi-level Segmentation”, Pattern Recognition, vol. 36, pp.1407-23, 2003.

5. N. Arıca, F.T. Yarman-Vural, “BAS: A Perceptual Shape Descriptor Based on the Beam Angle Statistics”, Pattern Recognition Letters, vol. 24, pp. 1627-39, 2003.

6. M. Ozay, F.T. Yarman-Vural. " On the Performance of Stacked Generalization Architecture", Lecture Notes in Computer Science, ICIAR pp.445-451, 2008,

7. E. Akbaş, F.T. Yarman-Vural, "Automatic Image Annotation by Ensemble of Visual Descriptors", CVPR 2007,pp.1-8, 2007.

8. İ. Yalabık, F.T. Yarman-Vural, G. Ucoluk, O.T. Sehitoglu, "A Pattern Classification Approach for Boosting with Genetic Algorithms", ISCIS07, pp.1-6, 2007.

9. M. Uysal, E. Akbas, F.T. Yarman-Vural, “A Hierarchical Classification System Based on Adaptive Resonance Theory”, ICIP06, p.2913-16, 2006.

10. E. Akbaş, F.T. Yarman-Vural, “Design of Feature Set for Face Recognition Problem”, Lecture Notes in Computer Science, ISCIS 2006, pp.239-47, 2006.

11. N. Arica, F.T. Yarman-Vural, “Shape Similarity Measurement for Boundary Based Features”, Lecture Notes in Computer Science, ISBN 3-540-29069-9, 3656, pp.431-438, 2005.

12. M. Uysal and F.T. Yarman-Vural, “ORF-NT: An Object-Based Image Retrieval Framework Using Neighborhood Trees”, Lecture Notes in Computer Science, vol. 3733, ISBN 3-540-29414-7, ISCIS 2005, pp. 595-605, 2005.

Page 29: Content-based Image Retrieval (CBIR)

13. N. Arıca and F.T. Yarman-Vural, “A Compact Shape Descriptor Based on the Beam Angle Statistics”, International Conference on Image and Video Retrieval (CIVR), 2003.

14. N. Arıca, F.T. Yarman-Vural, “A Perceptual Shape Descriptor”, International Conference on Pattern Recognition ICPR, 2002.

15. A. Çarkacıoğlu, F.T. Yarman-Vural, "Learning Similarity Space", Proc. Int. Conf. on Image Processing (ICIP02), pp. 405-408, Rochester, NY, USA, September 2002.

16. N. Arıca, F.T. Yarman-Vural, “A Perceptual Shape Descriptor”, Proc. International Conference on Pattern Recognition (ICPR) Quebec, Canada, 2002.

17. N. Arıca, F.T. Yarman-Vural, “A Shape Descriptor Based on Circular Hidden Markov Model”, Proc. International Conference on Pattern Recognition (ICPR), 2000, Barcelona, Spain.

18. P. Duygulu, E. Saber, M. Tekalp, F.T. Yarman-Vural, “Multi-Level Image Segmentation for Content Based Image Representation" IEEE-ICASSP-2000, July 2000, Istanbul.

19. P. Duygulu, A. Çarkacıoğlu, F.T. Yarman-Vural, "Multi-Level Object Description: Color or Texture", First IEEE Balkan Conference on Signal Processing, Communications, Circuits, and Systems, June 1-3, 2000, Istanbul, Turkey.

20. N. Dicle, F.T. Yarman-Vural, NES A File Format for Content Based Coding of Images, ISCIS 15, 2000, Istanbul.

21. N. Arıca, F.T. Yarman-Vural, “A New HMM Topology for Shape Recognition” IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP'99), pp. 162-168, Antalya TURKEY, 1999.

22. S. Genc, F.T. Yarman-Vural, "Morphing for Motion Estimation", Int. Conf. on Image Processing Systems, Technologies and Information Systems, Las Vegas, 1999.

29

Page 30: Content-based Image Retrieval (CBIR)

Papers in Turkish

1. M. Özay, F.T. Yarman Vural, "Yığılmış Genelleme Algoritmalarının Performans Analizi", Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2008.

2. A. Sayar, F.T. Yarman Vural, "Yarı Denetimli Kümeleme ile Görüntü İçeriğine Açıklayıcı Not Üretme", Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2008.

3. Ö. Özcanli, E. Akbaş, F.T. Yarman Vural, “Görüntü Erişiminde Bulanık ARTMAP Ve Adaboost Sınıflandırma Yöntemlerinin Performans Karşılaştırması”, IEEE 13. Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2005.

4. Ö.Ö. Tola, N. Arıca, F.T. Yarman Vural, “Genelleştirilmiş Kerteriz Açısı İstatistikleri ile Şekil Tanıma”, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 1, 2004.

5. M. Uysal, Ö. Özcanlı, F.T. Yarman Vural, “Bulanık ARTMAP Mimarisini Kullanan İçerik Bazlı Bir İmge Sorgulama Sistemi”, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2004.

6. A. Çarkacıoğlu, F.T. Yarman Vural, "Doku Benzerliğini Tesbit Etmek İçin Yeni Bir Tanımlayıcı", 11. SIU, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2003.

7. N. Arıca, F.T. Yarman Vural, "Tıkız Şekil Betimleyicileri", 11.SIU, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2003.

8. Ö.C. Özcanlı, P. Duygulu Şahin, F.T. Yarman Vural, “Açıklamalı Görüntü Veritabanları Kullanarak Nesne Tanıma ve Erişimi”, 11.SIU, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2003.

9. S. Alkan, F.T. Yarman Vural, “Öğretim Üyesi Yetiştirme Programı”, Elektrik Elektronik, Bilgisayar Mühendislikleri Eğitimi I. Ulusal Sempozyumu, 2003.

10. M. Uysal, F.T. Yarman Vural, “En İyi Temsil Eden Öznitelik Kullanılarak İçeriğe Dayalı Endeksleme ve Bulanık Mantığa Dayalı Sorgulama Sistemi”, 11.SIU, Sinyal İşleme ve İletişim Uygulamaları Kurultayı, 2003.

11. N. Arıca, F.T. Yarman Vural, “Kerteriz Tabanlı Şekil Tanımlayıcısı”, 10. SIU, cilt.1 sayfa 129-134, 2002.

12. Ö.C. Özcanlı, S. Dağlar, F.T. Yarman Vural, “Yerel Benzerlik Örüntüsü Yöntemi ile Görüntü Erişim Sistemi”, 10. SIU, cilt.1, sayfa 141-146, 2002.

Page 31: Content-based Image Retrieval (CBIR)

BAS: Beam Angle Statistics

Ömer Önder Tola, Nafiz Arıca

Fatoş Tünay Yarman Vural

Page 32: Content-based Image Retrieval (CBIR)

Beam Angle Statistics (BAS)

BAS is a shape descriptor for shapes defined as an ordered set of boundary pixels: each pixel is assigned a feature vector of statistics of the beam angles measured at that pixel.

The beam angle shows how much the shape bends at a boundary point.

BAS describes the 2D shape by 1D moment functions of the beam angles.

Page 33: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: Given an ordered set of boundary pixels

[figure: boundary pixels numbered 1-36 in order around the shape]

Page 34: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: C_1(p(i))

[figure: the beam angle C_1(p(i)) at boundary pixel p(i), formed by the beams to its 1st-order neighbors]

Page 35: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: C_2(p(i))

[figure: the beam angle C_2(p(i)) at boundary pixel p(i), formed by the beams to its 2nd-order neighbors]

Page 36: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: C_3(p(i))

[figure: the beam angle C_3(p(i)) at boundary pixel p(i), formed by the beams to its 3rd-order neighbors]

Page 37: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: C_K(p(i))

[figure: the beam angle C_K(p(i)) at boundary pixel p(i), formed by the beams to its Kth-order neighbors]

Page 38: Content-based Image Retrieval (CBIR)

Beam Angle Statistics: C_17(p(i))

[figure: the beam angle C_17(p(i)) at boundary pixel p(i), formed by the beams to its 17th-order neighbors]

Page 39: Content-based Image Retrieval (CBIR)

Beam Angle Statistics

The beam angle C_K(p(i)) at boundary pixel p(i) is the angle between the beams from p(i) to p(i+K) and to p(i−K).

Page 40: Content-based Image Retrieval (CBIR)

Beam Angle Statistics

Each boundary point p(i) is represented by the moments of its beam angles C_K(i), with K treated as random:

Γ(i) = [ E[C^1(i)], E[C^2(i)], ... ]

E[C^m(i)] = Σ_K (C_K(i))^m · P(C_K(i)),   m = 0, 1, 2, ...
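The two equations above can be sketched directly. One plausible reading, assumed here, takes P(C_K) uniform over K, so each expectation becomes a mean over K; the paper's exact weighting may differ:

```python
import numpy as np

def beam_angles(boundary, i, k_max):
    """C_K(i): the angle at boundary point p(i) between the beams to
    p(i+K) and p(i-K), for K = 1..k_max (indices wrap around)."""
    pts = np.asarray(boundary, dtype=float)
    n = len(pts)
    angles = []
    for k in range(1, k_max + 1):
        fwd = pts[(i + k) % n] - pts[i]
        bwd = pts[(i - k) % n] - pts[i]
        c = np.dot(fwd, bwd) / (np.linalg.norm(fwd) * np.linalg.norm(bwd))
        angles.append(np.arccos(np.clip(c, -1.0, 1.0)))
    return np.array(angles)

def bas_features(boundary, k_max=None, n_moments=3):
    """Represent each boundary point by moments of its beam angles:
    here the mean plus the 2nd and 3rd central moments."""
    n = len(boundary)
    k_max = k_max or n // 2 - 1
    feats = []
    for i in range(n):
        c = beam_angles(boundary, i, k_max)
        feats.append([c.mean()] + [((c - c.mean()) ** m).mean()
                                   for m in range(2, n_moments + 1)])
    return np.array(feats)  # shape (n_boundary_points, n_moments)
```

On a circle every boundary point gets the same feature vector, as the rotational symmetry demands.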

Page 41: Content-based Image Retrieval (CBIR)

METU, Department of Computer Engineering 41

Plots of C_K(i) for fixed K values

k=N/40

k=N/10

k=N/4

k=N/40 : N/4

Page 42: Content-based Image Retrieval (CBIR)

42

What is the most appropriate value of K to discriminate the shapes in a large database and represent the shape information at all scales?

Answer: Find a representation which employs the information in C_K(i) for all values of K.

Output of a stochastic process at each point

Page 43: Content-based Image Retrieval (CBIR)

43

C(i) is a Random Variable of the stochastic process which generates the beam angles

mth moment of random variable C(i)

Each boundary point i is represented by the moments of C(i)

Page 44: Content-based Image Retrieval (CBIR)

44

First three moments of C(i)’s

Page 45: Content-based Image Retrieval (CBIR)

45

Correspondence of Visual Parts and Insensitivity to Affine Transformation

Page 46: Content-based Image Retrieval (CBIR)

46

Robustness to Polygonal Approximation

Robustness to Noise

Page 47: Content-based Image Retrieval (CBIR)

47

Similarity Measurement: Elastic Matching Algorithm

• Application of dynamic programming
• Minimize the distance between two patterns by allowing deformations on the patterns
• The cost of matching two items is calculated with the Euclidean metric
• Robust to distortions; promises to approximate human ways of perceiving similarity
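The bullets above can be sketched as a DTW-style dynamic program (a plausible reading of "elastic matching"; the paper's exact deformation model may differ):

```python
import numpy as np

def elastic_match(a, b):
    """Dynamic-programming elastic match between two feature
    sequences a and b. The cost of matching two items is the
    Euclidean distance between their feature vectors; allowing a
    cell to extend a match up, left, or diagonally deforms either
    pattern to best fit the other."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    dist = np.full((n + 1, m + 1), np.inf)
    dist[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # stretch either sequence, or advance both together
            dist[i, j] = cost + min(dist[i - 1, j],
                                    dist[i, j - 1],
                                    dist[i - 1, j - 1])
    return float(dist[n, m])
```

A stretched copy of a sequence matches it at zero cost, which is exactly the tolerance to deformation the slide asks for.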

Page 48: Content-based Image Retrieval (CBIR)

48

TEST RESULT FOR MPEG 7 CE PART A-1 Robustness to Scaling

Page 49: Content-based Image Retrieval (CBIR)

49

TEST RESULT FOR MPEG 7 CE PART A-2 Robustness to Rotation

Page 50: Content-based Image Retrieval (CBIR)

50

TEST RESULT FOR MPEG 7 CE PART B Similarity-based Retrieval

Page 51: Content-based Image Retrieval (CBIR)

51

TEST RESULT FOR MPEG 7 CE

PART C Motion and Non-Rigid Deformations

Page 52: Content-based Image Retrieval (CBIR)

52

Comparison with the Best Studies on the MPEG-7 CE Shape-1 Data Set

Part     Shape Context  Tangent Space  Curvature Scale Space  Zernike Moments  Wavelet  DAG  SBA (length 40)  SBA (length 60)
Part A1  -              88.65          89.76                  92.54            88.04    85   89.32            90.87
Part A2  -              100            99.37                  99.60            97.46    85   99.82            100
Part B   76.51          76.45          75.44                  70.22            67.76    60   81.04            82.37
Part C   -              92             96                     94.5             93       83   93               93.5

Page 53: Content-based Image Retrieval (CBIR)

53

Performance Evaluation

(1) Average performance over the three parts: Total Score 1 = 1/3 A + 1/3 B + 1/3 C

               Tangent Space  Curvature Scale Space  Zernike Moments  Wavelet  DAG    SBA (length 40)  SBA (length 60)
Total Score 1  87.59          88.67                  86.93            84.50    76     90.80            91.69

(2) Average performance over the number of queries: Total Score 2 = 840/2241 A + 1400/2241 B + 1/2241 C

               Tangent Space  Curvature Scale Space  Zernike Moments  Wavelet  DAG    SBA (length 40)  SBA (length 60)
Total Score 2  83.16          82.62                  79.92            77.14    69.38  86.12            87.27

Page 54: Content-based Image Retrieval (CBIR)

Boundary of a shape is not always available

Question: Can we somehow extract the boundary of a shape directly on the edge detected image?

GENERALIZE THE BAS FUNCTIONS

Page 55: Content-based Image Retrieval (CBIR)

Generalized Beam Angle Statistics (GBAS)

Assume the shape is embedded in an unordered set of edge pixels.

Define the generalized beam angles formed by the forward and backward boundary pixel sets, which are partitioned by the mean beam vector.

Page 56: Content-based Image Retrieval (CBIR)

Generalized Beam Angle Statistics (GBAS)

[figure: unordered edge pixels, numbered 1-34]

Page 57: Content-based Image Retrieval (CBIR)

Mean Beam Vector

[figure: the mean beam vector at edge pixel p(i)]

Page 58: Content-based Image Retrieval (CBIR)

Forward and Backward Boundary Pixel Sets

The mean beam vector at edge pixel p(i) partitions the nearby edge pixels into a forward boundary pixel set G = {g_1, g_2, ..., g_S} and a backward boundary pixel set I = {i_1, i_2, ..., i_R}.

Page 59: Content-based Image Retrieval (CBIR)

Generalized Beam Angle

The generalized beam angle C_k,l(p(i)) is the angle between the beams from p(i) to g_k in the forward set G = {g_1, ..., g_S} and to i_l in the backward set I = {i_1, ..., i_R}.

Page 60: Content-based Image Retrieval (CBIR)

Generalized Beam Angle

Each edge pixel is represented by the moments of its generalized beam angles:

Γ(i) = [ E[C^1(i)], E[C^2(i)], ... ]

E[C^m(i)] = Σ_{k=1..S} Σ_{l=1..R} (C_k,l(i))^m · P(C_k,l(i)),   m = 0, 1, 2, ...

Page 61: Content-based Image Retrieval (CBIR)

Generalized Beam Angle Statistics

Page 62: Content-based Image Retrieval (CBIR)

HANOLISTIC: a hierarchical automatic image annotation system using holistic approach

Özge Öztimur Karadağ &

Fatoş T. Yarman Vural

Department of Computer Engineering, Middle East Technical University, Ankara, Turkey

Page 63: Content-based Image Retrieval (CBIR)

Automatic Image Annotation

Image annotation: assigning keywords to digital images. Done by hand, it is labor intensive and time consuming, so we need a system that automatically annotates images.

Page 64: Content-based Image Retrieval (CBIR)

Image Annotation Literature

The annotation problem has become popular since the 1990s. It is related to CBIR: CBIR processes visual information, while annotation processes both visual and semantic information, relating visual content information to semantic context information.

Page 65: Content-based Image Retrieval (CBIR)

Problems in Automatic Image Annotation

• Human subjectivity
• Semantic gap
• Availability of datasets

Page 66: Content-based Image Retrieval (CBIR)

Image Annotation Approaches in the Literature…

Segmental Approaches
1. Segment or partition the image into regions
2. Extract features from the regions
3. Quantize features into blobs
4. Model the relation between the image regions and annotation words

Holistic Approaches
Features are extracted from the whole image.

Page 67: Content-based Image Retrieval (CBIR)

The Proposed System: HANOLISTIC

Introducing semantic information as supervision: each word is considered as a class label, and an image belongs to one or more classes.

Holistic approach: multiple visual features are extracted from the whole image, giving multiple feature spaces.

Page 68: Content-based Image Retrieval (CBIR)

Description of an Image

Content description by visual features of MPEG-7:
• Color Layout
• Color Structure
• Scalable Color
• Homogeneous Texture
• Edge Histogram

Context description by semantic words: annotation words.

Page 69: Content-based Image Retrieval (CBIR)

System Architecture of HANOLISTIC

Level-0: consists of level-0 annotators, one for each visual description space.

Meta-level: consists of a meta-annotator.

Page 70: Content-based Image Retrieval (CBIR)

Level-0 Annotator

A level-0 annotator takes as input the feature vector of the i-th image in the j-th description space, and outputs the membership value of the l-th word for the i-th image in that description space.

Page 71: Content-based Image Retrieval (CBIR)

Meta-Level

The results of the level-0 annotators are aggregated into a vector of final word membership values for the i-th image.

Page 72: Content-based Image Retrieval (CBIR)

Experimental studies

• Realization of HANOLISTIC: instance-based realization of Level-0, eager realization of Level-0, realization of the Meta-level
• Performance criteria
• Results

Page 73: Content-based Image Retrieval (CBIR)

Experimental Setup

• Data set: a subset of the Corel Stock Photo Collection, consisting of 5000 images
• Training set: 4500 images (500 images for validation)
• Testing set: 500 images
• Each image is annotated with 1 to 5 words

Page 74: Content-based Image Retrieval (CBIR)

Instance-based Realization of the Level-0 Annotator by Fuzzy k-NN

Level-0 annotators are realized by fuzzy k-NN. For each description space, the k nearest neighbors of the image are determined. Word membership values are estimated considering the neighbors’ words and their distances from the image: high membership values are assigned to words that appear in the close neighborhood.
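A sketch of such a fuzzy k-NN annotator. The inverse-distance vote and the final normalization are plausible choices for illustration; the paper's exact membership formula may differ:

```python
import numpy as np

def fuzzy_knn_memberships(query, train_feats, train_words, k=5, eps=1e-9):
    """Word membership values for a query image in one description
    space: each of the k nearest training images votes for its
    annotation words with weight 1/(distance + eps); the votes are
    normalized so the memberships sum to 1."""
    d = np.linalg.norm(np.asarray(train_feats, float) - np.asarray(query, float),
                       axis=1)
    memberships = {}
    for idx in np.argsort(d)[:k]:
        vote = 1.0 / (d[idx] + eps)
        for word in train_words[idx]:
            memberships[word] = memberships.get(word, 0.0) + vote
    total = sum(memberships.values())
    return {w: v / total for w, v in memberships.items()}
```

Words carried by several close neighbors accumulate large memberships, which is exactly the behavior the slide describes.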

Page 75: Content-based Image Retrieval (CBIR)

Eager Realization of Level-0 by ANN

For a given image I_i, the ANN receives the visual description of the image as input and the semantic annotation words of the image as target.

Each ANN is trained with backpropagation; a randomly selected set of images is used for validation to determine when to stop training, and k-fold cross-validation is applied.

Page 76: Content-based Image Retrieval (CBIR)

Realization of the Meta-Level by Majority Voting

The membership values returned by the level-0 annotators are aggregated, where P_i,j is a vector containing the word membership values returned by the j-th level-0 annotator.

For each word, select the maximum of the five word membership values estimated by the level-0 annotators.

Page 77: Content-based Image Retrieval (CBIR)

Performance Criteria

• Precision
• Recall
• F-score

Page 78: Content-based Image Retrieval (CBIR)

Performance of Level-0 Annotators with fuzzy k-NN

Page 79: Content-based Image Retrieval (CBIR)

Performance of HANOLISTIC

Comparison of HANOLISTIC with other systems in the literature:

Page 80: Content-based Image Retrieval (CBIR)

Annotation Examples

Page 81: Content-based Image Retrieval (CBIR)

Annotation Examples…

Page 82: Content-based Image Retrieval (CBIR)

Conclusion

• We proposed a hierarchical automatic image annotation system using a holistic approach.
• We tested the system with both an instance-based and an eager method.
• We found that instance-based methods are more promising in this problem domain.

Page 83: Content-based Image Retrieval (CBIR)

Conclusion…

The power of the proposed system comes from the following main principles:
• Simplicity
• Fuzziness
• Simultaneous processing of content and context information
• Holistic view of the image through different perspectives

Page 84: Content-based Image Retrieval (CBIR)

84

Regions and Relationships

• Segment the image into regions

• Find their properties and interrelationships

• Construct a graph representation with nodes for regions and edges for spatial relationships

• Use graph matching to compare images

Like what?

Page 85: Content-based Image Retrieval (CBIR)

85

Tiger Image as a Graph

[figure: an image node linked to abstract region nodes sky, sand, tiger, and grass, with spatial-relation edges such as "above", "adjacent", and "inside"]

Page 86: Content-based Image Retrieval (CBIR)

86

Object Detection: Rowley’s Face Finder

1. Convert to gray scale
2. Normalize for lighting (like the first step in the Laws algorithm, p. 220)
3. Histogram equalization
4. Apply K-NN trained on 16K images

What data is fed to the classifier? 32 x 32 windows in a pyramid structure.

Page 87: Content-based Image Retrieval (CBIR)

87

Fleck and Forsyth’s Flesh Detector

The “Finding Naked People” Paper

• Convert RGB to HSI.
• Use the intensity component to compute a texture map: texture = med2(|I − med1(I)|), where med1 and med2 are median filters of radii 4 and 6.
• If a pixel falls into either of the following ranges, it is a potential skin pixel:

texture < 5, 110 < hue < 150, 20 < saturation < 60
texture < 5, 130 < hue < 170, 30 < saturation < 130

Look for LARGE areas that satisfy these criteria to identify pornography.

See Transparencies
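A sketch of the skin-pixel test, assuming hue, saturation, and intensity arrays already in the paper's units; `scipy.ndimage.median_filter` with square windows of size 9 and 13 stands in here for the radius-4 and radius-6 median filters:

```python
import numpy as np
from scipy.ndimage import median_filter

def skin_candidates(hue, sat, intensity):
    """Boolean mask of potential skin pixels: low texture plus
    either of the two hue/saturation ranges from the slide."""
    med1 = median_filter(intensity, size=9)          # ~radius-4 median
    texture = median_filter(np.abs(intensity - med1), size=13)  # ~radius-6
    r1 = (texture < 5) & (hue > 110) & (hue < 150) & (sat > 20) & (sat < 60)
    r2 = (texture < 5) & (hue > 130) & (hue < 170) & (sat > 30) & (sat < 130)
    return r1 | r2
```

Connected-component analysis on the returned mask would then find the large skin areas the detector looks for.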

Page 88: Content-based Image Retrieval (CBIR)

88

Relevance Feedback

In real interactive CBIR systems, the user should be allowed to interact with the system to “refine” the results of a query until he/she is satisfied.

Relevance feedback work has been done by a number of research groups, e.g.

• The Photobook Project (Media Lab, MIT)
• The Leiden Portrait Retrieval Project
• The MARS Project (Tom Huang’s group at Illinois)

Page 89: Content-based Image Retrieval (CBIR)

89

Information Retrieval Model*

An IR model consists of:
• a document model
• a query model
• a model for computing similarity between documents and the queries

Term (keyword) weighting

Relevance Feedback

*from Rui, Huang, and Mehrotra’s work

Page 90: Content-based Image Retrieval (CBIR)

90

Term weighting

Term weighting assigns different weights to different keywords (terms) according to their relative importance to the document.

Define w_ik to be the weight for term t_k, k = 1, 2, ..., N, in document i. Document i can then be represented as a weight vector in the term space:

D_i = [w_i1, w_i2, ..., w_iN]

Page 91: Content-based Image Retrieval (CBIR)

91

Term weighting

The query Q is also a weight vector in the term space:

Q = [w_q1, w_q2, ..., w_qN]

The similarity between D and Q is the cosine of the angle between them:

Sim(D, Q) = (D · Q) / (|D| |Q|)
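The cosine similarity between two term-weight vectors can be sketched directly (the toy weights below are made up):

```python
import numpy as np

def cosine_similarity(d, q):
    """Sim(D, Q) = (D . Q) / (|D| |Q|) for term-weight vectors."""
    d = np.asarray(d, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(d @ q / (np.linalg.norm(d) * np.linalg.norm(q)))

doc = [0.0, 2.0, 1.0]    # weights w_i1..w_iN for document i
query = [0.0, 1.0, 0.5]  # weights w_q1..w_qN for the query
print(cosine_similarity(doc, query))  # ~1.0: parallel weight vectors
```

Because the measure depends only on the angle between the vectors, a document that simply repeats its terms twice is no more similar to the query than the original.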

Page 92: Content-based Image Retrieval (CBIR)

92

Using Relevance Feedback

The CBIR system should automatically adjust the weights that were given by the user for the relevance of previously retrieved documents. Most systems use a statistical method for adjusting the weights.

Page 93: Content-based Image Retrieval (CBIR)

93

The Idea of Gaussian Normalization

• If all the relevant images have similar values for component j, then component j is relevant to the query.
• If all the relevant images have very different values for component j, then component j is not relevant to the query.
• The inverse of the standard deviation of component j over the relevant images is a good measure of its weight: the smaller the variance, the larger the weight.
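The inverse-standard-deviation rule is a few lines of NumPy; the epsilon guard and the final normalization are illustrative additions:

```python
import numpy as np

def feedback_weights(relevant_feats, eps=1e-9):
    """Component weights from relevance feedback: the inverse of the
    standard deviation of each feature component over the images the
    user marked relevant. Components the relevant images agree on
    get large weights; eps guards against zero variance."""
    sigma = np.std(np.asarray(relevant_feats, dtype=float), axis=0)
    w = 1.0 / (sigma + eps)
    return w / w.sum()  # normalized to sum to 1

# Component 0 is identical across the relevant images, so it
# dominates the reweighted distance on the next query round.
relevant = [[1.0, 0.0], [1.0, 10.0], [1.0, 5.0]]
print(feedback_weights(relevant))
```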

Page 94: Content-based Image Retrieval (CBIR)

94

Leiden Portrait System

The Leiden Portrait Retrieval System is an example of the use of relevance feedback.

http://ind156b.wi.leidenuniv.nl:2000/

Page 95: Content-based Image Retrieval (CBIR)

95

Andy Berman’s FIDS System

• multiple distance measures
• Boolean and linear combinations
• efficient indexing using images as keys

Page 96: Content-based Image Retrieval (CBIR)

96

Andy Berman’s FIDS System:

Use of key images and the triangle inequality for efficient retrieval.

Page 97: Content-based Image Retrieval (CBIR)

97

Andy Berman’s FIDS System:

Bare-Bones Triangle Inequality Algorithm

Offline

1. Choose a small set of key images

2. Store distances from database images to keys

Online (given query Q)

1. Compute the distance from Q to each key

2. Obtain lower bounds on distances to database images

3. Threshold or return all images in order of lower bounds
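The online steps rest on the bound |d(Q, key) − d(I, key)| ≤ d(Q, I), which follows from the triangle inequality. A minimal sketch of steps 2 and 3:

```python
import numpy as np

def lower_bounds(d_query_keys, d_db_keys):
    """Triangle-inequality lower bound on d(Q, I) for each database
    image I: d(Q, I) >= |d(Q, key_k) - d(I, key_k)| for every key k,
    so the max over keys is the tightest such bound.
    d_query_keys: (n_keys,) distances from the query to the keys.
    d_db_keys: (n_images, n_keys) precomputed image-to-key distances."""
    diff = np.asarray(d_db_keys, dtype=float) - np.asarray(d_query_keys, dtype=float)
    return np.max(np.abs(diff), axis=1)

def candidates(d_query_keys, d_db_keys, threshold):
    """Images whose lower bound is within the threshold -- only these
    need an exact (expensive) distance computation."""
    lb = lower_bounds(d_query_keys, d_db_keys)
    return np.nonzero(lb <= threshold)[0]
```

Because the bound never exceeds the true distance, thresholding on it can only discard images that genuinely cannot match, so the pruning is lossless.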

Page 98: Content-based Image Retrieval (CBIR)

98

Andy Berman’s FIDS System:

Page 99: Content-based Image Retrieval (CBIR)

99

Andy Berman’s FIDS System:

Bare-Bones Algorithm with Multiple Distance Measures

Offline

1. Choose key images for each measure

2. Store distances from database images to keys for all measures

Online (given query Q)

1. Calculate lower bounds for each measure

2. Combine to form lower bounds for composite measures

3. Continue as in single measure algorithm

Page 100: Content-based Image Retrieval (CBIR)

100

Andy Berman’s FIDS System:

Triangle Tries

A triangle trie is a tree structure that stores the distances from database images to each of the keys, one key per tree level.

[figure: a two-level triangle trie; level 1 branches on the distance to key 1 (values 3 and 4), level 2 on the distance to key 2 (values 1, 9, 8), with images W, Z, X, Y stored at the leaves]

Page 101: Content-based Image Retrieval (CBIR)

101

Andy Berman’s FIDS System:

Triangle Tries and Two-Stage Pruning

• First Stage: Use a short triangle trie.

• Second Stage: Bare-bones algorithm on the images returned from the triangle-trie stage.

The quality of the output is the same as with the bare-bones algorithm itself, but execution is faster.

Page 102: Content-based Image Retrieval (CBIR)

102

Andy Berman’s FIDS System:

Page 103: Content-based Image Retrieval (CBIR)

103

Andy Berman’s FIDS System:

Page 104: Content-based Image Retrieval (CBIR)

104

Andy Berman’s FIDS System:

Performance on a 200-MHz Pentium Pro

Step 1. Extract features from the query image. (0.02s ≤ t ≤ 0.25s)

Step 2. Calculate distances from the query to the key images. (1µs ≤ t ≤ 0.8ms)

Step 3. Calculate lower-bound distances. (t ≈ 4ms per 1000 images using 35 keys, which is about 250,000 images per second.)

Step 4. Return the images with smallest lower bound distances.

Page 105: Content-based Image Retrieval (CBIR)

105

Andy Berman’s FIDS System:

Page 106: Content-based Image Retrieval (CBIR)

106

Weakness of Low-level Features

Can’t capture the high-level concepts

Page 107: Content-based Image Retrieval (CBIR)

107

Current Research Objective

[diagram: Images → Object-oriented Feature Extraction → Image Database; the User submits a Query Image and receives Retrieved Images. Category hierarchy: Animals; Buildings (Office Buildings, Houses); Transportation (Boats, Vehicles); a boat image is assigned to its category]

Page 108: Content-based Image Retrieval (CBIR)

108

Overall Approach

• Develop object recognizers for common objects
• Use these recognizers to design a new set of both low- and high-level features
• Design a learning system that can use these features to recognize classes of objects

Page 109: Content-based Image Retrieval (CBIR)

109

Boat Recognition

Page 110: Content-based Image Retrieval (CBIR)

110

Vehicle Recognition

Page 111: Content-based Image Retrieval (CBIR)

111

Building Recognition

Page 112: Content-based Image Retrieval (CBIR)

112

Building Features: Consistent Line Clusters (CLC)

A Consistent Line Cluster is a set of lines that are homogeneous in terms of some line features.

Color-CLC: The lines have the same color feature.

Orientation-CLC: The lines are parallel to each other or converge to a common vanishing point.

Spatially-CLC: The lines are in close proximity to each other.

Page 113: Content-based Image Retrieval (CBIR)

113

Color-CLC

• Color feature of lines: a color pair (c1, c2)
• Color pair space: RGB (256³ × 256³) is too big! Dominant colors reduce it to 20 × 20
• Finding the color pairs: one line → several color pairs
• Constructing the Color-CLC: use clustering

Page 114: Content-based Image Retrieval (CBIR)

114

Color-CLC

Page 115: Content-based Image Retrieval (CBIR)

115

Orientation-CLC

• The lines in an Orientation-CLC are parallel to each other in the 3D world.
• The parallel lines of an object in a 2D image can be: parallel in 2D, or converging to a vanishing point (perspective).

Page 116: Content-based Image Retrieval (CBIR)

116

Orientation-CLC

Page 117: Content-based Image Retrieval (CBIR)

117

Spatially-CLC

• Vertical position clustering
• Horizontal position clustering

Page 118: Content-based Image Retrieval (CBIR)

118

Building Recognition by CLC

• Two types of buildings
• Two criteria: the inter-relationship criterion and the intra-relationship criterion

Page 119: Content-based Image Retrieval (CBIR)

119

Inter-relationship criterion

(Nc1 > Ti1 or Nc2 > Ti1) and (Nc1 + Nc2) > Ti2

Nc1 = number of intersecting lines in cluster 1
Nc2 = number of intersecting lines in cluster 2

Page 120: Content-based Image Retrieval (CBIR)

120

Intra-relationship criterion

|So| > Tj1 or w(So) > Tj2

So = the set of heavily overlapping lines in a cluster

Page 121: Content-based Image Retrieval (CBIR)

121

Experimental Evaluation: Object Recognition

• 97 well-patterned buildings (bp): 97/97
• 44 not well-patterned buildings (bnp): 42/44
• 16 not-patterned non-buildings (nbnp): 15/16 (one false positive)
• 25 patterned non-buildings (nbp): 0/25

CBIR

Page 122: Content-based Image Retrieval (CBIR)

122

Experimental Evaluation Well-Patterned Buildings

Page 123: Content-based Image Retrieval (CBIR)

123

Experimental Evaluation Non-Well-Patterned Buildings

Page 124: Content-based Image Retrieval (CBIR)

124

Experimental Evaluation Non-Well-Patterned Non-Buildings

Page 125: Content-based Image Retrieval (CBIR)

125

Experimental Evaluation: Well-Patterned Non-Buildings (false positives)

Page 126: Content-based Image Retrieval (CBIR)

126

Experimental Evaluation (CBIR)

Dataset       Total Positive (#)  Total Negative (#)  False Positive (#)  False Negative (#)  Accuracy (%)
Arborgreens   0                   47                  0                   0                   100
Campusinfall  27                  21                  0                   5                   89.6
Cannonbeach   30                  18                  0                   6                   87.5
Yellowstone   4                   44                  4                   0                   91.7

Page 127: Content-based Image Retrieval (CBIR)

127

Experimental Evaluation (CBIR) False positives from Yellowstone