Autoregressive and Random Field

Wei-Ta Chu Autoregressive and Random Field Texture Models 1 2009/11/5 Multimedia Content Analysis, CSIE, CCU


Page 1: Autoregressive and Random Field

Wei-Ta Chu

Autoregressive and Random Field Texture Models

2009/11/5

Multimedia Content Analysis, CSIE, CCU

Page 2: Autoregressive and Random Field

Announcement of Homework #2: Content-Based Image Retrieval

Goal: develop a basic CBIR system, or use an open-source library to build one.

Requirements (case 1):

1. Write programs that use at least one color-based feature and one texture-based feature to perform CBIR automatically.

2. Write a report that describes:

2.1. How to run your program.

2.2. What kinds of features, distance metrics, and algorithms you used or compared.

2.3. Detection performance in precision and recall, or even ROC/PR curves.

Page 3: Autoregressive and Random Field

Announcement of Homework #2: Content-Based Image Retrieval

Requirements (case 2):

1. Set up a CBIR system based on an open-source library.

2. Write a report that describes:

2.1. How to set up this system, including environment settings, parameter settings, etc.

2.2. How to write a CBIR program based on this library.

2.3. What kinds of features, distance metrics, and algorithms the library uses.

2.4. Detection performance in precision and recall, or even ROC/PR curves.

Page 4: Autoregressive and Random Field

Announcement of Homework #2: Content-Based Image Retrieval

Evaluation data: http://www.cs.ccu.edu.tw/~wtchu/courses/2009f_MCA/assignments.html

Homework submission: pack your programs and report into one zip file, and upload it to eCourse.

Deadline: 12:00, Nov. 22, 2009

Grades will be based on retrieval performance and the descriptions in your report.

Page 5: Autoregressive and Random Field

Random Field

Think of a textured image as a 2D array of random numbers. The pixel intensity at each location is a random variable.

One can model the image as a function f(r,w), where r is the position vector representing the pixel location, and w is a random parameter.

Once we select a specific texture w, f(r,w) is an image. f(r,w) is called a random field.

Page 6: Autoregressive and Random Field

Random Field Model

A typical random field model is characterized by a set of neighbors.

Given an array of observations of pixel-intensity values {y(s)}, it is natural to expect that the pixel values are locally correlated.

Markov model

Page 7: Autoregressive and Random Field

Simultaneous Autoregressive Model (SAR)

A special case of Markov random field
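The defining equation on this slide did not survive extraction. As a hedged reminder, the SAR model is commonly written in the form below. The symbols μ, θ(r), D, and ε(s) are standard notation assumed here; they are not taken from the slide.

```latex
y(s) = \mu + \sum_{r \in D} \theta(r)\, y(s + r) + \varepsilon(s)
```

Here y(s) is the intensity at pixel s, D is the set of neighbors, θ(r) weights the neighbor at offset r, μ is a bias tied to the mean intensity, and ε(s) is independent zero-mean Gaussian noise with variance σ². The estimated θ(r) and σ (typically obtained by least squares) are what serve as texture features.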


Page 8: Autoregressive and Random Field

Multiresolution SAR (MRSAR)

It is not trivial to determine the appropriate size of the neighborhood.

The MRSAR model tries to account for the variability of texture primitives by defining the SAR model at different resolutions.

[Figure: the original image and its image pyramid, with a SAR model applied at each pyramid level]

Page 9: Autoregressive and Random Field

Spectral Texture Features

Page 10: Autoregressive and Random Field

Introduction

Any function that periodically repeats itself can be expressed as the sum of sines and/or cosines of different frequencies, each multiplied by a different coefficient. This sum is a Fourier series.

Even functions that are not periodic (but whose area under the curve is finite) can be expressed as the integral of sines and/or cosines multiplied by a weighting function. This formulation is the Fourier transform.

Page 11: Autoregressive and Random Field

Definition of the Fourier Transform

Forward Continuous-Time Fourier Transform

The forward transform is an analysis integral because it extracts spectrum information from the signal.

Inverse Continuous-Time Fourier Transform

The inverse transform is a synthesis integral because it is used to create the time-domain signal from its spectral information.
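The two integrals named on this slide were lost in extraction. In the notation assumed here (x(t) for the signal, X(jω) for its transform, ω in rad/s), they are usually written as:

```latex
X(j\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt
\qquad \text{(forward transform, analysis)}

x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(j\omega)\, e^{j\omega t}\, d\omega
\qquad \text{(inverse transform, synthesis)}
```

Other texts place the 1/2π factor differently or use frequency in Hz; the slide's own convention cannot be confirmed from the transcript.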

Page 12: Autoregressive and Random Field

Definition of the Fourier Transform

Time domain and frequency domain

It is common to say that we take the Fourier transform of x(t), meaning that we determine its transform X(jω) so that we can use the frequency-domain representation of the signal.

We often say that we take the inverse Fourier transform to go from the frequency domain to the time domain.

Page 13: Autoregressive and Random Field

Example: Forward Fourier Transform

Consider the one-sided exponential signal

Take the Fourier transform of x(t)

Time-Domain Frequency-Domain
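The signal and its transform were equations on the slide and are missing from the transcript. A worked version under stated assumptions: the decay rate a > 0 and the unit step u(t) are symbols introduced here, not taken from the slide.

```latex
x(t) = e^{-at}\, u(t), \qquad a > 0

X(j\omega) = \int_{0}^{\infty} e^{-at}\, e^{-j\omega t}\, dt
           = \left[ \frac{-e^{-(a + j\omega)t}}{a + j\omega} \right]_{0}^{\infty}
           = \frac{1}{a + j\omega}
```

The upper limit vanishes only because a > 0, which is why the decaying exponential has a transform while a growing one does not.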

Page 14: Autoregressive and Random Field

Rectangular Pulse Signals

Consider the rectangular pulse

The Fourier transform is

Time-Domain Frequency-Domain

Page 15: Autoregressive and Random Field

Rectangular Pulse Signals

The Fourier transform of the rectangular pulse signal is called a sinc function.

The formal definition of a sinc function is
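The pulse, its transform, and the sinc definition were equations on the original slides and did not survive extraction. A hedged reconstruction, assuming a unit-height pulse of width T centered at t = 0 (T is a symbol introduced here):

```latex
x(t) = \begin{cases} 1, & |t| < T/2 \\ 0, & \text{otherwise} \end{cases}

X(j\omega) = \int_{-T/2}^{T/2} e^{-j\omega t}\, dt
           = \frac{\sin(\omega T / 2)}{\omega / 2}
           = T\, \operatorname{sinc}\!\left( \frac{\omega T}{2\pi} \right),
\qquad
\operatorname{sinc}(\theta) = \frac{\sin(\pi \theta)}{\pi \theta}
```

With this (normalized) definition, sinc(0) = 1 and the zeros fall at the nonzero integers, so X(jω) first reaches zero at ω = ±2π/T: a narrower pulse has a wider spectrum.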

Time-Domain Frequency-Domain

Page 16: Autoregressive and Random Field

Discrete Fourier Transform

One-dimensional DFT

for u = 0, 1, 2, …, M-1 (forward transform F(u))

for x = 0, 1, 2, …, M-1 (inverse transform f(x))

To compute F(u), we start by substituting u = 0 in the exponential term and then summing over all values of x. We then substitute u = 1 and repeat the summation, and so on for each value of u up to M-1. Like f(x), the transform is a discrete quantity, and it has the same number of components as f(x).
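The procedure described above can be sketched directly in code. This is a minimal, deliberately slow illustration, assuming the convention in which the 1/M factor sits on the forward transform; the function names dft and idft are chosen here for illustration.

```python
import cmath

def dft(f):
    """Forward 1D DFT: for each u, sum f(x) * exp(-j*2*pi*u*x/M) over all x.
    The 1/M factor on the forward transform is an assumed convention."""
    M = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * u * x / M) for x in range(M)) / M
            for u in range(M)]

def idft(F):
    """Inverse 1D DFT: for each x, sum F(u) * exp(+j*2*pi*u*x/M) over all u."""
    M = len(F)
    return [sum(F[u] * cmath.exp(2j * cmath.pi * u * x / M) for u in range(M))
            for x in range(M)]

signal = [1.0, 2.0, 4.0, 4.0]
spectrum = dft(signal)        # M complex values, same length as the input
recovered = idft(spectrum)    # matches the input up to rounding error
```

With this normalization F(0) is the average of the samples. The double loop costs on the order of M² operations; practical systems use an FFT instead.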

Page 17: Autoregressive and Random Field

Discrete Fourier Transform

Euler’s formula: e^{jθ} = cos θ + j sin θ

Each term of the Fourier transform (the value of F(u)) is composed of the sum of all values of the function f(x). The domain (values of u) over which the values of F(u) range is called the frequency domain, because u determines the frequency of the components of the transform. Each of the M terms of F(u) is called a frequency component of the transform.

Page 18: Autoregressive and Random Field

Discrete Fourier Transform

Express F(u) in polar coordinates:

Magnitude or spectrum

Phase angle or phase spectrum
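The polar form can be illustrated with a short sketch. Writing F(u) = R(u) + jI(u), the magnitude is sqrt(R² + I²) and the phase is the angle of the point (R, I). The helper name polar_form is hypothetical, and atan2 is used here as the four-quadrant form of the arctangent.

```python
import cmath
import math

def polar_form(F):
    """Split one complex DFT coefficient into (magnitude, phase in radians)."""
    R, I = F.real, F.imag
    magnitude = math.sqrt(R * R + I * I)   # |F(u)|, the spectrum
    phase = math.atan2(I, R)               # phase angle
    return magnitude, phase

mag, phase = polar_form(complex(1.0, 1.0))
rebuilt = cmath.rect(mag, phase)           # |F| * exp(j * phase) gives F back
```

The square of the magnitude is usually called the power spectrum; texture features built on the Fourier transform typically use it and discard the phase.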

Page 19: Autoregressive and Random Field

Discrete Fourier Transform

Two-dimensional DFT
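The 2D equations are missing from the transcript. For an M × N image f(x, y), a commonly used form is shown below; placing the 1/MN factor on the forward transform is an assumption made here for consistency with the 1D case.

```latex
F(u, v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi (ux/M + vy/N)}

f(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{\,j 2\pi (ux/M + vy/N)}
```

with u = 0, 1, …, M-1 and v = 0, 1, …, N-1 in the first equation, and x, y over the same ranges in the second.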


Page 20: Autoregressive and Random Field

Images in Frequency Domain

Gonzalez and Woods, Chapter 4 of Digital Image Processing, Prentice-Hall, 2001.

Page 21: Autoregressive and Random Field

Images and Their Fourier Transforms

Page 22: Autoregressive and Random Field

Frequency Domain Features

Fourier domain energy distribution

Angular features (directionality)

Radial features (coarseness)

Uniform division may not be the best

[Figures: the (u, v) frequency plane divided into angular sectors and into radial bands]
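A minimal sketch of these two feature families, assuming uniform rings for the radial features and uniform wedges for the angular features (the slide itself notes that uniform division may not be the best). The function names, the bin counts, and the naive 2D DFT are illustrative choices, suitable only for tiny images.

```python
import cmath
import math

def dft2(img):
    """Naive, unnormalized 2D DFT of a small image given as a list of rows."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

def signed(k, n):
    """Map a DFT index 0..n-1 to a signed frequency centered on zero."""
    return k - n if 2 * k >= n else k

def ring_wedge_energy(img, n_rings=4, n_wedges=4):
    """Sum spectral power in radial rings (coarseness) and angular wedges
    (directionality). The DC term is skipped."""
    M, N = len(img), len(img[0])
    F = dft2(img)
    r_max = math.hypot(M / 2, N / 2)
    rings = [0.0] * n_rings
    wedges = [0.0] * n_wedges
    for u in range(M):
        for v in range(N):
            su, sv = signed(u, M), signed(v, N)
            if su == 0 and sv == 0:
                continue
            power = abs(F[u][v]) ** 2
            r = math.hypot(su, sv)
            rings[min(int(n_rings * r / r_max), n_rings - 1)] += power
            # Opposite directions carry the same power, so fold angles into [0, pi).
            theta = math.atan2(sv, su) % math.pi
            wedges[min(int(n_wedges * theta / math.pi), n_wedges - 1)] += power
    return rings, wedges

# Stripes that vary only along y, two cycles across an 8 x 8 image.
stripes = [[math.cos(2 * math.pi * 2 * y / 8) for y in range(8)] for x in range(8)]
rings, wedges = ring_wedge_energy(stripes)
```

For the striped test image nearly all power falls into a single ring and a single wedge, which is the behavior these features rely on: fine textures push energy into outer rings, and oriented textures concentrate it in one wedge.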

Page 23: Autoregressive and Random Field

Gabor Texture

The Gabor representation has been shown to be optimal in the sense of minimizing the joint two-dimensional uncertainty in space and frequency.

These filters can be considered as orientation- and scale-tunable edge and line (bar) detectors.

The statistics of these microfeatures in a given region are often used to characterize the underlying texture information.

B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. on PAMI, vol. 18, no. 8, 1996, pp. 837-842.

Page 24: Autoregressive and Random Field

Gabor Texture

Fourier coefficients depend on the entire image (global) → we lose spatial information

Objective: local spatial frequency analysis

Gabor kernels: look like a Fourier basis multiplied by a Gaussian

Gabor filters come in pairs: symmetric and anti-symmetric

We need to apply a number of Gabor filters at different scales, orientations, and spatial frequencies

[Figures: symmetric kernel; anti-symmetric kernel]

Page 25: Autoregressive and Random Field

Gabor Texture

The image I(x,y) is convolved with Gabor filters hmn (M x N filters in total)

The first and second moments (mean and standard deviation) of the filter response are computed for each scale and orientation

Features: e.g., 4 scales × 6 orientations × 2 statistics → 48 dimensions

[Figure: even and odd filters]
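How the 48-dimensional vector is assembled can be sketched as follows. The sketch assumes the filtering has already been done and that each response is available as a 2D list of magnitudes; it does not implement the Gabor filters themselves, and the function names are hypothetical.

```python
import math

def mean_and_std(values):
    """Mean and (population) standard deviation of a flat list of numbers."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return mu, sigma

def gabor_feature_vector(responses):
    """responses[m][n] is a 2D list with the magnitude of the image filtered
    at scale m and orientation n. Returns [mu_00, sigma_00, mu_01, sigma_01, ...]."""
    features = []
    for per_scale in responses:
        for magnitude_map in per_scale:
            flat = [value for row in magnitude_map for value in row]
            features.extend(mean_and_std(flat))
    return features

# Toy stand-in for real filter outputs: 4 scales x 6 orientations of 2 x 2 maps.
toy = [[[[m + n, m + n], [m + n + 2, m + n + 2]] for n in range(6)]
       for m in range(4)]
vector = gabor_feature_vector(toy)   # 4 * 6 * 2 = 48 values
```

Because the two statistics live on different numeric ranges, retrieval systems usually normalize each component before comparing vectors.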

Page 26: Autoregressive and Random Field

Gabor Texture

Arranging the mean energy in a 2D form (scale versus orientation):

structured: localized pattern

oriented (or directional): column pattern

granular: row pattern

random: random pattern

Page 27: Autoregressive and Random Field

Homogeneous Texture Descriptor

The frequency plane partition is uniform along the angular direction (in steps of 30º) and non-uniform along the radial direction (on an octave scale).

B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. on PAMI, vol. 18, no. 8, 1996, pp. 837-842.

Page 28: Autoregressive and Random Field

Gabor Function

On top of the feature channels, a 2D Gabor function (modulated Gaussian) is applied to each individual channel.

This is equivalent to weighting the Fourier transform coefficients of the image with a Gaussian centered at each of the frequency channels defined above.

Each channel filters a specific type of texture.

Page 29: Autoregressive and Random Field

Homogeneous Texture Descriptor

Partition the frequency domain into 30 channels (each modeled by a 2D Gabor function)

Compute the energy and energy deviation for each channel

Compute the mean and standard deviation of the frequency coefficients

HTD = {fDC, fSD, e1, e2, …, e30, d1, d2, …, d30} (62 values in total)

fDC and fSD are the mean and standard deviation of the image; ei and di are the mean energy and energy deviation of the corresponding ith channel

Page 30: Autoregressive and Random Field

Distance Measure
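The distance formula on this slide was lost in extraction. The form below is the normalized distance described in the Manjunath and Ma paper cited on this slide, written with symbols assumed here: μ and σ are the mean and standard deviation of the filter response at scale m and orientation n, and i, j index the two images.

```latex
d(i, j) = \sum_{m} \sum_{n} d_{mn}(i, j),
\qquad
d_{mn}(i, j) =
\left| \frac{\mu_{mn}^{(i)} - \mu_{mn}^{(j)}}{\alpha(\mu_{mn})} \right|
+
\left| \frac{\sigma_{mn}^{(i)} - \sigma_{mn}^{(j)}}{\alpha(\sigma_{mn})} \right|
```

Here α(μ_mn) and α(σ_mn) are the standard deviations of the respective features over the whole database, so every component contributes on a comparable scale.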

Resources: http://vision.ece.ucsb.edu/texture/feature.html

On-line demo: http://vision.ece.ucsb.edu/texture/mpeg7/index.html

B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. on PAMI, vol. 18, no. 8, 1996, pp. 837-842.

Page 31: Autoregressive and Random Field

Example: Browsing Satellite Images

Find a vegetation patch that looks like this region

B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. on PAMI, vol. 18, no. 8, 1996, pp. 837-842.

Page 32: Autoregressive and Random Field

Example: Browsing Satellite Images

(b) parts of highway

(c) region containing some buildings (center of the image toward the left)

(d) a number marked on the image (lower left corner)

Page 33: Autoregressive and Random Field

Wavelet Features

Wavelet transforms refer to the decomposition of a signal with a family of basis functions, using recursive filtering and subsampling.

At each level, the transform decomposes a 2D signal into four subbands, which are often referred to as LL, LH, HL, HH (L = low, H = high).

[Figure: two-level decomposition layout. The top-left quadrant is split into LL2, HL2, LH2, and HH2; the remaining quadrants are HL1 (top right), LH1 (bottom left), and HH1 (bottom right).]

Page 34: Autoregressive and Random Field

Wavelet Features

Use the mean and standard deviation of the energy distribution in each subband at each level.

PWT (Pyramid-structured wavelet transform)

  Recursively decomposes the LL band

  Results in a 20-dimensional feature vector (3x3x2+2=20)

TWT (Tree-structured wavelet transform)

  Some information appears in the middle frequency channels, so decomposition is not restricted to the LL band

  Results in a 40x2 = 80 dimensional feature vector

[Figure: original image, PWT decomposition, TWT decomposition]

T. Chang and C.C.J. Kuo, “Texture analysis and classification with tree-structure wavelet transform,” IEEE Trans. On Image Processing, vol. 2, no. 4, 1993, pp. 429-441.
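The pyramid-structured variant can be sketched in a few lines. This is an illustration under stated assumptions: the Haar wavelet is swapped in as the simplest possible filter (the slides do not name one), "energy" is taken as the absolute coefficient value, and the LH/HL naming follows one of several conventions in use.

```python
import math

def haar_step(img):
    """One level of a 2D Haar decomposition of an image with even dimensions.
    Returns (LL, LH, HL, HH), each half the size of img in both directions."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    LL = [[0.0] * cols for _ in range(rows)]
    LH = [[0.0] * cols for _ in range(rows)]
    HL = [[0.0] * cols for _ in range(rows)]
    HH = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            LL[i][j] = (a + b + c + d) / 2   # low-pass in both directions
            HL[i][j] = (a - b + c - d) / 2   # difference between columns
            LH[i][j] = (a + b - c - d) / 2   # difference between rows
            HH[i][j] = (a - b - c + d) / 2   # diagonal difference
    return LL, LH, HL, HH

def band_stats(band):
    """Mean and standard deviation of the absolute coefficients in a subband."""
    values = [abs(v) for row in band for v in row]
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return [mu, sigma]

def pwt_features(img, levels=3):
    """Decompose only the LL band at each level and collect subband statistics."""
    features = []
    current = img
    for _ in range(levels):
        current, LH, HL, HH = haar_step(current)
        for band in (LH, HL, HH):
            features += band_stats(band)
    features += band_stats(current)          # the final LL band
    return features

image = [[float((x * y) % 7) for y in range(16)] for x in range(16)]
features = pwt_features(image)
flat = pwt_features([[5.0] * 8 for _ in range(8)])
```

Three levels give 3 levels × 3 detail subbands × 2 statistics, plus 2 for the final LL band, which comes to 20 values. A tree-structured transform would also decompose selected detail subbands, which is why its feature vector is longer.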

Page 35: Autoregressive and Random Field

Edge Histogram Descriptor

Park, et al. “Efficient use of local edge histogram descriptor,” Proc. of ACM International Workshop on Standards, Interoperability and Practices, pp. 51-54, 2000.

Page 36: Autoregressive and Random Field

Introduction

Spatial distribution of edges

Edge histogram descriptor (EHD)

  Divide the image into 4x4 subimages, and generate the edge histogram based on the edges in the subimages.

  Edges are categorized into five types: vertical, horizontal, 45º diagonal, 135º diagonal, and nondirectional edges.

  A total of 5x16=80 histogram bins
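The 80-bin local histogram can be sketched as below. Several details are assumptions that do not come from the slide: the 2x2 filter coefficients are the ones commonly quoted for the MPEG-7 EHD, the threshold of 11 is only a plausible default, and each image block is reduced to a single 2x2 group of pixels to keep the example short.

```python
import math

# Assumed 2x2 edge filters; coefficients apply to the block values ordered
# (top-left, top-right, bottom-left, bottom-right).
S2 = math.sqrt(2)
FILTERS = {
    "vertical":       (1, -1, 1, -1),
    "horizontal":     (1, 1, -1, -1),
    "diagonal_45":    (S2, 0, 0, -S2),
    "diagonal_135":   (0, S2, -S2, 0),
    "nondirectional": (2, -2, -2, 2),
}
EDGE_TYPES = list(FILTERS)

def classify_block(block, threshold=11):
    """Return the edge type with the strongest filter response, or None when
    no response reaches the (hypothetical) threshold."""
    strengths = {name: abs(sum(c * v for c, v in zip(coef, block)))
                 for name, coef in FILTERS.items()}
    best = max(strengths, key=strengths.get)
    return best if strengths[best] >= threshold else None

def local_edge_histogram(img, threshold=11):
    """80 bins: 5 edge types for each of the 4 x 4 subimages, each normalized
    by the number of blocks in the subimage. Image sides must be at least 8."""
    sub_h, sub_w = len(img) // 4, len(img[0]) // 4
    bins = []
    for si in range(4):
        for sj in range(4):
            counts = dict.fromkeys(EDGE_TYPES, 0)
            blocks = 0
            for i in range(si * sub_h, (si + 1) * sub_h - 1, 2):
                for j in range(sj * sub_w, (sj + 1) * sub_w - 1, 2):
                    block = (img[i][j], img[i][j + 1],
                             img[i + 1][j], img[i + 1][j + 1])
                    blocks += 1
                    label = classify_block(block, threshold)
                    if label is not None:
                        counts[label] += 1
            bins.extend(counts[t] / blocks for t in EDGE_TYPES)
    return bins

# Alternating bright and dark columns: every block contains a vertical edge.
stripes = [[100 if j % 2 == 0 else 0 for j in range(8)] for i in range(8)]
hist = local_edge_histogram(stripes)
```

Blocks whose strongest response stays below the threshold count toward the normalization but not toward any bin, so smooth regions yield small bin values.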

Page 37: Autoregressive and Random Field

Local Edge Histogram

Page 38: Autoregressive and Random Field

Global, Semi-global, and Local Histograms

Global-edge histogram

  Accumulate five types of edge distributions for all subimages

Semiglobal-edge histogram

Page 39: Autoregressive and Random Field

Image Matching

Combine the local, semiglobal, and global histograms.

Total of 150 bins: 80 bins (local) + 5 bins (global) + 65 bins (13x5, semiglobal)

The L1 distance measure D(A,B) can be:
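The formula itself is missing from the transcript. The sketch below shows one way the 150 bins and the distance can be computed from two 80-bin local histograms. It rests on assumptions that should be checked against the Park et al. paper: the 13 semiglobal groups are taken to be 4 rows, 4 columns, and 5 blocks of 2x2 subimages; the global bins are weighted by 5; and bins are averaged rather than accumulated so that all values stay on the same scale.

```python
def global_bins(local):
    """5 bins: each edge type averaged over all 16 subimages."""
    return [sum(local[5 * s + t] for s in range(16)) / 16 for t in range(5)]

def semiglobal_bins(local):
    """65 bins: each edge type averaged over 13 groups of subimages."""
    groups = []
    groups += [[4 * r + c for c in range(4)] for r in range(4)]   # 4 rows
    groups += [[4 * r + c for r in range(4)] for c in range(4)]   # 4 columns
    for r0, c0 in [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]:       # 5 2x2 blocks
        groups.append([4 * (r0 + dr) + (c0 + dc)
                       for dr in (0, 1) for dc in (0, 1)])
    return [sum(local[5 * s + t] for s in group) / len(group)
            for group in groups for t in range(5)]

def l1(p, q):
    """Sum of absolute differences between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(p, q))

def ehd_distance(local_a, local_b, global_weight=5):
    """L1 distance over local, weighted global, and semiglobal bins."""
    return (l1(local_a, local_b)
            + global_weight * l1(global_bins(local_a), global_bins(local_b))
            + l1(semiglobal_bins(local_a), semiglobal_bins(local_b)))

flat_a = [0.0] * 80
flat_b = [1.0] * 80
distance = ehd_distance(flat_a, flat_b)
```

Only the 80 local bins need to be stored with each image, since the global and semiglobal bins are derived from them at matching time.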

This feature is one of the MPEG-7 texture descriptors.

Page 40: Autoregressive and Random Field

Performance Comparison

Retrieval performance of different texture features for the Corel photo databases.

L1 distance is used to compute the dissimilarity between images.

For the MRSAR, Mahalanobis distance is used.

[Figure: number of relevant images versus number of top matches considered. Curves: MRSAR (M), Gabor, TWT, PWT, MRSAR, Tamura (improved), coarseness histogram, directionality, edge histogram, Tamura (traditional).]

Manjunath and Ma, Chapter 12 of Image Database: Search and Retrieval of Digital Imagery, edited by V. Castelli and L.D. Bergman, John Wiley & Sons, 2002.

Page 41: Autoregressive and Random Field

Performance Comparison

Retrieval performance of different texture features for the Brodatz texture image set.

[Figure: percentage of retrieving all correct patterns versus number of top matches considered. Curves: MRSAR (M), Gabor, TWT, PWT, MRSAR, Tamura (improved), coarseness histogram, directionality, edge histogram, Tamura (traditional).]

Page 42: Autoregressive and Random Field

Shape for CBIR

Page 43: Autoregressive and Random Field

Shape Features

MPEG-7 provides contour-based shape and region-based shape tools.

[Figure: examples of region-based similarity and contour-based similarity]

Bober, “MPEG-7 visual shape descriptors”, IEEE Trans. on CSVT, vol. 11, no. 6, pp. 716-719, 2001.

Page 44: Autoregressive and Random Field

Region-Based Shape Descriptor

The region-based SD expresses the pixel distribution within a 2D object or region.

It can describe complex objects consisting of multiple disconnected regions.

2D Angular Radial Transformation (ART)

  Gives a compact and efficient way of describing multiple disjoint regions

  Robust to segmentation noise

Page 45: Autoregressive and Random Field

Angular Radial Transform (ART)

For each image, a set of ART coefficients Fnm is extracted:
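The defining equations are missing from the transcript. As commonly stated for the MPEG-7 region-based descriptor, with the image f(ρ, θ) expressed in polar coordinates on the unit disk (notation assumed here), the coefficients are:

```latex
F_{nm} = \int_{0}^{2\pi} \int_{0}^{1} V_{nm}^{*}(\rho, \theta)\, f(\rho, \theta)\, \rho \, d\rho \, d\theta,
\qquad
V_{nm}(\rho, \theta) = A_{m}(\theta)\, R_{n}(\rho)

A_{m}(\theta) = \frac{1}{2\pi}\, e^{jm\theta},
\qquad
R_{n}(\rho) = \begin{cases} 1, & n = 0 \\ 2 \cos(\pi n \rho), & n \neq 0 \end{cases}
```

Rotating the shape changes only the phase of F_nm, so the magnitudes |F_nm| are what a rotation-invariant descriptor would keep.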

The MPEG-7 Visual Part of the XM 4.0, ISO/IEC MPEG99/W3068, Dec. 1999.

W.-Y. Kim and Y.-S. Kim, “A New Region-Based Shape Descriptor,” ISO/IEC MPEG99/M5472, Maui, Hawaii, Dec. 1999.

Page 46: Autoregressive and Random Field

Contour-Based Shape Descriptor

The contour SD is based on the Curvature Scale-Space (CSS) representation of the contour.

  Distinguishes between shapes that have similar region-based shape (b)

  Supports search for shapes that are semantically similar, even with significant intra-class variability (c)

  Robust to significant nonrigid deformations (d) and to perspective transformation (e)

Page 47: Autoregressive and Random Field

Curvature Scale-Space (CSS)

When comparing shapes, humans tend to decompose shape contours into concave and convex sections.

  Features: how prominent they are, their length relative to the contour length, and their position and order on the contour

The CSS representation decomposes the contour into convex and concave sections by determining the inflection points (points at which curvature is zero)

Page 48: Autoregressive and Random Field

Curvature Scale-Space (CSS)

The CSS image shows how the inflection points change when filtering is applied to the contour

  X-axis corresponds to the position on the contour (clockwise, starting from an arbitrary point)

  Y-axis corresponds to the values of a shape smoothing parameter (as the y-values increase, the amount of smoothing increases)

Any black point in the CSS image signifies that at the corresponding position and at the corresponding scale, there is an inflection point.

Page 49: Autoregressive and Random Field

Curvature Scale-Space (CSS)

The smoothing is performed iteratively and for each level, the zero crossings of the curvature function are computed.

The CSS image is obtained by plotting all zero-crossing points on a plane
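The iteration described above can be sketched on a sampled closed contour. This is an illustration under simplifying assumptions, not the MPEG-7 procedure: a three-tap [1/4, 1/2, 1/4] kernel stands in for Gaussian smoothing, the sign of the turning direction at each sample stands in for the sign of the curvature, and all function names are hypothetical.

```python
import math

def smooth(contour):
    """One smoothing pass over a closed contour given as a list of (x, y)."""
    n = len(contour)
    return [tuple(0.25 * contour[i - 1][k] + 0.5 * contour[i][k]
                  + 0.25 * contour[(i + 1) % n][k] for k in (0, 1))
            for i in range(n)]

def turn_signs(contour):
    """Sign of the cross product of consecutive edges at each sample:
    +1 and -1 mark the two turning directions, 0 marks a straight run."""
    n = len(contour)
    signs = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = contour[i - 1], contour[i], contour[(i + 1) % n]
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        signs.append((cross > 0) - (cross < 0))
    return signs

def inflection_positions(contour):
    """Sample indices where the turning direction flips (curvature zero crossings)."""
    nonzero = [(i, s) for i, s in enumerate(turn_signs(contour)) if s != 0]
    return [i for k, (i, s) in enumerate(nonzero) if s != nonzero[k - 1][1]]

def css_points(contour, levels):
    """(position, level) pairs: the black points of a CSS image."""
    points = []
    current = contour
    for level in range(levels):
        points += [(pos, level) for pos in inflection_positions(current)]
        current = smooth(current)
    return points

angles = [2 * math.pi * i / 60 for i in range(60)]
# A three-lobed shape with three concave sections, and a circle for comparison.
star = [((1 + 0.5 * math.cos(3 * t)) * math.cos(t),
         (1 + 0.5 * math.cos(3 * t)) * math.sin(t)) for t in angles]
circle = [(math.cos(t), math.sin(t)) for t in angles]
points = css_points(star, 301)
```

Each concave section contributes a pair of inflection points that move toward each other and finally disappear as smoothing increases, which produces the arch-shaped traces in a CSS image. A convex contour such as the circle has no inflection points at any level.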

Mokhtarian and Mackworth, “A theory of multiscale, curvature-based shape representation for planar curves,” IEEE Trans. on PAMI, vol. 14, no. 8, pp. 789-805, 1992.

Page 50: Autoregressive and Random Field

Shape Descriptor

Based on CSS images, the descriptor consists of

  Eccentricity (how elongated the shape is) and circularity (how close it is to a circle) values of the original and filtered contour

  Number of peaks

  The magnitude (height) of the largest peak

  The x and y positions of the remaining peaks

Chapter 15 of Introduction to MPEG-7: Multimedia Content Description Interface. Edited by Manjunath, et al., John Wiley & Sons, 2002.

Page 51: Autoregressive and Random Field

Example: The QBIC System

Page 52: Autoregressive and Random Field

Example: The QBIC System

Color: color histogram

Texture: coarseness, contrast, directionality

Shape: area, circularity, eccentricity, major-axis direction

Fusion of multiple types of features often gives better performance.

Page 53: Autoregressive and Random Field

References

Tamura, et al. “Textural features corresponding to visual perception,” IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-8, no. 6, pp. 460-473, 1978.

Park, et al. “Efficient use of local edge histogram descriptor,” Proc. of ACM International Workshop on Standards, Interoperability and Practices, pp. 51-54, 2000.

Manjunath and Ma, Chapter 12 of Image Database: Search and Retrieval of Digital Imagery, edited by V. Castelli and L.D. Bergman, John Wiley & Sons, 2002.

Bober, “MPEG-7 visual shape descriptors,” IEEE Trans. on CSVT, vol. 11, no. 6, pp. 716-719, 2001.

Page 54: Autoregressive and Random Field

Next Week

Multidimensional Indexing Techniques