Autoregressive and Random Field

Wei-Ta Chu

Autoregressive and Random Field Texture Models


2009/11/5

Multimedia Content Analysis, CSIE, CCU

Announcement of Homework #2–Content-Based Image Retrieval


Goal: develop a basic CBIR system, or use an open-source library to build one

Requirements (case 1):
1. Write programs that use at least one color-based feature and one texture-based feature to automatically perform CBIR.
2. Write a report that describes:
2.1. How to run your program.
2.2. What kinds of features, distance metrics, and algorithms you used or compared.
2.3. Detection performance in precision and recall, or even ROC/PR curves.
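The precision and recall figures requested in the report can be computed per query as in the following minimal sketch. The function names and the use of image ids are illustrative assumptions, not part of the assignment.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of image ids returned by the system (hypothetical ids)
    relevant:  collection of image ids that are truly relevant to the query
    """
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def pr_curve(ranked, relevant):
    """Precision/recall pairs after each rank position, e.g. for a PR plot."""
    return [precision_recall(ranked[:k], relevant)
            for k in range(1, len(ranked) + 1)]
```

Averaging these values over all queries in the evaluation data gives one point per cut-off; sweeping the cut-off produces the PR curve.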



Requirements (case 2):
1. Set up a CBIR system based on an open-source library.
2. Write a report that describes:
2.1. How to set up this system, including environment settings, parameter settings, etc.
2.2. How to write a CBIR program based on this library.
2.3. What kinds of features, distance metrics, and algorithms the library uses.
2.4. Detection performance in precision and recall, or even ROC/PR curves.



Evaluation data: http://www.cs.ccu.edu.tw/~wtchu/courses/2009f_MCA/assignments.html

Homework submission: pack your programs and report into one zip file, and upload it to eCourse.

Deadline: 12:00, Nov. 22, 2009

Grades will be given based on retrieval performance and the descriptions in your report.

Random Field

Think of a textured image as a 2D array of random numbers. The pixel intensity at each location is a random variable.

One can model the image as a function f(r,w), where r is the position vector representing the pixel location, and w is a random parameter.

Once we select a specific texture w, f(r,w) is an image. f(r,w) is called a random field.

Random Field Model

A typical random field model is characterized by a set of neighbors.

Given an array of observations of pixel-intensity values {y(s)}, it’s natural to expect that the pixel values are locally correlated.

Markov model

Simultaneous Autoregressive Model (SAR)

A special case of Markov random field
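The model equation is not present in this transcript. As a sketch, the SAR model is commonly written in the following form, using the y(s) notation of the previous slide; the symbols D, θ, μ, and ε are assumptions introduced here for illustration.

```latex
y(s) = \mu + \sum_{r \in D} \theta(r)\, y(s + r) + \varepsilon(s)
```

Here D is the set of neighbor offsets, θ(r) are the neighbor weights, μ is a bias term tied to the mean intensity, and ε(s) is zero-mean noise with variance σ². Under this form, the estimated weights θ(r) and the noise level σ would serve as the texture features.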


Multiresolution SAR (MRSAR)

It’s not trivial to determine the appropriate size of the neighborhood.

The MRSAR model tries to account for the variability of texture primitives by defining the SAR model at different resolutions.

[Figure: the original image and its image pyramid, with a SAR model applied at each of three resolution levels]
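The image pyramid can be sketched as repeated 2x2 averaging. This is only one way to build a pyramid and is an assumption here; the slide does not say which reduction filter is used, and fitting the SAR model at each level is not shown.

```python
def downsample_2x2(image):
    """Halve the resolution by averaging each non-overlapping 2x2 block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * i][2 * j] + image[2 * i][2 * j + 1]
             + image[2 * i + 1][2 * j] + image[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [original, half resolution, quarter resolution, ...]."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x2(pyramid[-1]))
    return pyramid
```

A SAR feature vector estimated at each pyramid level would then be concatenated into one multiresolution descriptor.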

Spectral Texture Features


Introduction

Any function that periodically repeats itself can be expressed as the sum of sines and/or cosines of different frequencies, each multiplied by a different coefficient (a Fourier series).

Even functions that are not periodic can be expressed as the integral of sines and/or cosines multiplied by a weighting function. This formulation is the Fourier transform.

Definition of the Fourier Transform

Forward Continuous-Time Fourier Transform

Inverse Continuous-Time Fourier Transform

The forward transform is an analysis integral because it extracts spectrum information.

The inverse transform is a synthesis integral because it is used to create the time-domain signal from its spectral information.
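The two integrals are missing from this transcript. In one standard notation (the X(jω) symbol and the 1/2π placement are assumptions about the slide's convention), the forward and inverse transforms are:

```latex
X(j\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\, dt
\qquad
x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(j\omega)\, e^{j\omega t}\, d\omega
```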

Definition of the Fourier Transform

Time domain and frequency domain

It is common to say that we take the Fourier transform of x(t), meaning that we determine its spectrum so that we can use the frequency-domain representation of the signal.

We often say that we take the inverse Fourier transform to go from the frequency domain to the time domain.

Example: Forward Fourier Transform

Consider the one-sided exponential signal

Take the Fourier transform of x(t)

[Figure: the signal in the time domain and its spectrum in the frequency domain]
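The worked equations are not in this transcript. As a sketch, assume the one-sided exponential has the generic form below with a decay rate a > 0 and unit step u(t); the slide's actual constant is unknown.

```latex
x(t) = e^{-at}\, u(t), \quad a > 0
\qquad\Longrightarrow\qquad
X(j\omega) = \int_{0}^{\infty} e^{-at}\, e^{-j\omega t}\, dt = \frac{1}{a + j\omega}
```

The integral converges because the exponent has a negative real part, so the upper limit contributes nothing.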

Rectangular Pulse Signals

Consider the rectangular pulse

The Fourier transform is

The Fourier transform of the rectangular pulse signal is called a sinc function.

The formal definition of a sinc function is
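The formulas are missing from this transcript. A sketch under the assumption of a unit-height pulse of width T centered at the origin (the symbol T and the normalized sinc convention are assumptions):

```latex
x(t) = \begin{cases} 1, & |t| < T/2 \\ 0, & \text{otherwise} \end{cases}
\qquad
X(j\omega) = \int_{-T/2}^{T/2} e^{-j\omega t}\, dt = \frac{\sin(\omega T/2)}{\omega/2}
           = T\,\operatorname{sinc}\!\left(\frac{\omega T}{2\pi}\right),
\qquad
\operatorname{sinc}(\theta) = \frac{\sin(\pi\theta)}{\pi\theta}
```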


Discrete Fourier Transform

One-dimensional DFT

for u = 0, 1, 2, …, M-1

for x = 0, 1, 2, …, M-1

In order to compute F(u), we start by substituting u = 0 in the exponential term and then summing for all values of x. We then substitute u = 1, and so on. Like f(x), the transform is a discrete quantity, and it has the same number of components as f(x).


Euler’s formula: $e^{j\theta} = \cos\theta + j\sin\theta$


Each term of the Fourier transform (the value of F(u)) is composed of the sum of all values of the function f(x). The domain (values of u) over which the values of F(u) range is called the frequency domain, because u determines the frequency of the components of the transform. Each of the M terms of F(u) is called a frequency component of the transform.


Express F(u) in polar coordinates:

Magnitude or spectrum

Phase angle or phase spectrum
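The one-dimensional DFT, its inverse, and the polar quantities can be sketched directly from the description above. The placement of the 1/M factor on the forward transform is an assumption (it follows one common textbook convention), and the function names are illustrative.

```python
import cmath
import math


def dft(f):
    """Forward 1D DFT: for each u, sum f(x) * exp(-j*2*pi*u*x/M) over x.

    The 1/M factor is placed on the forward transform (assumed convention).
    """
    M = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * math.pi * u * x / M) for x in range(M)) / M
        for u in range(M)
    ]


def idft(F):
    """Inverse 1D DFT matching dft() above (no scaling factor here)."""
    M = len(F)
    return [
        sum(F[u] * cmath.exp(2j * math.pi * u * x / M) for u in range(M))
        for x in range(M)
    ]


def magnitude(F):
    """|F(u)| = sqrt(R(u)^2 + I(u)^2), the spectrum."""
    return [abs(c) for c in F]


def phase(F):
    """Phase angle of each component, computed as atan2(I(u), R(u))."""
    return [cmath.phase(c) for c in F]
```

With this convention F(0) is the mean of f(x), which is why the zero-frequency term is often called the DC component.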


Two-dimensional DFT
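The two-dimensional formulas are missing from this transcript. For an M x N image f(x,y), and assuming the same normalization as a 1/M forward factor in the one-dimensional case, the pair can be written as:

```latex
F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi (ux/M + vy/N)}
\qquad
f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j 2\pi (ux/M + vy/N)}
```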


Images in Frequency Domain

Gonzalez and Woods, Chapter 4 of Digital Image Processing, Prentice-Hall, 2001.

Images and Their FT


Frequency Domain Features

Fourier domain energy distribution

Angular features (directionality)

Radial features (coarseness)

Uniform division may not be the best

[Figure: the (u, v) frequency plane partitioned into angular wedges and into radial rings]
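A minimal sketch of the radial and angular energy features: compute the power spectrum, then accumulate it into rings (radial) and wedges (angular). The uniform division used here is the simplest choice, which the slide notes may not be the best; the function names and bin counts are assumptions, and the direct DFT is only practical for small images.

```python
import cmath
import math


def dft2_power(image):
    """Power spectrum |F(u,v)|^2 of a small grayscale image (direct DFT)."""
    M, N = len(image), len(image[0])
    power = []
    for u in range(M):
        row = []
        for v in range(N):
            total = 0j
            for x in range(M):
                for y in range(N):
                    angle = -2.0 * math.pi * (u * x / M + v * y / N)
                    total += image[x][y] * cmath.exp(1j * angle)
            row.append(abs(total / (M * N)) ** 2)
        power.append(row)
    return power


def ring_wedge_energy(power, n_rings, n_wedges):
    """Sum spectral energy into uniform radial rings and angular wedges."""
    M, N = len(power), len(power[0])
    max_radius = math.hypot(M / 2, N / 2)
    rings = [0.0] * n_rings
    wedges = [0.0] * n_wedges
    for u in range(M):
        for v in range(N):
            # Map array indices to signed frequencies centred on zero.
            fu = u - M if u > M // 2 else u
            fv = v - N if v > N // 2 else v
            if fu == 0 and fv == 0:
                continue  # skip the DC term
            radius = math.hypot(fu, fv)
            ring = min(int(n_rings * radius / max_radius), n_rings - 1)
            # The spectrum of a real image is symmetric, so fold to [0, pi).
            theta = math.atan2(fv, fu) % math.pi
            wedge = min(int(n_wedges * theta / math.pi), n_wedges - 1)
            rings[ring] += power[u][v]
            wedges[wedge] += power[u][v]
    return rings, wedges
```

Energy concentrated in outer rings suggests a fine texture, energy in inner rings a coarse one, and a dominant wedge indicates a dominant orientation.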

Gabor Texture

The Gabor representation has been shown to be optimal in the sense of minimizing the joint two-dimensional uncertainty in space and frequency.

These filters can be considered as orientation and scale tunable edge and line (bar) detectors.

The statistics of these microfeatures in a given region are often used to characterize the underlying texture information.

B.S. Manjunath and W.Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Trans. on PAMI, vol. 18, no. 8, 1996, pp. 837-842.

Gabor Texture

Fourier coefficients depend on the entire image (global) → we lose spatial information

Objective: local spatial frequency analysis

Gabor kernels: look like a Fourier basis multiplied by a Gaussian

Gabor filters come in pairs: symmetric and anti-symmetric

We need to apply a number of Gabor filters at different scales, orientations, and spatial frequencies

[Figure: a symmetric kernel and an anti-symmetric kernel]
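The description "a Fourier basis multiplied by a Gaussian" can be sketched as a pair of kernels: a cosine carrier gives the symmetric kernel and a sine carrier the anti-symmetric one. The isotropic Gaussian envelope and the parameter names below are simplifying assumptions, not the exact filter bank of the cited paper.

```python
import math


def gabor_pair(size, sigma, frequency, theta):
    """Return (symmetric, antisymmetric) Gabor kernels as 2D lists.

    size:      odd kernel width in pixels
    sigma:     standard deviation of the Gaussian envelope in pixels
    frequency: carrier frequency in cycles per pixel
    theta:     orientation of the carrier in radians
    """
    half = size // 2
    even, odd = [], []
    for y in range(-half, half + 1):
        even_row, odd_row = [], []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the sinusoid varies.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            even_row.append(envelope * math.cos(2.0 * math.pi * frequency * xr))
            odd_row.append(envelope * math.sin(2.0 * math.pi * frequency * xr))
        even.append(even_row)
        odd.append(odd_row)
    return even, odd
```

Calling this for several values of sigma and frequency (scales) and theta (orientations) yields the filter bank that the slides refer to.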

Gabor Texture

Image I(x,y) is convolved with Gabor filters hmn (M x N filters in total)

Using the first and second moments for each scale and orientation

Features: e.g., 4 scales, 6 orientations → 48 dimensions

[Figure: even and odd filter responses]
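The feature vector can be sketched as the mean and standard deviation of the response magnitude for every scale and orientation, which gives 4 x 6 x 2 = 48 values in the example above. The sketch assumes the filter responses have already been computed; the `responses[m][n]` layout is a hypothetical convention.

```python
import math


def mean_and_std(response):
    """Mean and standard deviation of |W(x, y)| over one filtered image."""
    values = [abs(v) for row in response for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def gabor_feature_vector(responses):
    """responses[m][n] is the filter output at scale m and orientation n.

    Returns [mu_00, sigma_00, mu_01, sigma_01, ...].
    """
    features = []
    for per_scale in responses:
        for response in per_scale:
            mu, sigma = mean_and_std(response)
            features.extend([mu, sigma])
    return features
```

Because `abs` is used, the same code works for real-valued responses and for complex responses formed from the symmetric and anti-symmetric outputs.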

Gabor Texture

Arranging the mean energy in a 2D form:
structured: localized pattern
oriented (or directional): column pattern
granular: row pattern
random: random pattern

[Figure: mean energy arranged on a 2D grid with scale and orientation as the axes]

Homogeneous Texture Descriptor

Frequency plane partition is uniform along the angular direction (30º steps) and non-uniform along the radial direction (on an octave scale)


Gabor Function

On top of the feature channels, the following 2D Gabor function (modulated Gaussian) is applied to each individual channel.

Equivalent to weighting the Fourier transform coefficients of the image with a Gaussian centered at the frequency channels as defined above

Each channel filters a specific type of texture

Homogeneous Texture Descriptor

Partition the frequency domain into 30 channels (modeled by a 2D Gabor function)

Computing the energy and energy deviation for each channel

Computing the mean and standard deviation of frequency coefficients

HTD = {fDC, fSD, e1, e2, …, e30, d1, d2, …, d30}

fDC and fSD are the mean and standard deviation of the image; ei and di are the mean energy and energy deviation of the corresponding ith channel

Distance Measure
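The distance formula itself is missing from this transcript. A commonly used form for texture feature vectors, assumed here rather than taken from the slide, is an L1 distance in which every component is divided by the standard deviation of that component over the whole database, so that no single feature dominates.

```python
import math


def component_std(database):
    """Standard deviation of every feature component over the database."""
    count = len(database)
    dims = len(database[0])
    stds = []
    for k in range(dims):
        column = [vector[k] for vector in database]
        mean = sum(column) / count
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in column) / count))
    return stds


def normalized_l1(features_a, features_b, stds):
    """Sum of |a_k - b_k| / std_k; assumes every std_k is non-zero."""
    return sum(abs(a - b) / s for a, b, s in zip(features_a, features_b, stds))
```

In use, `component_std` would be run once over all stored descriptors, and `normalized_l1` once per query and database image.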

Resources: http://vision.ece.ucsb.edu/texture/feature.html
On-line demo: http://vision.ece.ucsb.edu/texture/mpeg7/index.html


Example: Browsing Satellite Images

Find a vegetation patch that looks like this region

Example: Browsing Satellite Images

(b) parts of highway
(c) region containing some buildings (center of the image toward the left)
(d) a number marked on the image (lower left corner)


Wavelet Features

Wavelet transforms refer to the decomposition of a signal with a family of basis functions with recursive filtering and subsampling

At each level, it decomposes a 2D signal into four subbands, which are often referred to as LL, LH, HL, HH (L=low, H=high)

[Figure: two-level subband layout. The top-left quadrant is split into LL2, HL2, LH2, and HH2; the remaining quadrants are HL1 (top right), LH1 (bottom left), and HH1 (bottom right)]
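One decomposition level, and the recursive pyramid-structured variant, can be sketched with the Haar wavelet. The choice of Haar, the LH/HL naming, and the use of absolute coefficient values for the energy statistics are all assumptions made for illustration.

```python
import math


def haar_level(image):
    """One level of a 2D Haar decomposition (even height and width).

    Returns the subbands LL, LH, HL, HH at half resolution.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    ll = [[0.0] * cols for _ in range(rows)]
    lh = [[0.0] * cols for _ in range(rows)]
    hl = [[0.0] * cols for _ in range(rows)]
    hh = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            c, d = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # low-pass in both directions
            lh[i][j] = (a + b - c - d) / 2.0  # high-pass across rows (assumed naming)
            hl[i][j] = (a - b + c - d) / 2.0  # high-pass across columns (assumed naming)
            hh[i][j] = (a - b - c + d) / 2.0  # high-pass in both directions
    return ll, lh, hl, hh


def energy_stats(band):
    """Mean and standard deviation of the absolute coefficients."""
    values = [abs(v) for row in band for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [mean, std]


def pwt_features(image, levels):
    """Pyramid-structured features: only the LL band is decomposed again."""
    features = []
    current = image
    for _ in range(levels):
        current, lh, hl, hh = haar_level(current)
        for band in (lh, hl, hh):
            features.extend(energy_stats(band))
    features.extend(energy_stats(current))  # final LL band
    return features
```

With three levels this yields three detail subbands per level plus the final LL band, each contributing two statistics.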

Wavelet Features

Using the mean and standard deviation of the energy distribution in each subband at each level.

PWT (Pyramid-structured wavelet transform)
Recursively decompose the LL band
Results in a 20-dimensional feature vector (3x3x2+2=20)

TWT (Tree-structured wavelet transform)
Some information appears in the middle frequency channels, so decomposition is not restricted to the LL band
Results in a 40x2 = 80 dimensional feature vector

[Figure: original image, PWT decomposition, and TWT decomposition]

T. Chang and C.C.J. Kuo, “Texture analysis and classification with tree-structured wavelet transform,” IEEE Trans. on Image Processing, vol. 2, no. 4, 1993, pp. 429-441.

Edge Histogram Descriptor


Park, et al. “Efficient use of local edge histogram descriptor,” Proc. of ACM International Workshop on Standards, Interoperability and Practices, pp. 51-54, 2000.

Introduction

Spatial distribution of edges

Edge histogram descriptor (EHD)
Dividing the image into 4x4 subimages, and generating the edge histogram based on the edges in the subimages.
Edges are categorized into five types: vertical, horizontal, 45º diagonal, 135º diagonal, and nondirectional edges.
A total of 5x16=80 histogram bins
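The five-way edge categorization can be sketched as follows. Each subimage is split into small image blocks, each block into four sub-blocks, and five 2x2 filters are applied to the sub-block means. The filter coefficients and the threshold value below are assumptions based on the usual description of the descriptor, not values stated on the slide.

```python
import math

# (name, coefficients for top-left, top-right, bottom-left, bottom-right)
EDGE_FILTERS = [
    ("vertical", (1.0, -1.0, 1.0, -1.0)),
    ("horizontal", (1.0, 1.0, -1.0, -1.0)),
    ("diagonal_45", (math.sqrt(2.0), 0.0, 0.0, -math.sqrt(2.0))),
    ("diagonal_135", (0.0, math.sqrt(2.0), -math.sqrt(2.0), 0.0)),
    ("nondirectional", (2.0, -2.0, -2.0, 2.0)),
]


def classify_block(a0, a1, a2, a3, threshold=11.0):
    """Return the edge type of one image block, or None if it has no edge.

    a0..a3 are the mean intensities of the top-left, top-right,
    bottom-left, and bottom-right sub-blocks. The threshold is an
    assumed default.
    """
    best_name, best_strength = None, 0.0
    for name, (c0, c1, c2, c3) in EDGE_FILTERS:
        strength = abs(c0 * a0 + c1 * a1 + c2 * a2 + c3 * a3)
        if strength > best_strength:
            best_name, best_strength = name, strength
    return best_name if best_strength > threshold else None


def local_histogram(blocks):
    """Five normalized bins for one subimage, in EDGE_FILTERS order.

    blocks: list of (a0, a1, a2, a3) tuples, one per image block.
    Counts are divided by the total number of blocks, edge or not.
    """
    names = [name for name, _ in EDGE_FILTERS]
    counts = [0] * len(names)
    for block in blocks:
        edge = classify_block(*block)
        if edge is not None:
            counts[names.index(edge)] += 1
    return [count / len(blocks) for count in counts]
```

Concatenating `local_histogram` over the 16 subimages gives the 80 local bins.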

Local Edge Histogram

Global, Semi-global, and Local Histograms

Global-edge histogram: accumulate five types of edge distributions for all subimages

Semiglobal-edge histogram


Image Matching

Combining the local, the semiglobal, and the global histograms together.

Total of 150 bins: 80 bins (local) + 5 bins (global) + 65 bins (13x5, semiglobal)

The L1 distance measure D(A,B) can be:
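The formula is missing from this transcript. The sketch below derives the global and semiglobal histograms from the 80 local bins and sums the three L1 distances. The grouping of the 13 semiglobal clusters (4 rows, 4 columns, and five 2x2 groups), the averaging, and the weight on the global term are assumptions based on the usual description of this scheme, so treat them as illustrative.

```python
def l1(p, q):
    """Plain L1 distance between two equal-length histograms."""
    return sum(abs(x - y) for x, y in zip(p, q))


def global_histogram(local):
    """5 bins: average of each edge type over the 16 subimages.

    local holds 80 bins, 5 per subimage, subimages in row-major order.
    """
    return [sum(local[5 * s + e] for s in range(16)) / 16.0 for e in range(5)]


def semiglobal_histogram(local):
    """65 bins from 13 subimage groups (assumed grouping)."""
    groups = []
    for r in range(4):
        groups.append([4 * r + c for c in range(4)])  # four rows
    for c in range(4):
        groups.append([4 * r + c for r in range(4)])  # four columns
    for r0, c0 in [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]:  # five 2x2 groups
        groups.append([4 * (r0 + dr) + (c0 + dc)
                       for dr in (0, 1) for dc in (0, 1)])
    return [sum(local[5 * s + e] for s in group) / len(group)
            for group in groups for e in range(5)]


def ehd_distance(local_a, local_b, global_weight=5.0):
    """Local + weighted global + semiglobal L1 distance (weight assumed)."""
    return (l1(local_a, local_b)
            + global_weight * l1(global_histogram(local_a),
                                 global_histogram(local_b))
            + l1(semiglobal_histogram(local_a),
                 semiglobal_histogram(local_b)))
```

Weighting the global term compensates for it having only 5 bins against 80 local and 65 semiglobal bins.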

This feature is one of the MPEG-7 texture descriptors.

Performance Comparison

Retrieval performance of different texture features for the Corel photo databases.

L1 distance is used to compute the dissimilarity between images.

For the MRSAR, Mahalanobis distance is used.

[Figure: number of relevant images versus number of top matches considered, with curves for MRSAR (M), Gabor, TWT, PWT, MRSAR, Tamura (improved), coarseness histogram, directionality, edge histogram, and Tamura (traditional)]

Manjunath and Ma, Chapter 12 of Image Database: Search and Retrieval of Digital Imagery, edited by V. Castelli and L.D. Bergman, John Wiley & Sons, 2002.

Performance Comparison

Retrieval performance of different texture features for the Brodatz texture image set.

[Figure: percentage of retrieving all correct patterns versus number of top matches considered, with curves for MRSAR (M), Gabor, TWT, PWT, MRSAR, Tamura (improved), coarseness histogram, directionality, edge histogram, and Tamura (traditional)]

Shape for CBIR


Shape Features

MPEG-7 provides contour-based shape and region-based shape tools.

[Figure: examples of region-based similarity and contour-based similarity]

Bober, “MPEG-7 visual shape descriptors,” IEEE Trans. on CSVT, vol. 11, no. 6, pp. 716-719, 2001.

Region-Based Shape Descriptor

The region-based SD expresses the pixel distribution within a 2D object or region.

It can describe complex objects consisting of multiple disconnected regions.

2D Angular Radial Transform (ART)

Gives a compact and efficient way of describing multiple disjoint regions

Robust to segmentation noise

Angular Radial Transform (ART)

For each image, a set of ART coefficients Fnm is extracted:
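The defining equations are missing from this transcript. As a sketch of the usual formulation in polar coordinates (ρ, θ) on the unit disk, where f is the image and V_nm the basis function of order n and m (the exact normalization is an assumption):

```latex
F_{nm} = \langle V_{nm}(\rho,\theta),\, f(\rho,\theta) \rangle
       = \int_{0}^{2\pi} \int_{0}^{1} V_{nm}^{*}(\rho,\theta)\, f(\rho,\theta)\, \rho \, d\rho \, d\theta

V_{nm}(\rho,\theta) = A_{m}(\theta)\, R_{n}(\rho), \qquad
A_{m}(\theta) = \frac{1}{2\pi}\, e^{jm\theta}, \qquad
R_{n}(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases}
```

Using the magnitudes of the coefficients makes the description insensitive to rotation, since rotating the shape only changes the phase of each F_nm.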


• The MPEG-7 Visual Part of the XM 4.0, ISO/IEC MPEG99/W3068, Dec. 1999.
• W.-Y. Kim and Y.-S. Kim, “A New Region-Based Shape Descriptor,” ISO/IEC MPEG99/M5472, Maui, Hawaii, Dec. 1999.

Contour-Based Shape Descriptor

The contour SD is based on the Curvature Scale-Space (CSS) representation of the contour.

Distinguishes between shapes that have similar region-based shape (b)

Supports search for shapes that are semantically similar, even with significant intra-class variability (c)

Robust to significant nonrigid deformations (d) and to perspective transformation (e)

Curvature Scale-Space (CSS)

When comparing shapes, humans tend to decompose shape contours into concave and convex sections.

Features: how prominent they are, their length relative to the contour length, and their position and order on the contour

CSS representation decomposes the contour into convex and concave sections by determining the inflection points (points at which curvature is zero)


CSS image shows how the inflection points change when filtering is applied to the contour
X-axis corresponds to the position on the contour (clockwise, starting from any arbitrary point)
Y-axis corresponds to the values of a shape smoothing parameter (when y-values increase, the amount of smoothing increases)

Any black point in the CSS image signifies that at the corresponding position and at the corresponding scale, there is an inflection point.


The smoothing is performed iteratively and for each level, the zero crossings of the curvature function are computed.

The CSS image is obtained by plotting all zero-crossing points on a plane
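The construction can be sketched for a closed contour given as sampled x and y coordinates: smooth both coordinate sequences with a Gaussian, find where the curvature changes sign, and repeat for increasing amounts of smoothing. Smoothing from scratch at each sigma (rather than iteratively), sampling by index rather than arc length, and the finite-difference derivatives are simplifying assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_closed(values, sigma):
    """Circular convolution of one coordinate sequence with a Gaussian."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [
        sum(kernel[k + radius] * values[(i + k) % n]
            for k in range(-radius, radius + 1))
        for i in range(n)
    ]


def curvature_numerator(xs, ys):
    """x'y'' - y'x'' by central differences; has the sign of the curvature."""
    n = len(xs)
    out = []
    for i in range(n):
        dx = (xs[(i + 1) % n] - xs[i - 1]) / 2.0
        dy = (ys[(i + 1) % n] - ys[i - 1]) / 2.0
        ddx = xs[(i + 1) % n] - 2.0 * xs[i] + xs[i - 1]
        ddy = ys[(i + 1) % n] - 2.0 * ys[i] + ys[i - 1]
        out.append(dx * ddy - dy * ddx)
    return out


def zero_crossings(xs, ys, sigma):
    """Contour indices where the curvature of the smoothed contour changes sign."""
    sx, sy = smooth_closed(xs, sigma), smooth_closed(ys, sigma)
    k = curvature_numerator(sx, sy)
    n = len(k)
    return [i for i in range(n) if k[i] * k[(i + 1) % n] < 0]


def css_image(xs, ys, sigmas):
    """List of (sigma, contour index) points: the black points of the CSS image."""
    return [(sigma, i) for sigma in sigmas for i in zero_crossings(xs, ys, sigma)]
```

As sigma grows the contour becomes convex and the zero crossings disappear, so each concavity appears as an arch in the plotted points whose height reflects how prominent it is.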

Mokhtarian and Mackworth, “A theory of multiscale, curvature-basedshape representation for planar curves,” IEEE Trans. on PAMI, vol. 14, no. 8, pp. 789-805, 1992.

Shape Descriptor

Based on CSS images, the descriptor consists of
Eccentricity and circularity values of the original and filtered contour
Number of peaks
The magnitude (height) of the largest peak
The x and y positions of the remaining peaks

Chapter 15 of Introduction to MPEG-7: Multimedia Content Description Interface. Edited by Manjunath, et al., John Wiley & Sons, 2002.

Example: The QBIC System

Color: color histogram

Texture: coarseness, contrast, directionality

Shape: area, circularity, eccentricity, major-axis direction

Fusion of multiple types of features often gives better performance.

References

Tamura, et al., “Textural features corresponding to visual perception,” IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-8, no. 6, pp. 460-473, 1978.

Park, et al., “Efficient use of local edge histogram descriptor,” Proc. of ACM International Workshop on Standards, Interoperability and Practices, pp. 51-54, 2000.

Manjunath and Ma, Chapter 12 of Image Database: Search and Retrieval of Digital Imagery, edited by V. Castelli and L.D. Bergman, John Wiley & Sons, 2002.

Bober, “MPEG-7 visual shape descriptors,” IEEE Trans. on CSVT, vol. 11, no. 6, pp. 716-719, 2001.

Next Week

Multidimensional Indexing Techniques

