NIPS2008: tutorial: statistical models of visual images

Transcript of NIPS2008: tutorial: statistical models of visual images

Page 1: NIPS2008: tutorial: statistical models of visual images

Statistical Image Models

Eero Simoncelli

Howard Hughes Medical Institute, Center for Neural Science, and

Courant Institute of Mathematical Sciences, New York University

Pages 2-3: NIPS2008: tutorial: statistical models of visual images

Photographic Images

Diverse specialized structures:

• edges/lines/contours

• shadows/highlights

• smooth regions

• textured regions

Occupy a small region of the full space

Page 4: NIPS2008: tutorial: statistical models of visual images

space of all images

typical images

One could describe this set as a deterministic manifold....

Pages 5-9: NIPS2008: tutorial: statistical models of visual images

• Step edges are rare (lighting, junctions, texture, noise)

• One scale’s texture is another scale’s edge

• Need seamless transitions from isolated features to dense textures

Pages 10-12: NIPS2008: tutorial: statistical models of visual images

One could describe this set as a deterministic manifold... but it seems more natural to use probability

space of all images

typical images

P(x)

Pages 13-14: NIPS2008: tutorial: statistical models of visual images

“Applications”

• Engineering: compression, denoising, restoration, enhancement/modification, synthesis, manipulation

• Science: optimality principles for neurobiology (evolution, development, learning, adaptation)

[Hubel ‘95]

Pages 15-18: NIPS2008: tutorial: statistical models of visual images

Density models

nonparametric: build a histogram from lots of observations...

parametric/constrained: use “natural constraints” (geometry/photometry of image formation, computation, maxEnt)

historical trend (technology driven)

Pages 19-21: NIPS2008: tutorial: statistical models of visual images

[Figure: original image (range [0, 237], dims [256, 256]) and histogram-equalized image (range [1.99, 238], dims [256, 256]), each shown with its pixel-intensity histogram over [0, 250]]
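A minimal sketch of histogram equalization as illustrated on this slide, using NumPy. The skewed toy "image" and the bin count are illustrative choices, not from the tutorial.

```python
import numpy as np

def equalize(img, nbins=256):
    """Histogram-equalize an image so pixel values are ~uniformly distributed."""
    # Empirical CDF of pixel values
    hist, bin_edges = np.histogram(img.ravel(), bins=nbins, density=True)
    cdf = np.cumsum(hist)
    cdf /= cdf[-1]                      # normalize to [0, 1]
    # Map each pixel through the CDF, then rescale to the original range
    flat = np.interp(img.ravel(), bin_edges[:-1], cdf)
    return (flat * (img.max() - img.min()) + img.min()).reshape(img.shape)

rng = np.random.default_rng(0)
img = rng.exponential(30.0, size=(256, 256)).clip(0, 237)  # skewed toy "image"
eq = equalize(img)
```

The equalized image has a much flatter intensity histogram than the original, which is the effect shown on the slide.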

Pages 22-24: NIPS2008: tutorial: statistical models of visual images

General methodology

Observe “interesting” joint statistics

Transform to optimal representation

“Onion peeling”

Page 25: NIPS2008: tutorial: statistical models of visual images

Evolution of image models

I. (1950’s): Fourier + Gaussian

II. (mid 80’s - late 90’s): Wavelets + kurtotic marginals

III. (mid 90’s - present): Wavelets + local context

• local amplitude (contrast)

• local orientation

IV. (last 5 years): Hierarchical models

Pages 26-27: NIPS2008: tutorial: statistical models of visual images

Pixel correlation

[Figure: (a) scatter plots of I(x, y) against I(x+1, y), I(x+2, y), and I(x+4, y); (b) correlation as a function of spatial separation (pixels)]

Pages 28-32: NIPS2008: tutorial: statistical models of visual images

Translation invariance

Assuming translation invariance:

=> covariance matrix is Toeplitz (convolutional)

=> eigenvectors are sinusoids

=> can diagonalize (decorrelate) with F.T.

Power spectrum captures full covariance structure
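The chain of implications above can be checked numerically. This is a sketch under a circulant (periodic-boundary) approximation to the Toeplitz covariance, which the DFT diagonalizes exactly; the exponential covariance profile is an illustrative choice.

```python
import numpy as np

n = 64
# Stationary (translation-invariant) covariance: entries depend only on |i-j|.
# Use a circulant approximation so the DFT diagonalizes it exactly.
lags = np.minimum(np.arange(n), n - np.arange(n))   # circular distance
row = np.exp(-lags / 5.0)                           # covariance of first row
C = np.array([np.roll(row, i) for i in range(n)])   # circulant covariance

# The DFT basis diagonalizes C: F C F^H should be (nearly) diagonal
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
D = F @ C @ F.conj().T
off_diag = np.abs(D - np.diag(np.diag(D))).max()

# The diagonal of D is the power spectrum: the DFT of the covariance row
spectrum = np.fft.fft(row).real
```

The off-diagonal entries of D vanish to machine precision, and the diagonal equals the DFT of the covariance row, i.e. the power spectrum captures the full covariance structure.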

Pages 33-34: NIPS2008: tutorial: statistical models of visual images

Spectral power

Structural: assume scale-invariance,

F(sω) = s^(−p) F(ω)

then:

F(ω) ∝ 1/ω^p

[Ritterman 52; DeRiugin 56; Field 87; Tolhurst 92; Ruderman/Bialek 94; ...]

Empirical: [Figure: log10 power vs. log10 spatial frequency (cycles/image), falling approximately linearly, consistent with a 1/ω^p power law]
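The empirical power-law check can be sketched as follows. Here a synthetic 1/f image stands in for a photograph (the image size, seed, and spectral exponent p = 2 are illustrative assumptions): its radially averaged power spectrum should fall as a straight line in log-log coordinates.

```python
import numpy as np

# Synthesize a 1/f image (power ~ 1/f^2) and verify that its radially
# averaged power spectrum follows a power law in log-log coordinates.
rng = np.random.default_rng(1)
n = 256
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
f = np.sqrt(fx**2 + fy**2)
f[0, 0] = 1.0                                  # avoid divide-by-zero at DC
amp = 1.0 / f                                  # amplitude spectrum ~ 1/f
phase = np.exp(2j * np.pi * rng.random((n, n)))
img = np.real(np.fft.ifft2(amp * phase))

# Radially averaged power spectrum
power = np.abs(np.fft.fft2(img))**2
r = (f * n).round().astype(int)
radial = np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())

# Fit log-power vs log-frequency: slope should be close to -2
freqs = np.arange(1, n // 2)
slope = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)[0]
```

For photographic images the same measurement yields slopes near −2, i.e. p ≈ 2, as in the plot on the slide.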

Page 35: NIPS2008: tutorial: statistical models of visual images

Principal Components Analysis (PCA) + whitening

[Figure: scatter plots (a), (b), (c) illustrating PCA rotation and whitening of pixel data]
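A minimal 2-D sketch of the PCA + whitening pipeline shown in the panels, using synthetic correlated "adjacent pixel" pairs (the correlation strength and sample count are illustrative, not from the slide).

```python
import numpy as np

# PCA + whitening of correlated pixel pairs (a toy 2-D version of the slide)
rng = np.random.default_rng(2)
n = 10_000
x = rng.normal(size=n)
data = np.stack([x, 0.9 * x + 0.44 * rng.normal(size=n)], axis=0)  # 2 x n

C = data @ data.T / n                        # sample covariance
evals, evecs = np.linalg.eigh(C)             # principal axes
rotated = evecs.T @ data                     # rotate: decorrelate
white = rotated / np.sqrt(evals)[:, None]    # whiten: equalize variances

Cw = white @ white.T / n                     # ~ identity
```

After rotation the covariance is diagonal; after whitening it is (up to sampling noise) the identity.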

Pages 36-37: NIPS2008: tutorial: statistical models of visual images

PCA basis for image blocks

PCA is not unique

Page 38: NIPS2008: tutorial: statistical models of visual images

Maximum entropy (maxEnt)

The density with maximal entropy satisfying E(f(x)) = c is of the form

p_ME(x) ∝ exp(−λ f(x))

where λ depends on c.

Examples: f(x) = x² (Gaussian), f(x) = |x| (Laplacian)
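A quick numerical check of the maxEnt principle for the second example: among densities with a fixed mean absolute value E|x| = 1, the Laplacian (the maxEnt solution for f(x) = |x|) has higher differential entropy than a Gaussian matched to the same constraint. The closed-form entropies used here are standard results.

```python
import numpy as np

# MaxEnt check: with E|x| fixed at 1, the Laplacian should have higher
# differential entropy than a Gaussian satisfying the same constraint.
b = 1.0                                   # Laplacian scale, E|x| = b
H_laplacian = 1.0 + np.log(2.0 * b)       # differential entropy of Laplacian

sigma = np.sqrt(np.pi / 2.0)              # Gaussian with E|x| = sigma*sqrt(2/pi) = 1
H_gaussian = 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)
```

H_laplacian ≈ 1.693 nats exceeds H_gaussian ≈ 1.645 nats, consistent with the Laplacian being the maxEnt density under an E|x| constraint.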

Page 39: NIPS2008: tutorial: statistical models of visual images

Model I (Fourier/Gaussian)

[Figure: basis set (Fourier), coefficient density (Gaussian), and a sample image drawn from the model]

Pages 40-42: NIPS2008: tutorial: statistical models of visual images

Gaussian model is weak

[Diagram: the Fourier/Gaussian generative model: coefficients drawn from P(c), spectrally shaped by 1/f², inverse-transformed (F⁻¹) to a sample image P(x); scatter plots as in the PCA slide. The resulting sample lacks natural image structure]

Page 43: NIPS2008: tutorial: statistical models of visual images

Bandpass Filter Responses

[Figure: log-probability histogram of bandpass filter responses (heavy-tailed, peaked at zero) compared with a Gaussian density of the same variance]

[Burt&Adelson 82; Field 87; Mallat 89; Daugman 89, ...]

Page 44: NIPS2008: tutorial: statistical models of visual images

“Independent” Components Analysis (ICA)

For Linearly Transformed Factorial (LTF) sources: guaranteed independence (with some minor caveats)

[Figure: scatter plots (a)-(d) illustrating recovery of factorial sources by a linear transformation]

[Comon 94; Cardoso 96; Bell/Sejnowski 97; ...]

Page 45: NIPS2008: tutorial: statistical models of visual images

ICA on image blocks

[Olshausen/Field ’96; Bell/Sejnowski ’97] [example obtained with FastICA, Hyvarinen]

Page 46: NIPS2008: tutorial: statistical models of visual images

Marginal densities

Well-fit by a generalized Gaussian: P(x) ∝ exp(−|x/s|^p)

[Mallat 89; Simoncelli&Adelson 96; Moulin&Liu 99; ...]

Fig. 4. Log histograms of a single wavelet subband of four example images (see Fig. 1 for image description). For each histogram, tails are truncated so as to show 99.8% of the distribution. Also shown (dashed lines) are fitted model densities corresponding to equation (3). Maximum-likelihood exponents and relative entropies (Kullback-Leibler divergence of model and histogram, as a fraction of the histogram's total entropy): p = 0.46 (ΔH/H = 0.0031), p = 0.58 (ΔH/H = 0.0011), p = 0.48 (ΔH/H = 0.0014), p = 0.59 (ΔH/H = 0.0012).

non-Gaussian than others. By the mid 1990s, a number of authors had developed methods of optimizing a basis of filters in order to maximize the non-Gaussianity of the responses [e.g., 36, 4]. Often these methods operate by optimizing a higher-order statistic such as kurtosis (the fourth moment divided by the squared variance). The resulting basis sets contain oriented filters of different sizes with frequency bandwidths of roughly one octave. Figure 5 shows an example basis set, obtained by optimizing kurtosis of the marginal responses to an ensemble of 12 × 12 pixel blocks drawn from a large ensemble of natural images. In parallel with these statistical developments, authors from a variety of communities were developing multi-scale orthonormal bases for signal and image analysis, now generically known as “wavelets” (see chapter 4.2 in this volume). These provide a good approximation to optimized bases such as that shown in Fig. 5.

Once we’ve transformed the image to a multi-scale wavelet representation, what statistical model can we use to characterize the coefficients? The statistical motivation for the choice of basis came from the shape of the marginals, and thus it would seem natural to assume that the coefficients within a subband are independent and identically distributed. With this assumption, the model is completely determined by the marginal statistics of the coefficients, which can be examined empirically as in the examples of Fig. 4. For natural images, these histograms are surprisingly well described by a two-parameter generalized Gaussian (also known as a stretched, or generalized exponential) distribution [e.g., 31, 47, 34]:

P_c(c; s, p) = exp(−|c/s|^p) / Z(s, p),   (3)

where the normalization constant is Z(s, p) = 2 (s/p) Γ(1/p). An exponent of p = 2 corresponds to a Gaussian density, and p = 1 corresponds to the Laplacian density.

Fig. 5. Example basis functions derived by optimizing a marginal kurtosis criterion [see 35].
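The maximum-likelihood fit of the exponent p in equation (3) can be sketched as follows. The function name and optimizer bounds are illustrative choices; the ML scale s given p follows by setting the derivative of the log-likelihood to zero, giving s = (p · mean(|c|^p))^(1/p).

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

# ML fit of the generalized Gaussian exponent p for coefficients c,
# P(c; s, p) = exp(-|c/s|^p) / Z(s,p),  Z(s,p) = 2 (s/p) Gamma(1/p).
def fit_generalized_gaussian(c):
    c = np.abs(np.asarray(c, float)) + 1e-12
    def neg_loglik(p):
        s = (p * np.mean(c**p)) ** (1.0 / p)        # ML scale given p
        logZ = np.log(2.0 * s / p) + gammaln(1.0 / p)
        return np.mean((c / s) ** p) + logZ         # mean negative log-likelihood
    res = minimize_scalar(neg_loglik, bounds=(0.1, 4.0), method="bounded")
    return res.x

rng = np.random.default_rng(3)
lap = rng.laplace(size=50_000)      # true p = 1
gau = rng.normal(size=50_000)       # true p = 2
```

Applied to samples from known distributions, the fit recovers p ≈ 1 for Laplacian and p ≈ 2 for Gaussian data; wavelet subbands of natural images yield p well below 1, as in Fig. 4.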

Page 47: NIPS2008: tutorial: statistical models of visual images

Kurtosis vs. bandwidth

[Figure: sample kurtosis of filter responses as a function of filter bandwidth (octaves); after Field 87]

Note: Bandwidth matters much more than orientation [see Bethge 06]

Page 48: NIPS2008: tutorial: statistical models of visual images

Octave-bandwidth representations

[Figure: example filter and its spatial-frequency selectivity]

Page 49: NIPS2008: tutorial: statistical models of visual images

Model II (LTF)

[Figure: basis set, coefficient density, and a sample image drawn from the model]

Page 50: NIPS2008: tutorial: statistical models of visual images

LTF also a weak model...

[Figure: a sample image, Gaussianized; a sample image, ICA-transformed and Gaussianized]

Pages 51-53: NIPS2008: tutorial: statistical models of visual images

Trouble in paradise

• Biology: Visual system uses a cascade
- Where’s the retina? The LGN?
- What happens after V1? Why don’t responses get sparser? [Baddeley etal 97; Chechik etal 06]

• Statistics: Images don’t obey ICA source model
- Any bandpass filter gives sparse marginals [Baddeley 96] => shallow optimum [Bethge 06; Lyu & Simoncelli 08]
- The responses of ICA filters are highly dependent [Wegmann & Zetzsche 90; Simoncelli 97]

Page 54: NIPS2008: tutorial: statistical models of visual images

Conditional densities

[Figure: conditional histograms of one wavelet coefficient given an adjacent coefficient; the conditional variance grows with the magnitude of the conditioning coefficient (a “bowtie” dependency)]

Linear responses are not independent, even for optimized filters!

[Simoncelli 97; Schwartz&Simoncelli 01]

Page 55: NIPS2008: tutorial: statistical models of visual images

[Schwartz&Simoncelli 01]

Page 56: NIPS2008: tutorial: statistical models of visual images

• Large-magnitude subband coefficients are found at neighboring positions, orientations, and scales.

Page 57: NIPS2008: tutorial: statistical models of visual images

Method 1: Conditional Gaussian

Modeling heteroscedasticity (i.e., variable variance)

P(x_n | {x_k}) ∼ N(0; Σ_k w_nk |x_k|² + σ²)

[Simoncelli 97; Buccigrossi&Simoncelli 99; see also ARCH models in econometrics!]
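The conditional-variance (ARCH-like) idea can be illustrated with a toy 1-D simulation. The slowly varying variance field, its clipped-Cauchy distribution, and the equal neighbor weights are all illustrative choices, not the model fit in the cited papers.

```python
import numpy as np

# Toy conditional-Gaussian model: each coefficient is Gaussian with a variance
# that varies slowly across position, so neighbors' squared values predict it.
rng = np.random.default_rng(4)
n = 100_000
z = np.abs(rng.standard_cauchy(n)).clip(0.1, 10)   # heavy-tailed variance field
z = np.convolve(z, np.ones(5) / 5, mode="same")    # locally smoothed
x = rng.normal(size=n) * np.sqrt(z)                # heteroscedastic coefficients

# Energy dependency: squared center vs. average squared neighbors
neigh = (x[:-2] ** 2 + x[2:] ** 2) / 2
center = x[1:-1] ** 2
corr_model = np.corrcoef(center, neigh)[0, 1]      # positive under the model

# Control: i.i.d. Gaussian coefficients of the same overall variance
g = rng.normal(size=n) * x.std()
corr_iid = np.corrcoef(g[1:-1] ** 2, (g[:-2] ** 2 + g[2:] ** 2) / 2)[0, 1]
```

The squared-magnitude correlation is clearly positive for the heteroscedastic coefficients and near zero for the i.i.d. control, mirroring the "bowtie" conditional histograms of the previous slide.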

Page 58: NIPS2008: tutorial: statistical models of visual images

Joint densities

[Figure: empirical joint and conditional histograms of wavelet coefficient pairs: adjacent, near, far, other scale, other orientation]

Fig. 8. Empirical joint distributions of wavelet coefficients associated with different pairs of basis functions, for a single image of a New York City street scene (see Fig. 1 for image description). The top row shows joint distributions as contour plots, with lines drawn at equal intervals of log probability. The three leftmost examples correspond to pairs of basis functions at the same scale and orientation, but separated by different spatial offsets. The next corresponds to a pair at adjacent scales (but the same orientation, and nearly the same position), and the rightmost corresponds to a pair at orthogonal orientations (but the same scale and nearly the same position). The bottom row shows corresponding conditional distributions: brightness corresponds to frequency of occurrence, except that each column has been independently rescaled to fill the full range of intensities.

remain. First, although the normalized coefficients are certainly closer to a homogeneous field, the signs of the coefficients still exhibit important structure. Second, the variance field itself is far from homogeneous, with most of the significant values concentrated on one-dimensional contours.

4 Discussion

After nearly 50 years of Fourier/Gaussian modeling, the late 1980s and 1990s saw a sudden and remarkable shift in viewpoint, arising from the confluence of (a) multi-scale image decompositions, (b) non-Gaussian statistical observations and descriptions, and (c) variance-adaptive statistical models based on hidden variables. The improvements in image processing applications arising from these ideas have been steady and substantial. But the complete synthesis of these ideas, and development of further refinements, are still underway.

Variants of the GSM model described in the previous section seem to represent the current state-of-the-art, both in terms of characterizing the density of coefficients, and in terms of the quality of results in image processing applications. There are several issues that seem to be of primary importance in trying to extend such models. First, a number of authors have examined different methods of describing regularities in the local variance field. These include spatial random fields [23, 26, 24], and multiscale tree-structured models [40, 55]. Much of the structure in the variance field may be attributed to discontinuous features such as edges, lines, or corners. There is a substantial literature in computer vision describing such structures [e.g., 57, 32, 17, 27, 56], but it has proven difficult to establish models that are both explicit and flexible. Finally, there have been several recent studies investigating geometric regularities that arise from the continuity of contours and boundaries [45, 16, 19, 21, 60]. These and other image structures will undoubtedly be incorporated into future statistical models, leading to further improvements in image processing applications.


[Simoncelli, ‘97; Wainwright&Simoncelli, ‘99]

• Nearby: densities are approximately circular/elliptical

• Distant: densities are approximately factorial

Pages 59-60: NIPS2008: tutorial: statistical models of visual images

ICA-transformed joint densities

[Figure: kurtosis of projections vs. orientation for the data (ICA’d), a factorialized version, and a sphericalized version, at dimensions d = 2, 16, 32]

• Local densities are elliptical (but non-Gaussian)
• Distant densities are factorial

[Wegmann&Zetzsche ‘90; Simoncelli ’97; + many recent models]

Page 61: NIPS2008: tutorial: statistical models of visual images

Spherical vs LTF

[Figure: histograms of kurtosis of projections onto random unit-norm basis functions, for image-block data (ICA’d), factorialized, and sphericalized versions, at block sizes 3x3, 7x7, and 15x15]

• Histograms, kurtosis of projections of image blocks onto random unit-norm basis functions
• These imply data are closer to spherical than factorial

[Lyu & Simoncelli 08]

Page 62: NIPS2008: tutorial: statistical models of visual images

Non-Gaussian elliptical observations and models of natural images:

- Zetzsche & Krieger, 1999
- Huang & Mumford, 1999
- Wainwright & Simoncelli, 2000
- Hyvärinen and Hoyer, 2000
- Parra et al., 2001
- Srivastava et al., 2002
- Sendur & Selesnick, 2002
- Teh et al., 2003
- Gehler and Welling, 2006
- Lyu & Simoncelli, 2008
- etc.

Page 63: NIPS2008: tutorial: statistical models of visual images

Modeling heteroscedasticity

Method 2: Hidden scaling variable for each patch

Gaussian scale mixture (GSM) [Andrews & Mallows 74]: x⃗ = √z u⃗, where

• u⃗ is Gaussian
• z > 0 and u⃗ are independent
• x⃗ is elliptically symmetric, with covariance ∝ C_u
• marginals of x⃗ are leptokurtotic

[Wainwright&Simoncelli 99]

Page 64: NIPS2008: tutorial: statistical models of visual images

GSM - prior on z

• Empirically, z is approximately lognormal [Portilla etal, icip-01]:

p_z(z) = exp(−(log z − μ_l)² / (2σ_l²)) / (z (2π σ_l²)^(1/2))

• Alternatively, can use Jeffrey’s noninformative prior [Figueiredo&Nowak, ‘01; Portilla etal, ‘03]:

p_z(z) ∝ 1/z

Page 65: NIPS2008: tutorial: statistical models of visual images

GSM simulation

[Figure: joint/conditional histograms for image data (left) and a GSM simulation (right), showing similar variance dependencies]

Image data vs. GSM simulation

[Wainwright & Simoncelli, NIPS*99]
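The GSM simulation can be sketched in a few lines: multiply a Gaussian variable by the square root of an independent positive scaling variable. The lognormal parameters here are illustrative; the slide cites empirical evidence that z is approximately lognormal.

```python
import numpy as np

# Gaussian scale mixture: x = sqrt(z) * u, with u Gaussian and z lognormal.
# The marginals of x should be leptokurtotic (kurtosis > 3).
rng = np.random.default_rng(5)
n = 200_000
u = rng.normal(size=n)                       # Gaussian component
z = np.exp(rng.normal(0.0, 1.0, size=n))     # lognormal scaling variable
x = np.sqrt(z) * u

kurtosis = np.mean(x**4) / np.mean(x**2) ** 2   # = 3 for a Gaussian
```

For these parameters the kurtosis is roughly 3e ≈ 8, well above the Gaussian value of 3, reproducing the heavy-tailed marginals observed for wavelet coefficients.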

Page 66: NIPS2008: tutorial: statistical models of visual images

Model III (GSM)

[Figure: basis set, GSM coefficient density, and a sample image drawn from the model]

Page 67: NIPS2008: tutorial: statistical models of visual images

Original coefficients vs. coefficients normalized by √z

[Figure: marginal log-histograms and joint histograms of wavelet coefficients, before and after dividing out the local standard deviation √z; normalization Gaussianizes the marginals and removes the variance dependency across subbands]

[Schwartz&Simoncelli 01]

[Ruderman&Bialek 94]

Page 68: NIPS2008: tutorial: statistical models of visual images

[Figure: model encoding cost (bits/coeff) vs. empirical entropy. Left: conditional-model encoding cost vs. empirical conditional entropy, with the first-order ideal shown. Right: Gaussian-model and generalized-Laplacian encoding costs vs. empirical first-order entropy (bits/coeff)]

[Buccigrossi & Simoncelli 99]

Page 69: NIPS2008: tutorial: statistical models of visual images

Bayesian denoising

• Additive Gaussian noise: y = x + w

P(y|x) ∝ exp(−(y − x)² / 2σ_w²)

• Bayes’ least squares solution is conditional mean:

x̂(y) = E(x|y) = ∫ dx P(y|x) P(x) x / P(y)

Page 70: NIPS2008: tutorial: statistical models of visual images

I. Classical

If signal is Gaussian, BLS estimator is linear:

x̂(y) = σ_x² / (σ_x² + σ_n²) · y

[Figure: denoised (x̂) vs. noisy (y): a straight line through the origin with slope less than 1]

=> suppress fine scales, retain coarse scales
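The linear (Wiener) estimator above can be verified in simulation. The signal and noise variances here are illustrative; the theoretical minimum MSE is σ_x² σ_n² / (σ_x² + σ_n²), which equals 0.8 for these values.

```python
import numpy as np

# Linear BLS (Wiener) estimator for a Gaussian signal in Gaussian noise:
# x_hat = sigma_x^2 / (sigma_x^2 + sigma_n^2) * y
rng = np.random.default_rng(6)
n = 100_000
sigma_x, sigma_n = 2.0, 1.0
x = rng.normal(0, sigma_x, n)
y = x + rng.normal(0, sigma_n, n)

shrink = sigma_x**2 / (sigma_x**2 + sigma_n**2)
x_hat = shrink * y

mse_linear = np.mean((x_hat - x) ** 2)     # ~ sigma_x^2 sigma_n^2 / (sigma_x^2 + sigma_n^2)
mse_noisy = np.mean((y - x) ** 2)          # ~ sigma_n^2
```

The shrinkage factor (4/5 here) attenuates the observation toward zero, reducing the MSE from σ_n² = 1 to 0.8, the Wiener bound for a Gaussian prior.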

Page 71: NIPS2008: tutorial: statistical models of visual images

Non-Gaussian coefficients

[Burt&Adelson ‘81; Field ‘87; Mallat ‘89; Daugman ‘89; etc]

[Figure: log-probability histogram of bandpass filter responses compared with a Gaussian density of the same variance]

Page 72: NIPS2008: tutorial: statistical models of visual images

II. BLS for non-Gaussian prior

• Assume marginal distribution [Mallat ‘89]: P(x) ∝ exp(−|x/s|^p)

• Then Bayes estimator is generally nonlinear:

[Figure: BLS estimator curves for p = 2.0, p = 1.0, p = 0.5]

[Simoncelli & Adelson, ‘96]

Page 73: NIPS2008: tutorial: statistical models of visual images

MAP shrinkage

[Figure: MAP shrinkage functions for p = 2.0, p = 1.0, p = 0.5]

[Simoncelli 99]
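The MAP shrinkage functions for the two analytically tractable cases can be sketched as follows: for p = 2 the estimator is linear shrinkage, and for p = 1 (Laplacian prior) it is soft thresholding. Both follow from minimizing (y − x)²/(2σ²) + |x/s|^p; the function name is illustrative.

```python
import numpy as np

# MAP estimators for y = x + w with prior P(x) ~ exp(-|x/s|^p) and Gaussian
# noise variance sigma^2: minimize (y - x)^2 / (2 sigma^2) + |x/s|^p.
def map_shrink(y, s=1.0, sigma=1.0, p=1.0):
    if p == 2.0:
        # Quadratic prior: linear shrinkage
        return y / (1.0 + 2.0 * sigma**2 / s**2)
    if p == 1.0:
        # Laplacian prior: soft thresholding at sigma^2 / s
        t = sigma**2 / s
        return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)
    raise NotImplementedError("general p requires numeric optimization")

y = np.linspace(-5, 5, 11)
lin = map_shrink(y, p=2.0)     # straight line (slope 1/3 for s = sigma = 1)
soft = map_shrink(y, p=1.0)    # dead zone around zero, then unit-slope line
```

For p < 1 the estimator develops a discontinuous "hard threshold" character, matching the rightmost curve on the slide.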

Page 74: NIPS2008: tutorial: statistical models of visual images

Denoising: Joint

E(x|y⃗) = ∫ dz P(z|y⃗) E(x|y⃗, z) = ∫ dz P(z|y⃗) [z C_u (z C_u + C_w)⁻¹ y⃗]_ctr

where

P(z|y⃗) = P(y⃗|z) P(z) / P(y⃗)

P(y⃗|z) = exp(−y⃗ᵀ (z C_u + C_w)⁻¹ y⃗ / 2) / √((2π)^N |z C_u + C_w|)

Numerical computation of the solution is reasonably efficient if one jointly diagonalizes C_u and C_w ...

[Portilla, Strela, Wainwright, Simoncelli, ’03]

Page 75: NIPS2008: tutorial: statistical models of visual images

Example estimators

[Figure: estimated coefficient vs. noisy coefficient for the scalar and single-neighbor cases, with noise level σ_w indicated]

[Portilla etal 03]

Page 76: NIPS2008: tutorial: statistical models of visual images

Comparison to other methods

Results averaged over 3 images

[Figure: PSNR difference (dB) relative to BLS-GSM as a function of input noise level, for several competing denoising methods]

[Portilla etal 03]

Page 77: NIPS2008: tutorial: statistical models of visual images

[Figure: Original; Noisy (22.1 dB); Matlab’s wiener2 (28 dB); BLS-GSM (30.5 dB)]

Page 78: NIPS2008: tutorial: statistical models of visual images

[Figure: Original; Noisy (8.1 dB); Undecimated wavelet thresholding (19.0 dB); BLS-GSM (21.2 dB)]

Page 79: NIPS2008: tutorial: statistical models of visual images

Real sensor noise

[Figure: 400 ISO camera image and denoised result]

Page 80: NIPS2008: tutorial: statistical models of visual images

GSM summary

• GSM captures local variance

• Underlying Gaussian leads to simple computation

• Excellent denoising results

• What’s missing?

• Global model of z variables [Wainwright etal 99; Romberg etal ‘99; Hyvarinen/Hoyer ‘02; Karklin/Lewicki ‘02; Lyu/Simoncelli 08]

• Explicit geometry: phase and orientation

Page 81: NIPS2008: tutorial: statistical models of visual images

Global models for z

• Non-overlapping neighborhoods, tree-structured z [Wainwright etal 99; Romberg etal ’99]

• Field of GSMs: z is an exponentiated GMRF, u⃗ is a GMRF, subband is the product [Lyu&Simoncelli 08]

[Figure: tree of z variables from coarse scale to fine scale]

Page 82: NIPS2008: tutorial: statistical models of visual images

State-of-the-art denoising

Fig. 6. Performance comparison of denoising methods for three different images (Barbara, Lena, Boats). Plotted are differences in PSNR for different input noise levels (σ) between FoGSM and four other methods (BM3D [37], BLS-GSM [17], kSVD [39] and FoE [27]). The PSNR values for these methods were taken from corresponding publications.

Fig. 7. Denoising results using local GSM [17] (PSNR = 25.45 dB) and FoGSM (PSNR = 26.40 dB), for a noisy image (σ = 50, PSNR = 14.15 dB).

[Lyu&Simoncelli, PAMI 08]

Page 83: NIPS2008: tutorial: statistical models of visual images

2-band steerable pyramid: Image decomposition in terms of multi-scale gradient measurements

Measuring Orientation

[Simoncelli et al., 1992; Simoncelli & Freeman 1995]

Pages 84-87: NIPS2008: tutorial: statistical models of visual images

Multi-scale gradient basis

• Multi-scale bases: efficient representation

• Derivatives: good for analysis

• Local Taylor expansion of image structures

• Explicit geometry (orientation)

• Combination:

• Explicit incorporation of geometry in basis

• Bridge between PDE / harmonic analysis approaches

Page 88: NIPS2008: tutorial: statistical models of visual images

[Figure: decomposition of gradient coefficients into local orientation and magnitude]

[Hammond&Simoncelli 06; cf. Oppenheim and Lim 81]

Page 89: NIPS2008: tutorial: statistical models of visual images

Importance of local orientation

[Figure: reconstruction with randomized orientation vs. randomized magnitude]

[Hammond&Simoncelli 05]

Page 90: NIPS2008: tutorial: statistical models of visual images

Reconstruction from orientation

• Reconstruction by projections onto convex sets

• Resilient to quantization

[Figure: original image and reconstruction from orientations quantized to 2 bits]

[Hammond&Simoncelli 06]

Page 91: NIPS2008: tutorial: statistical models of visual images

Image patches related by rotation

two-band steerable pyramid coefficients[Hammond&Simoncelli 06]

Page 92: NIPS2008: tutorial: statistical models of visual images

[Figure: raw patches vs. rotated patches]

PCA of normalized gradient patches

[Hammond&Simoncelli 06]

Pages 93-94: NIPS2008: tutorial: statistical models of visual images

Orientation-Adaptive GSM model

Model a vectorized patch of wavelet coefficients as x⃗ = √z R(θ) u⃗, where R(θ) is a patch rotation operator and z, θ are hidden magnitude/orientation variables.

Conditioned on z and θ, x⃗ is zero-mean Gaussian with covariance z R(θ) C R(θ)ᵀ.

[Hammond&Simoncelli 06]

Page 95: NIPS2008: tutorial: statistical models of visual images

[Hammond&Simoncelli 06]

Estimation of C(θ) from noisy data

The covariance C(θ) of the clean signal is unknown; approximate it using covariances measured from the noisy patches, assuming the signal and noise are independent, the noise is rotationally invariant, and (w.l.o.g.) E[z] = 1.

Pages 96-101: NIPS2008: tutorial: statistical models of visual images

Bayesian MMSE Estimator

Condition on and integrate over the hidden variables (z, θ): the inner estimate is a Wiener estimate, using a separable prior for the hidden variables.

[Hammond&Simoncelli 06]

Page 102: NIPS2008: tutorial: statistical models of visual images

[Figure: noisy image (σ = 40, 2.81 dB), gsm2 result (12.4 dB), oagsm result (13.1 dB)]

Page 103: NIPS2008: tutorial: statistical models of visual images

Locally adaptive covariance

• Karklin & Lewicki 08: Each patch is Gaussian, with covariance constructed from a weighted outer-product of fixed vectors:

p(x⃗) = G(x⃗; C(y⃗)),  log C(y⃗) = Σ_n y_n B_n,  B_n = Σ_k w_nk b⃗_k b⃗_kᵀ,  p(y⃗) = Π_n exp(−|y_n|)

• Guerrero-Colon, Simoncelli & Portilla 08: Each patch is a mixture of GSMs (MGSMs):

p(x⃗) = Σ_k P_k ∫ p(z_k) G(x⃗; z_k C_k) dz_k

Page 104: NIPS2008: tutorial: statistical models of visual images

MGSMs generative model

Patch x⃗ chosen from {√z₁ u⃗₁, √z₂ u⃗₂, ..., √z_K u⃗_K} with probabilities {P₁, P₂, ..., P_K}

Parameters:

• Covariances C_k

• Scale densities p_k(z_k)

• Component probabilities P_k

• Number of components K

Parameters can be fit to data of one or more images by maximizing likelihood (EM-like)

[Guerrero-Colon, Simoncelli, Portilla 08]

Page 105: NIPS2008: tutorial: statistical models of visual images

MGSM “segmentation”

[Figure: first six eigenvectors of the GSM covariance matrices, for several images]

[Guerrero-Colon, Simoncelli, Portilla 08]

Page 106: NIPS2008: tutorial: statistical models of visual images

MGSM “segmentation”

Eigenvectors of GSM components represent invariant subspaces: “generalized complex cells”

Page 107: NIPS2008: tutorial: statistical models of visual images

Potential of local homogeneous models?

Consider an implicit model: maxEnt subject to constraints on subband coefficients:

• marginal statistics [var, skew, kurtosis]

• local raw correlations

• local variance correlations

• local phase correlations

[Portilla & Simoncelli 00; cf. Zhu, Wu & Mumford 97]

Pages 108-110: NIPS2008: tutorial: statistical models of visual images

Visual texture

Homogeneous, with repeated structures

“You know it when you see it”

Page 111: NIPS2008: tutorial: statistical models of visual images

All Images

Texture Images

Equivalence class (visually indistinguishable)

Page 112: NIPS2008: tutorial: statistical models of visual images

Iterative synthesis algorithm

[Diagram: Analysis: example texture → transform → measure statistics. Synthesis: random seed → transform → measure statistics → adjust → inverse transform → synthesized texture, iterated to convergence]

[Portilla&Simoncelli 00; cf. Heeger&Bergen ‘95]
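The analysis/synthesis loop can be sketched in a drastically simplified form: alternately impose the example's power spectrum and pixel histogram on a noise seed. This toy version (the two constraint functions, the quasi-periodic example, and the iteration count are all illustrative) uses only two statistics, whereas the actual algorithm constrains many joint wavelet statistics.

```python
import numpy as np

# Toy analysis/synthesis loop: alternately project onto the set of images with
# the example's power spectrum and the set with the example's pixel histogram.
def match_histogram(x, target):
    order = np.argsort(x.ravel())
    out = np.empty_like(x.ravel())
    out[order] = np.sort(target.ravel())    # rank-order substitution
    return out.reshape(x.shape)

def match_spectrum(x, target):
    mag = np.abs(np.fft.fft2(target))       # impose target magnitude spectrum,
    phase = np.angle(np.fft.fft2(x))        # keep current phases
    return np.real(np.fft.ifft2(mag * np.exp(1j * phase)))

rng = np.random.default_rng(7)
n = 64
yy, xx = np.mgrid[:n, :n]
example = np.sin(2 * np.pi * xx / 8) + 0.3 * rng.normal(size=(n, n))  # toy texture

synth = rng.normal(size=(n, n))              # random seed
for _ in range(20):
    synth = match_spectrum(synth, example)   # impose spectral statistics
    synth = match_histogram(synth, example)  # impose marginal statistics
```

After iterating, the synthesized image exactly matches the example's pixel histogram and approximately matches its power spectrum, the simplest instance of the "adjust until statistics match" loop in the diagram.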

Page 113: NIPS2008: tutorial: statistical models of visual images

Examples: Artificial

Page 114: NIPS2008: tutorial: statistical models of visual images

Photographic, quasi-periodic

Page 115: NIPS2008: tutorial: statistical models of visual images

Photographic, aperiodic

Page 116: NIPS2008: tutorial: statistical models of visual images

Photographic, structured

Page 117: NIPS2008: tutorial: statistical models of visual images

Photographic, color

Page 118: NIPS2008: tutorial: statistical models of visual images

Non-textures?

Pages 119-121: NIPS2008: tutorial: statistical models of visual images

Texture mixtures

Convex combinations in parameter space

=> Parameter space includes non-textures

Page 122: NIPS2008: tutorial: statistical models of visual images

Summary

• Fusion of empirical data with structural principles

• Statistical models have led to state-of-the-art image processing, and are relevant for biological vision

• Local adaptation to {variance, orientation, phase, ...} gives improvement, but makes learning harder

• Cascaded representations emerge naturally

• There’s still much room for improvement!

Page 123: NIPS2008: tutorial: statistical models of visual images

Cast

• Local GSM model: Martin Wainwright, Javier Portilla

• GSM Denoising: Javier Portilla, Martin Wainwright, Vasily Strela

• Variance-adaptive compression: Robert Buccigrossi

• Local orientation and OAGSM: David Hammond

• Field of GSMs: Siwei Lyu

• Mixture of GSMs: Jose-Antonio Guerrero-Colón, Javier Portilla

• Texture representation/synthesis: Javier Portilla