Statistics of natural images


1

Statistics of natural images

May 30, 2010 · Ofer Bartal · Alon Faktor

2

Outline

• Motivation
• Classical statistical models
• New MRF model approach
• Learning the models
• Applications and results

3

Motivation

• Big variance in appearance
• Can we even dream of modeling this?

4

Motivation

• Main questions:
– Do all natural images obey some common "rules"?
– How can one find these "rules"?
– How can we use these "rules" for computer vision tasks?

5

Motivation

• Why bother to model at all?
• "Noise", uncertainty
• A model helps choose the "best" possible answer
• Let's see some examples

Natural image model

6

Noise-blur removal

• Consider the classical deconvolution problem
• It can be formulated as a linear set of equations:

$$Y = h * X + N \quad\Longleftrightarrow\quad y = Hx + n$$

(the blur kernel $h$ written as a convolution matrix $H$, with noise $n$; $X$ is the unknown image)

7

[Figure: the convolution matrix $H$ applied to the unknown $X$, plus noise $N$, equals the observed $Y$]

Noise-blur removal

8

Inpainting

$$y = Ax + n$$

$A$ is an identity matrix with rows deleted:

$$A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ & & \ddots & & \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix}$$

Missing rows of the identity matrix = missing pixels (an under-determined system)
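To make the structure of $A$ concrete, here is a minimal numpy sketch; the image size and the set of observed pixels are made up for illustration:

```python
import numpy as np

n_pixels = 16                                        # a tiny 4x4 image, flattened
observed = np.array([0, 1, 3, 4, 7, 8, 11, 12, 15])  # indices of known pixels

A = np.eye(n_pixels)[observed]    # identity matrix with the missing rows deleted
x = np.random.rand(n_pixels)      # the unknown image (flattened)
y = A @ x                         # we only observe some of the pixels
print(A.shape)                    # (9, 16): fewer equations than unknowns
```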

9

Motivation

• Problems:
– Unknown noise
– H may be singular (deconvolution)
– H may be under-determined (inpainting)
• So there can be many solutions. How can we find the "right" one?

10

Motivation

• Goal: estimate x
• Assume:
– A prior model of the natural image: $P_x(x)$
– A prior model of the noise: $P_n(n)$
• Use the MAP estimator to find x:

$$x^* = \arg\max_x P(x \mid y) = \arg\max_x P(y \mid x)\,P(x) = \arg\max_x P_n(y - Hx)\,P_x(x)$$

11

Energy Minimization problem

• The MAP problem can be reformulated as:

$$\hat{x} = \arg\min_x \; \underbrace{E(y \mid x)}_{\text{data term}} + \underbrace{E(x)}_{\text{prior term}}$$
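The equivalence is just a negative logarithm: taking $-\log$ of the MAP objective turns the product of the noise and prior terms into a sum of energies,

$$\hat{x} = \arg\max_x P_n(y - Hx)\,P_x(x) = \arg\min_x \underbrace{-\log P_n(y - Hx)}_{E(y \mid x)} \; \underbrace{-\log P_x(x)}_{E(x)}$$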

13

Classical models

• Smoothness prior (a model of image gradients)
– Gaussian prior (LS problem)
– L1 prior and sparse prior (IRLS problem)

Image gradient

14

Gaussian Priors

• Assume:
– A Gaussian prior on the gradients of x: $p(x) \propto \exp\!\left(-\dfrac{\|\nabla x\|^2}{2\sigma^2}\right)$
– Gaussian noise: $n \sim \mathcal{N}(0, \sigma_n^2)$
• Under these assumptions, $x^* = \arg\max_x P_n(y - Hx)\,P_x(x)$ reduces to a least-squares problem (a closed-form sketch follows):

$$x^* = \arg\min_x \; x^T T x - 2 b^T x$$
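For denoising ($H = I$) the Gaussian case has a closed-form solution: the quadratic objective is minimized by solving one linear system. A minimal 1-D numpy sketch; the regularization weight and the test signal are arbitrary illustrative choices, not values from the slides:

```python
import numpy as np

def gaussian_prior_denoise(y, lam=5.0):
    """Minimize ||y - x||^2 + lam * ||Dx||^2 by solving (I + lam * D^T D) x = y."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)   # (n-1) x n finite-difference (gradient) matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

y = np.sin(np.linspace(0, 3, 200)) + 0.1 * np.random.randn(200)  # noisy signal
x_hat = gaussian_prior_denoise(y)
```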

15

Non-Gaussian Priors

• Empirical results: image gradients have a non-Gaussian, heavy-tailed distribution
• We therefore assume an L1 or sparse prior
• We solve the resulting problem by IRLS (iteratively re-weighted least squares); a sketch follows below
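A minimal IRLS sketch, continuing the 1-D setup above: the L1 gradient penalty is approximated by a weighted quadratic, with weights $w_i = 1/(|(Dx)_i| + \epsilon)$ recomputed each iteration so large gradients (edges) are penalized less. The iteration count and weights are illustrative guesses:

```python
import numpy as np

def irls_denoise(y, lam=2.0, n_iter=20, eps=1e-3):
    """Approximate min ||y - x||^2 + lam * |Dx|_1 by re-weighted least squares."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)            # finite-difference matrix
    x = y.copy()
    for _ in range(n_iter):
        w = 1.0 / (np.abs(D @ x) + eps)       # small gradients -> large weight
        x = np.linalg.solve(np.eye(n) + lam * D.T @ (w[:, None] * D), y)
    return x
```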

16

De-convolution Results

[Figures: blurred image, Gaussian-prior result, sparse-prior result]

Good results on simple images

17

De-noising Results

[Figures: noisy image, de-noising result]

Poor results on real natural images

18

Classical models – Pros and Cons

• Advantages:
– Simple and easy to implement
• Disadvantages:
– Too heuristic
– Captures only one property: smoothness
– Biased towards totally smooth images

[Figures: a totally smooth image and a natural image compared by their probability P]

19

Going Beyond Classical Models

[Plots: probability vs. number of similar patches (in log10 scale), for three example images]

20

Modern Approach

• The model is based on image properties
• Choose the properties using an image dataset

• Questions:
1. What types of properties? Responses to linear filters.
2. How do we find good properties? Either a pre-determined bank, or learn them from data.
3. How should we combine the properties into one distribution? We will see how.

21

Mathematical framework

• Want: a model p(I) of the real distribution f(I).
• Computationally hard:
– A 100x100-pixel image has 10,000 variables
• We can explicitly model only a few dimensions at a time

[Figure: arrow = a viewpoint of a few dimensions]

22

Mathematical framework

• A viewpoint is a response to a linear filter
• A distribution over these responses is a marginal of the real distribution f(I)
• (Marginal = a distribution over a subset of the variables)

Arrow = marginal of f(I)

23

Mathematical framework

• If p(I) and f(I) have the same marginal distributions for all linear filters, then p(I) = f(I) (a proposition by Zhu and Mumford)

• "Hope": if we choose K "good" filters, then p(I) and f(I) will be "close".

How do we measure "close"?

24

Distance between distributions

• Kullback-Leibler divergence:

$$KL\big(f(I),\, p(I; S, \Lambda)\big) = E_f\big[\log f(I)\big] - E_f\big[\log p(I; S, \Lambda)\big]$$

• Problem: f(I) is unknown
• Proposition: use instead

$$\widetilde{KL}\big(f(I),\, p(I; S, \Lambda)\big) = E_{I \sim p(I;S,\Lambda)}\big[\log p(I; S, \Lambda)\big] - E_{I \in X}\big[\log p(I; S, \Lambda)\big]$$

where X is the set of observed images.

• This measures the fit of the model to the observations

25

Illustration

[Illustration: $\widetilde{KL}$ as the gap between the average $\log p(I; S, \Lambda)$ over model samples and over the observed images X]

26

Getting synthesized images

• Get synthesized images by sampling the learned model
• Sample using Markov Chain Monte Carlo (MCMC)
• Drawback: the learning process is slow

27

Our model P(I) – A MRF

• MRF = Markov Random Field
• An MRF is based on a graph G=(V,E):
– V = pixels
– E = edges between pixels that affect each other

• Our distribution is the MRF:

$$p(I) = \frac{1}{Z}\exp\left(-\sum_{c \in \text{Cliques}} U_c\big(I^{(c)}\big)\right)$$

28

Simple grid MRF

• Here, cliques are edges• Every pixel belongs to 4 cliques

29

MRF

• We limit ourselves to:

– Cliques of a fixed size (overlapping patches)

– The same potential for all cliques

• We get:

$$U_c\big(I^{(c)}\big) = \sum_{k=1}^{K} \lambda^{(k)T} F^{(k)}\big(I^{(c)}\big)$$

$$p(I) = \frac{1}{Z}\exp\left(-\sum_{c=1}^{C}\sum_{k=1}^{K} \lambda^{(k)T} F^{(k)}\big(I^{(c)}\big)\right)$$

30

MRF simulation

$$p(I) = \frac{1}{Z}\exp\left(-\sum_{c=1}^{C}\sum_{k=1}^{K} \lambda^{(k)T} F^{(k)}\big(I^{(c)}\big)\right)$$

31

Histogram simulation

$H_n^{obs,(\alpha)}(I)$ — the histogram of a marginal

32

MRF

• In terms of convolutions:

$$p(I; S, \Lambda) = \frac{1}{Z}\exp\left(-\sum_{k=1}^{K}\sum_{(x,y)} \lambda^{(k)}\big(F^{(k)} * I\,(x,y)\big)\right)$$

• Denote the set of potential functions: $\Lambda = \{\lambda^{(1)}, \ldots, \lambda^{(K)}\}$

• Denote the set of filters: $S = \{F^{(1)}, \ldots, F^{(K)}\}$
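Evaluating the unnormalized model is then just filtering and summing. A sketch, with two illustrative gradient filters and a heavy-tailed stand-in for the learned potentials $\lambda^{(k)}$ (none of these are the learned quantities):

```python
import numpy as np
from scipy.signal import convolve2d

filters = [np.array([[1.0, -1.0]]),      # horizontal gradient (illustrative)
           np.array([[1.0], [-1.0]])]    # vertical gradient (illustrative)

def potential(z):
    # stand-in for a learned potential: heavy-tailed log(1 + z^2 / 2)
    return np.log(1.0 + 0.5 * z**2)

def neg_log_p(I):
    """-log p(I; S, Lambda), up to the constant log Z."""
    return sum(potential(convolve2d(I, F, mode='valid')).sum() for F in filters)
```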

33

MRF - A simple example

• Cliques of size 1
• Pixels are i.i.d., distributed by the grayscale histogram

[Figure: grayscale histogram]

Drawback: the cliques are too small

34

MRF - Another simple example

• Clique = the whole image
• Result: a uniform distribution over the images in the dataset

Drawback: the cliques are too big

37

Revisiting classical models

• Actually, the classical model is a pairwise MRF:

$$p(I) = \frac{1}{Z}\, e^{-\sum_{(x,y)} \psi\big(\nabla_x I(x,y)\big) + \psi\big(\nabla_y I(x,y)\big)}$$

• It has cliques of size 2

• It has only 2 linear filters => 2 marginals

• There is no guarantee that p(I) will be close to f(I)

39

Zhu and Mumford’s approach (1997)

• We want to find K "good" filters
• Strategy:
– Start off with a bank B of possible filters
– Choose the subset $S \subset B$, $|S| = K$, that minimizes the distance between p(I) and f(I)
– For computational reasons, choose the filters one by one using a greedy method

41

Choosing the next filter

• AIG = the difference between the model p(I) and the data from the viewpoint of marginal

• AIF = the difference in between different images in dataset from the viewpoint of marginal

( ) ( ) ( )IC AIG AIF

( ) ( )( ; , )

1

( ) ( )

1

1( ) ( ) ( )21( ) ( )

2obs

Mobsn P I S

n

Mobsn

n

AIG H I E H IM

AIF H IM
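In code, the criterion compares marginal histograms. A sketch under the reconstruction above; the binning and the origin of the model samples are assumptions:

```python
import numpy as np

def marginal_hist(responses, bins):
    """Normalized histogram of one filter's responses on one image."""
    h, _ = np.histogram(responses, bins=bins, density=True)
    return h

def information_criterion(data_hists, model_hist):
    """IC = AIG - AIF for one candidate filter.

    data_hists: per-image histograms of the filter's responses (observed marginals)
    model_hist: histogram of responses on images sampled from the current model
    """
    mean_hist = np.mean(data_hists, axis=0)
    aig = 0.5 * np.mean([np.abs(h - model_hist).sum() for h in data_hists])
    aif = 0.5 * np.mean([np.abs(h - mean_hist).sum() for h in data_hists])
    return aig - aif
```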

42

Algorithm – Filter selection

[Diagram: for every filter in the bank, compute $IC^{(\alpha)}$; take $\alpha^* = \arg\max_\alpha IC^{(\alpha)}$; add the winning filter to the model and learn $\Lambda$]

44

Learning the potentials

[Diagram: initialize $\Lambda$, then iteratively calculate updates to $\Lambda$ for the model, using maximum entropy on p]

45

The bank of filters

• Filter types:
– Intensity filter (1×1)
– Isotropic filters: Laplacian of Gaussian (LG)
– Directional filters: Gabor (Gcos, Gsin)
• Computation at different scales: an image pyramid

[Figures: Laplacian of Gaussian, Gabor]

46

Running example of the algorithm – Experiment I

Use only small filters

47

Results

All learned potentials have a diffusive nature

$$p(I) = \frac{1}{Z}\exp\left(-\sum_{c=1}^{C}\sum_{k=1}^{K} \lambda^{(k)T} F^{(k)}\big(I^{(c)}\big)\right)$$

48

Running example of algorithmExperiment II

• Only gradient filters, in different scales• Small filters -> diffusive potential (as expected)• Surprisingly: Large filters -> reactive potentials

Diffusive Reactive

50

Examples of the synthesized images

Experiment I Experiment II

This image is more “natural” because it has some regions with sharp boundaries

51

Outline

• We have seen:
– MRF models
– Selection of filters from a bank
– Learning potentials
• Now:
– Data-driven filters
– Analytic results for simple potentials
– Making sense of the results
– Applications

52

Roth and Black’s approach

– Zhu & Mumford: filters chosen from a bank; potentials learned non-parametrically
– Roth & Black: filters learned from data; potentials learned parametrically, with filters and potentials learned together

53

Motivation – model of natural patches

• Why learn the filters from data?
• Inspiration from models of natural patches:
– Sparse coding
– Component analysis
– Product of experts

54

Motivation – Sparse Coding of patches

• Goal: find a set of filters $\{F_i\}$ such that

$$patch = \sum_{i=1}^{N} a_i F_i, \qquad a_i = \langle patch, F_i \rangle \ \text{are sparse}$$

• Learn the $F_i$ from a database of natural patches

• Only a few filters should fire on a given patch
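A hedged sketch of learning such filters with scikit-learn's dictionary learner; the patch size, component count, and the random stand-in for real patch data are all placeholder choices:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# rows should be flattened 8x8 natural-image patches (mean-subtracted);
# random data is only a placeholder here
patches = np.random.randn(5000, 64)

learner = MiniBatchDictionaryLearning(n_components=64, alpha=1.0)  # alpha: sparsity
learner.fit(patches)
filters = learner.components_.reshape(-1, 8, 8)   # one learned F_i per row
```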

55

Motivation – Component analysis

• Learn the $F_i$ by component analysis:
– PCA
– ICA
• Results in "filter-like" components:
– PCA: the first components look like contrast filters
– ICA: the components look like Gabor filters

56

PCA results

[Figure: PCA components, ordered from high to low variance]

57

ICA results

• Independent filters
• We can derive a model for patches:

$$P(x) = \prod_{i=1}^{n} p_i\big(F_i^T x\big)$$

where $F_i^T x$ is the response to filter $i$ and $p_i$ is its 1-D distribution.
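A corresponding sketch with FastICA: the unmixing rows play the role of the filters $F_i$, and a histogram of each row's responses gives an empirical $p_i$. The patch data is again a random placeholder:

```python
import numpy as np
from sklearn.decomposition import FastICA

patches = np.random.randn(5000, 64)        # placeholder for real patch data
ica = FastICA(n_components=32, max_iter=500)
ica.fit(patches)
F = ica.components_                        # on real data: Gabor-like filters

def log_p(x, bins=50):
    """log P(x) = sum_i log p_i(F_i^T x), each p_i estimated by a histogram."""
    total = 0.0
    for f in F:
        r = patches @ f                    # responses used to estimate p_i
        hist, edges = np.histogram(r, bins=bins, density=True)
        idx = np.clip(np.searchsorted(edges, f @ x) - 1, 0, bins - 1)
        total += np.log(hist[idx] + 1e-12)
    return total
```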

58

Motivation – Product of experts

• A more sophisticated model for natural patches:

$$p_{POE}(x; \Theta) = \frac{1}{Z(\Theta)} \prod_{i=1}^{K} \phi_{st}\big(F_i^T x; \alpha_i\big), \qquad \phi_{st}(z; \alpha) = \left(1 + \frac{z^2}{2}\right)^{-\alpha}$$

• Training by MLE => "intuitive" filters (contrast, texture)

59

Field of Experts (FOE)

• Extension of POE to FOE:

$$p_{FOE}(I; \Theta) = \frac{1}{Z(\Theta)} \prod_{c=1}^{C}\prod_{i=1}^{K} \phi_{st}\big(F_i^T I^{(c)}; \alpha_i\big)$$

$$E_{FOE}(I; \Theta) = -\sum_{c=1}^{C}\sum_{i=1}^{K} \log \phi_{st}\big(F_i^T I^{(c)}; \alpha_i\big)$$

Roth, S., Black, M. J.: Fields of Experts. IJCV, 2009

60

The experts

• Student-t experts:

$$\phi_{st}(z; \alpha_i) = \left(1 + \frac{z^2}{2}\right)^{-\alpha_i}$$
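The expert and the resulting FOE energy take only a few lines; the filters here are illustrative stand-ins (the learned Roth and Black filters are 5x5):

```python
import numpy as np
from scipy.signal import convolve2d

def student_t_expert(z, alpha):
    """phi_st(z; alpha) = (1 + z^2/2)^(-alpha)."""
    return (1.0 + 0.5 * z**2) ** (-alpha)

def foe_energy(I, filters, alphas):
    """E_FOE(I) = -sum over cliques and experts of log phi_st, via convolutions."""
    E = 0.0
    for F, a in zip(filters, alphas):
        z = convolve2d(I, F, mode='valid')        # all clique responses at once
        E += a * np.log(1.0 + 0.5 * z**2).sum()   # -log of the Student-t expert
    return E
```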

61

Meaning of $\alpha$

• A higher $\alpha$ means:
– High responses are punished more severely
– The filter gets a higher weight

$$p_{FOE}(I; F, \alpha) = \frac{1}{Z(F, \alpha)} \exp\left(-\sum_{c=1}^{C}\sum_{i=1}^{K} \alpha_i\, g\big(F_i^T I^{(c)}\big)\right), \qquad g(z) = \log\left(1 + \frac{z^2}{2}\right)$$

Learning the model

• Learning again minimizes $\widetilde{KL}$: fit the average $\log p(I; S, \Lambda)$ over model samples to that over the data
• The parameters $\alpha_{1 \ldots K}$ and the filters are initialized randomly
• Model expectations are estimated by MCMC sampling

65

Results of learning FOE

Filters aren’t “intuitive”

F

67

So far…

Zhu & Mumford: filters chosen from a bank; potentials learned non-parametrically
– small filters -> diffusive potentials, large filters -> reactive potentials (non-intuitive?)

68

So far…

Roth & Black: filters learned from a database; potentials learned parametrically
– the learned filters look non-intuitive?

69

What now?

• Revisiting POE and FOE with Gaussian potentials
• Relation to non-Gaussian potentials
• Making sense of the previous results

Weiss, Y., Freeman, W. T.: What Makes a Good Model of Natural Images? CVPR, 2007

70

Gaussian POE

$$p_{GPOE}(x; F) = \frac{1}{Z(F)} \exp\left(-\frac{1}{2}\sum_{i=1}^{K} \big(F_i^T x\big)^2\right)$$

$$-\ln p_{GPOE}(x; F) = \frac{1}{2}\sum_{i=1}^{K} \big(F_i^T x\big)^2 + \ln Z(F)$$

$$F^*_{ML} = \arg\min_F \;\left\langle \sum_{i=1}^{K} \big(F_i^T x\big)^2 \right\rangle + \ln Z(F)$$

71

Gaussian POE

• Claim: Z is constant for any set of K orthonormal vectors, so

$$F^*_{ML} = \arg\min_{F\ \text{orthonormal}} \;\left\langle \sum_{i=1}^{K} \big(F_i^T x\big)^2 \right\rangle$$

• This has an analytic solution: the K minor components of the data (the principal components with the smallest variance)

72

Results – example of learned filters

• Non-intuitive high-frequency filters
• Reminder: PCA [components ordered from high to low variance]

73

Gaussian FOE

$$p_{GFOE}(I; \{F_i\}) = \frac{1}{Z(\{F_i\})} \exp\left(-\frac{1}{2}\sum_{i=1}^{K}\sum_{c=1}^{C} \big(F_i^T I^{(c)}\big)^2\right)$$

$$\sum_{c=1}^{C} \big(F_i^T I^{(c)}\big)^2 = \sum_{(x,y)} \big(F_i * I\big)^2(x,y) = \sum_{\omega} \big|\hat{F}_i(\omega)\big|^2 \big|\hat{I}(\omega)\big|^2$$

so the model is a Gaussian whose precision at frequency $\omega$ is $\sum_{i=1}^{K} |\hat{F}_i(\omega)|^2$, and

$$\ln Z(\{F_i\}) = -\frac{1}{2}\sum_{\omega} \ln \sum_{i=1}^{K} \big|\hat{F}_i(\omega)\big|^2 + \text{const}$$

74

Gaussian FOE

$$F^*_{ML} = \arg\min_F \;\left\langle \sum_{i=1}^{K}\sum_{c=1}^{C} \big(F_i^T I^{(c)}\big)^2 \right\rangle + \ln Z(F)$$

In the Fourier domain this gives an analytic condition on the optimal filters:

$$\sum_{i=1}^{K} \big|\hat{F}^*_i(\omega)\big|^2 = \frac{1}{\left\langle |\hat{I}(\omega)|^2 \right\rangle}$$

75

Gaussian FOE

• The optimal $F^*_i$ satisfy:

$$\sum_{i=1}^{K} \big|\hat{F}^*_i(\omega)\big|^2 = \frac{1}{\left\langle |\hat{I}(\omega)|^2 \right\rangle}$$

• Natural images have most of their power $\langle|\hat{I}(\omega)|^2\rangle$ at low frequencies

=> Optimal filters have high frequencies
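This is easy to check numerically: natural images have power spectra concentrated at low frequencies, so $1/\langle|\hat{I}(\omega)|^2\rangle$ is largest at high frequencies. A sketch; the random images stand in for a real dataset (on white noise the effect disappears, since its spectrum is flat):

```python
import numpy as np

images = [np.random.randn(64, 64) for _ in range(100)]  # placeholder dataset

# average power spectrum over the dataset
mean_power = np.mean([np.abs(np.fft.fft2(I))**2 for I in images], axis=0)

# the ML-optimal GFOE filters satisfy sum_i |F_i(w)|^2 = 1 / mean_power(w),
# which is large exactly where the images have little power (high frequencies)
optimal_response = 1.0 / mean_power
```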

76

Gaussian Scale Mixture (GSM)

• Non-Gaussian potentials -> modeled by a GSM (a mixture of Gaussians with different scales)

• The properties of the GFOE hold for GSMs

77

Revisiting FOE

• The Student-t expert can be fit by a GSM
• So the learned filters have the whitening property

$$\sum_{i=1}^{K} \big|\hat{F}_i(\omega)\big|^2 \approx \frac{1}{\left\langle |\hat{I}(\omega)|^2 \right\rangle}$$

=> high-frequency filters

[Figures: power spectra of a natural image and of the Roth and Black filters]

78

Learning FOE with fixed filters

Algorithm prefers high-frequency filters

79

Conclusion

• For Gaussian potentials and GSMs: learning => high-frequency filters

• There is experimental evidence for this phenomenon
• Maybe there is a "logic" behind this non-intuitive result?

80

Making Sense of results

• A criterion for "good" filters for patches: they rarely fire on natural images, and fire frequently on all other images

[Figure: histograms of filter responses on patches from natural images vs. white noise]

81

Making Sense of results

• An image is modeled by what you don't expect to find in it

• This is satisfied by the classical prior of smooth gradients

• But why limit ourselves to intuitive filters? Maybe non-intuitive filters can do better…

82

Revisiting diffusive and reactive potentials

[Figures: response histograms of the diffusive and reactive filters on white noise vs. patches from natural images]

83

Inference

• We learned a model
• We can use it for inference problems:
– Corrupted information
– Missing information
• Exact inference: loopy BP
• Approximate inference: gradient-based optimization

84

Belief Propagation

• The observed data $y_i$ is incorporated into the model through an observation node attached to each pixel $x_i$

85

Belief Propagation

• A message-passing algorithm
• Exact only on tree MRFs
• Efficient only on pairwise MRFs

86

Alternative by Roth and Black

• Reminder:

$$I_{MAP} = \arg\min_I \; \underbrace{-\log P(\tilde{I} \mid I)}_{\text{uncertainty / noise model}} \; \underbrace{-\log P(I)}_{\text{learned model}}$$

• Approximate inference by gradient-based optimization:

$$I^{(t+1)} = I^{(t)} - \eta\, \nabla_I E\big(I, \tilde{I}\big)$$

• Advantage: low computational cost
• Drawback: finds only a local minimum if the problem is not convex

87

Partition function

• We get:

$$I_{MAP} = \arg\min_I \; -\log P(\tilde{I} \mid I) - \sum_{(x,y)}\sum_{i=1}^{n} \log \phi\big(F_i * I\,(x,y); \alpha_i\big) + \log Z(F, \alpha)$$

• $\log Z(F, \alpha)$ doesn't depend on $I$

=> No need to estimate the partition function

88

The gradient step

• How do we differentiate the second term?
• By a mathematical "trick" we get:

$$\nabla_I \sum_{(x,y)}\sum_{i=1}^{n} \log \phi\big(F_i * I\,(x,y); \alpha_i\big) = \sum_{i=1}^{N} F_i^{-} * \phi'\big(F_i * I; \alpha_i\big)$$

where $F_i^{-}$ is the filter $F_i$ mirrored about its center.
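A sketch of that gradient for Student-t experts; `mode='same'` and the mirroring via `F[::-1, ::-1]` are the convolution details I'd assume here, not values taken from the paper:

```python
import numpy as np
from scipy.signal import convolve2d

def prior_gradient(I, filters, alphas):
    """Gradient of -log prior: sum_i F_i^- * psi(F_i * I), F^- = mirrored filter."""
    grad = np.zeros_like(I)
    for F, a in zip(filters, alphas):
        z = convolve2d(I, F, mode='same')
        psi = a * z / (1.0 + 0.5 * z**2)   # derivative of alpha * log(1 + z^2/2)
        grad += convolve2d(psi, F[::-1, ::-1], mode='same')  # mirrored filter
    return grad
```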

89

De-noising

• Assume Gaussian noise:

$$P(\tilde{I} \mid I) \propto \exp\left(-\frac{1}{2\sigma^2}\big\|\tilde{I} - I\big\|^2\right)$$

• So the gradient step is:

$$I^{(t+1)} = I^{(t)} + \eta\left[\frac{\lambda}{\sigma^2}\big(\tilde{I} - I^{(t)}\big) + \sum_{i=1}^{N} F_i^{-} * \phi'\big(F_i * I^{(t)}; \alpha_i\big)\right]$$
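Putting the two terms together gives a short denoising loop, reusing `prior_gradient` from the sketch above; the step size, data weight, and iteration count are guesses, not the paper's values:

```python
def foe_denoise(y, filters, alphas, sigma=0.1, lam=1.0, step=0.02, n_iter=200):
    """Gradient descent on the FOE energy plus a Gaussian data term."""
    I = y.copy()
    for _ in range(n_iter):
        I -= step * (prior_gradient(I, filters, alphas) + (lam / sigma**2) * (I - y))
    return I
```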

90

Results

91

Results

92

Results

[Figures, with PSNR:]
• Original
• Noisy (20.29 dB)
• FOE (28.72 dB)
• Portilla, wavelets — state of the art (28.90 dB)
• Non-local means (28.21 dB)
• Standard non-linear diffusion — general prior (27.18 dB)

93

Results on the Berkeley database

[Plots: output PSNR vs. input PSNR, for low and high noise, comparing Wiener filter, non-linear diffusion, FOE, Portilla1, Portilla2]

94

How many 3x3 filters should we take?

[Plot: performance vs. number of 3×3 filters]

Performance starts saturating when we reach 8 filters

95

Dependence on size and shape of clique

What is the best filter?

97

Inpainting – Reminder

$$y = Ax + n$$

[Figures: the observed Y with a mask, the unknown X]

Problem: pixels outside the mask can change

Solution: constrain them

Inpainting

• Assume pixels outside the mask M don't change

• So the gradient step is:

$$I^{(t+1)} = I^{(t)} + \eta\, M \cdot \left[\sum_{i=1}^{N} F_i^{-} * \phi'\big(F_i * I^{(t)}; \alpha_i\big)\right]$$
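The inpainting loop is the same descent restricted by the mask, again reusing `prior_gradient` from the gradient-step sketch; here `M` is assumed to be 1 at missing pixels and 0 elsewhere:

```python
def foe_inpaint(y, M, filters, alphas, step=0.02, n_iter=500):
    """Gradient descent on the prior alone; only masked (missing) pixels change."""
    I = y.copy()
    for _ in range(n_iter):
        I -= step * M * prior_gradient(I, filters, alphas)
    return I
```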


[Figures: the 0-1 mask, the image we want to inpaint]

98

99

Results

100

Results

101

Results

[Figures: FOE result, Bertalmio result]

        FOE        Bertalmio
PSNR    29.06 dB   27.56 dB
SSIM    0.9371     0.9167

102

Pro’s and Con’s

• Perform well on narrow straws or small holes (even if they cover most of the image)

• Isn’t able to fill large holes• Isn’t designed to handle textures

103

Thank you for Listening…