Variational Bayesian Image Processing on Stochastic Factor Graphs


Xin Li

Lane Dept. of CSEE

West Virginia University

Outline

Statistical modeling of natural images
    From old-fashioned local models to newly-proposed nonlocal models

Factor graph based image modeling
    A powerful framework unifying local and nonlocal approaches

EM-based inference on stochastic factor graphs

Applications and experimental results
    Denoising, inpainting, interpolation, post-processing, inverse halftoning, deblurring ...


Cast Signal/Image Processing Under a Bayesian Framework

Image restoration (Besag et al. '1991)

Image denoising (Simoncelli & Adelson '1996)

Interpolation (Mackay '1992) and super-resolution (Schultz & Stevenson '1996)

Inverse halftoning (Wong '1995)

Image segmentation (Bouman & Shapiro '1994)

$$p(\mathbf{x} \mid \mathbf{y}, H) = \frac{p(\mathbf{y} \mid \mathbf{x}, H)\, p(\mathbf{x} \mid H)}{p(\mathbf{y} \mid H)}$$

x: Unobservable data

y: Observation data

Image prior p(x|H): the focus of this talk

Likelihood p(y|x,H): varies from application to application
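As a hedged illustration of how the prior and the likelihood are typically combined, the maximum a posteriori (MAP) estimate follows from Bayes' rule because the evidence p(y|H) does not depend on x. The Gaussian likelihood in the second line is an assumption for the denoising case (y = x + w with white noise of standard deviation σw); it is not a formula taken from the slide.

```latex
\begin{align}
\hat{\mathbf{x}}_{\mathrm{MAP}}
  &= \arg\max_{\mathbf{x}} \; p(\mathbf{x} \mid \mathbf{y}, H)
   = \arg\max_{\mathbf{x}} \; \bigl[ \log p(\mathbf{y} \mid \mathbf{x}, H) + \log p(\mathbf{x} \mid H) \bigr] \\
\log p(\mathbf{y} \mid \mathbf{x}, H)
  &= -\frac{\lVert \mathbf{y} - \mathbf{x} \rVert_2^2}{2\sigma_w^2} + \text{const}
  \qquad \text{(assumed: additive white Gaussian noise)}
\end{align}
```

Changing the application (inpainting, deblurring, inverse halftoning) changes only the first term; the prior term stays the same.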

Xin Li

This talk is more about the likelihood term than the image prior (since I am just using BM3D to regularize the reconstruction).

Statistical Modeling of Natural Images: the Pursuit of a Good Prior

Local models
    Markov Random Field (MRF) and its extensions (e.g., 2D Kalman-filtering, Field-of-Expert)
    Sparsity-based: DCT, wavelets, steerable pyramids, geometric wavelets (edgelets, curvelets, ridgelets, bandelets)

Nonlocal models
    Bilateral filtering (Tomasi et al. ICCV'1998)
    Texture synthesis (Efros & Leung ICCV'1999)
    Exemplar-based inpainting (Criminisi et al. TIP'2004)
    Nonlocal mean denoising (Buades et al. CVPR'2005)
    Total Least-Square denoising (Hirakawa & Parks TIP'2006)
    Block-matching 3D denoising (Dabov et al. TIP'2007)

Xin Li

Although not a common view, it is possible to interpret various image models under a patch-based framework. The main difference between local and nonlocal models lies in the Markovian assumption they make: does it hold in the domain or in the range? Such range-domain duality is the basis for bilateral filtering (arguably the first nonlocal model).

Introducing a New Language of Factor Graphs

Why Factor Graphs?
    The most general form of graphical probability models (both MRF and Bayesian networks can be converted to FGs)
    Widely used in computer science and engineering (forward-backward algorithm, Viterbi algorithm, turbo decoding algorithm, Pearl's belief propagation algorithm, Kalman filter)¹

What is a Factor Graph?
    A bipartite graph that expresses which variables are arguments of which local functions
    Factor/function nodes (solid squares) vs. variable nodes (empty circles)

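[Figure: example factor graph with variable nodes B1-B8 and factor nodes f1-f4. The graph structure is a mapping L: F → V from factor nodes to the variable nodes they connect; the index sets shown are {1,2,4}, {3,6}, {5,7}, {7,8}.]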

¹ Kschischang, F.R.; Frey, B.J.; Loeliger, H.-A., "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498-519, Feb 2001
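A minimal sketch of the bipartite structure described on this slide, written as a plain Python mapping from factor nodes to variable-node indices. The pairing of each index set with f1..f4 follows the order in which the sets are listed in the figure and is an assumption; the helper names are hypothetical.

```python
# Assumed encoding of the example: factor node -> indices of the variable
# nodes (patches B1..B8) it is connected to. The pairing of index sets with
# f1..f4 follows the listing order and is an assumption.
L = {"f1": (1, 2, 4), "f2": (3, 6), "f3": (5, 7), "f4": (7, 8)}


def edges(structure):
    """List every (factor, variable) edge of the bipartite graph."""
    return [(f, v) for f, vs in structure.items() for v in vs]


def factors_of(structure, var):
    """Return the factor nodes adjacent to variable node `var`."""
    return sorted(f for f, vs in structure.items() if var in vs)


edge_list = edges(L)
shared = factors_of(L, 7)  # B7 is an argument of two local functions
```

Every edge joins one factor node to one variable node, which is what makes the graph bipartite; a variable such as B7 that appears in two index sets is shared by two local functions.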

Xin Li

It might be fair to mention you guys' ICIP 2007 work, though the target application is different (I did not do anything like stochastic approximation here).

Variable Nodes = Image Patches

Neuroscience: receptive fields of neighboring cells in human vision system have severe overlapping

Engineering: patch has been under the disguise of many different names such as windows in digital filters, blocks in JPEG and the support of wavelet bases

Cited from D. Hubel, “Eye, Brain and Vision”, 1988
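To make the "overlapping windows" idea concrete, here is a minimal sketch of cutting an image into overlapping patches. The patch size of 8 and stride of 4 are illustrative assumptions, not values taken from the talk, and `extract_patches` is a hypothetical helper name.

```python
import numpy as np


def extract_patches(image, size=8, stride=4):
    """Collect overlapping size x size patches and their top-left corners."""
    height, width = image.shape
    patches, corners = [], []
    for r in range(0, height - size + 1, stride):
        for c in range(0, width - size + 1, stride):
            patches.append(image[r:r + size, c:c + size])
            corners.append((r, c))
    return np.stack(patches), corners


# Toy 16 x 16 "image"; a stride smaller than the patch size gives overlap.
image = np.arange(16 * 16, dtype=float).reshape(16, 16)
patches, corners = extract_patches(image)
```

Because the stride is smaller than the patch size, neighboring patches share pixels, in the same spirit as the overlapping receptive fields mentioned on the slide.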

Xin Li

This slide describes the scientific basis for using patches as the units of image modeling: the human vision system processes stimuli through overlapping receptive fields. It also covers the engineering concept of a patch, which has appeared in many different forms.

Factorization: the Art of Statistical Image Modeling

Wavelet-based statistical models (geometric proximity defines the neighborhood)

Locally linear embedding¹ (perceptual similarity defines the neighborhood)

SP ML

Domain-Markovian

Range-Markovian

¹ S.T. Roweis and L.K. Saul, "Nonlinear Dimensionality Reduction by Locally Linear Embedding," Science 290 (5500), 2323, 22 December 2000.

Unification Using Factor Graphs

[Figure: three factor graphs over patches B0-B3 with factor nodes f1-f4: naive Bayesian (DCT/wavelet-based models), MRF-based, and kNN/kmeans clustering (nonlocal image models)]
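As a rough sketch of how a nonlocal factor could be formed by kNN grouping, the snippet below picks the k patches closest to a reference patch B0 in Euclidean distance. The distance measure, the value of k and the function name `knn_group` are assumptions for illustration only; the talk does not specify them here.

```python
import numpy as np


def knn_group(patches, ref_index, k):
    """Indices of the reference patch followed by its k nearest patches."""
    flat = patches.reshape(len(patches), -1)
    dist = np.sum((flat - flat[ref_index]) ** 2, axis=1)
    dist[ref_index] = -1.0  # force the reference patch (B0) to come first
    order = np.argsort(dist, kind="stable")
    return order[: k + 1]


# Five constant 4 x 4 patches with intensities 0, 10, 1, 9, 2.
values = np.array([0.0, 10.0, 1.0, 9.0, 2.0])
patches = values[:, None, None] * np.ones((5, 4, 4))
group = knn_group(patches, ref_index=0, k=2)
```

The returned indices play the role of B0, B1, ..., Bk attached to one factor node: neighbors are chosen by similarity of content rather than by spatial proximity.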

A Manifold Interpretation of Nonlocal Image Prior

[Figure: patches B0, B1, ..., Bk on a manifold M in R^N, with D = [B0 B1] and D' = [B'0 B'1], and the corresponding patches B'0, B'1, ..., B'k]

How to maximize the sparsity of a representation?

Conventional wisdom: adapt basis to signal (e.g., basis pursuit, matching pursuit)

New proposal: adapt signal to basis (by probing its underlying organization principle)

Organizing Principle: Latent Variable L

[Figure: x and y linked by P(y|x), covering image denoising, image inpainting, image coding, image halftoning and image deblurring; the latent variable L connects patches B11-B44 to factor nodes fA, fB and fC]

$$p(\mathbf{x}) = \prod_{f_j \in F} f_j(D_j), \qquad D_j = [B_{i(0)} \; B_{i(1)} \; \cdots \; B_{i(k)}]$$

Sparsifying transform: applied to each D_j

“Nature is not economical of structures but organizing principles.” - Stanislaw M. Ulam

Maximum-Likelihood Estimation of Graph Structure L

[Flowchart: P(y|x); update the estimate of L; loop over every factor node fj (pack B0, B1, ..., Bk into 3D array D; forward transform; coring; inverse transform; unpack into 2D patch estimates B̂0, B̂1, ..., B̂k); update the estimate of x]

For a variational interpretation of such EM-based inference on FGs, the reader is referred to the paper.
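The per-factor loop body (pack, forward transform, coring, inverse transform, unpack) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the transform is taken to be a separable orthonormal DCT, coring is taken to be hard thresholding, and the threshold value and all function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0, :] /= np.sqrt(2.0)
    return c


def transform_3d(d, inverse=False):
    """Apply the separable DCT (or its inverse) along all three axes."""
    out = d
    for axis in range(3):
        c = dct_matrix(d.shape[axis])
        if inverse:
            c = c.T  # the matrix is orthonormal, so its transpose inverts it
        out = np.moveaxis(np.tensordot(c, out, axes=([1], [axis])), 0, axis)
    return out


def filter_group(group, threshold):
    """Pack -> forward transform -> coring -> inverse transform -> unpack."""
    d = np.stack(group, axis=0)                # pack patches into a 3D array D
    coeff = transform_3d(d)                    # forward transform
    coeff[np.abs(coeff) < threshold] = 0.0     # coring (assumed: hard threshold)
    d_hat = transform_3d(coeff, inverse=True)  # inverse transform
    return [d_hat[i] for i in range(d_hat.shape[0])]  # unpack into 2D patches


rng = np.random.default_rng(0)
group = [rng.standard_normal((4, 4)) for _ in range(3)]
restored = filter_group(group, threshold=0.0)  # zero threshold: nothing is cored
```

With a zero threshold the step is the identity, which is a convenient sanity check; for a group of similar patches most transform coefficients are small, so a positive threshold removes them while keeping the shared structure.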

Problem 1: Image Denoising

PSNR (dB) PERFORMANCE COMPARISON AMONG DIFFERENT SCHEMES FOR 12 TEST IMAGES AT σw = 100

SSIM PERFORMANCE COMPARISON AMONG DIFFERENT SCHEMES FOR 12 TEST IMAGES AT σw = 100

[Figure: BM3D (kNN, iter=2) vs. SFG (kmeans, iter=20); panels labeled org., 200, 400, 600, 800, 1000 (σw)]
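Since the results in this and the following problems are reported as PSNR in dB, a minimal sketch of the metric may help. The peak value of 255 assumes 8-bit images, which the slides do not state explicitly; `psnr` is a hypothetical helper name.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit images."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


clean = np.zeros((8, 8))
off_by_one = clean + 1.0  # every pixel differs by one gray level
value = psnr(clean, off_by_one)
```

A gain of 1 dB, as in several of the comparisons reported in this talk, corresponds to roughly a 20% reduction in mean squared error.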

Problem 2: Image Recovery

[Figure: recovery results for DCT, FoE, EXP, BM3D, LSP and SFG, with PSNR (dB) and SSIM performance comparisons; top-down: test1, test3, test5 and top-down: test2, test4, test6]

Local models: DCT, FoE and LSP

Nonlocal models: EXP, BM3D¹ and SFG

¹ Our own extension into image recovery

Problem 3: Resolution Enhancement

[Figure: four test images; columns x, y, bicubic, NEDI¹, FG, with the values below under the last three columns]

bicubic    NEDI¹      FG
28.70dB    27.34dB    28.19dB
31.76dB    32.36dB    32.63dB
34.71dB    34.45dB    37.35dB
18.81dB    15.37dB    16.45dB

¹ X. Li and M. Orchard, “New edge directed interpolation”, IEEE TIP, 2001

Xin Li

This slide leads to the motivating observation about the limitation of uniform sampling: despite severe aliasing in the last example, the reconstructed image is visually very convincing.

Problem 4: Irregular Interpolation

[Figure: four test images with 25% of the samples kept; columns x, y, DT, KR, FG¹, with the values below under the last three columns]

DT         KR         FG¹
29.06dB    31.56dB    34.96dB
28.46dB    31.16dB    36.51dB
17.90dB    18.49dB    29.25dB
26.04dB    24.63dB    29.91dB

DT: Delaunay triangulation-based (griddata under MATLAB)

KR: kernel regression-based (Takeda et al., IEEE TIP 2007, w/o parameter optimization)

¹ X. Li, “Patch-based image interpolation: algorithms and applications,” Inter. Workshop on Local and Non-Local Approximation (LNLA)’2008

Xin Li

Nothing new here; this just confirms that nonuniform sampling could work better with our reconstruction algorithm.

Problem 5: Post-processing

JPEG-decoded at rate of 0.32bpp (PSNR=32.07dB)

SFG-enhanced at rate of 0.32bpp (PSNR=33.22dB)

SPIHT-decoded at rate of 0.20bpp (PSNR=26.18dB)

SFG-enhanced at rate of 0.20bpp (PSNR=27.33dB)

Maximum-Likelihood (ML) Decoding

Maximum a Posteriori (MAP) Decoding

Problem 6: Inverse Halftoning

without nonlocal prior¹ (PSNR=31.84dB, SSIM=0.8390)

with nonlocal prior (PSNR=32.82dB, SSIM=0.8515)

¹ Available from Image Halftoning Toolbox released by UT-Austin Researchers

Conclusions and Perspectives

Despite the rich structures in natural images, the underlying organization principle is simple (self-similarity)
    We have shown how similarity can lead to sparsity in a nonlinear representation of images
    FG only represents one mathematical language for interpreting such principle (multifractal formalism is another)

Image processing (low-level vision) could benefit from data clustering (higher-level vision): how does human visual cortex learn to decode the latent variable L through unsupervised learning?

Reproducible Research: MATLAB codes accompanying this work are available at http://www.csee.wvu.edu/~xinl/sfg.html (more will be added)

Xin Li

I don't know how many in the audience know Prof. Kohonen in person. But to me, he is one of my idols - a true pioneer in the field of neural networks.
