Expectation-maximization algorithms for image processing using multiscale models and mean-field theory, with applications to laser radar range profiling and segmentation

Andy Tsai
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, Massachusetts 02139
E-mail: [email protected]

Jun Zhang
Department of Electrical Engineering and Computer Science
University of Wisconsin-Milwaukee
Milwaukee, Wisconsin 53201
E-mail: [email protected]

Alan S. Willsky
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, Massachusetts 02139
E-mail: [email protected]

Abstract. We describe a new class of computationally efficient algorithms designed to solve incomplete-data problems frequently encountered in image processing and computer vision. The basis of this framework is the marriage of the expectation-maximization (EM) procedure with two powerful methodologies. In particular, we have incorporated optimal multiscale estimators into the EM procedure to compute estimates and error statistics efficiently. In addition, mean-field theory (MFT) from statistical mechanics is incorporated into the EM procedure to help solve the computational problems that arise from our use of Markov random-field (MRF) modeling of the hidden data in the EM formulation. We have applied this algorithmic framework and shown that it is effective in solving a wide variety of image-processing and computer-vision problems. We demonstrate the application of our algorithmic framework to solve the problem of simultaneous anomaly detection, segmentation, and object profile estimation for noisy and speckled laser radar range images. © 2001 Society of Photo-Optical Instrumentation Engineers. [DOI: 10.1117/1.1385168]

Subject terms: EM algorithm; mean-field theory; multiscale models; laser radar; image processing.

    Paper 200230 received June 8, 2000; accepted for publication Feb. 19, 2001.

    1 Introduction

Estimation problems for 2-D random fields arise in contexts ranging from image processing to remote sensing. In some special cases (most notably spatially stationary statistics over a rectangular grid with regularly spaced measurements), very efficient algorithms based on the fast Fourier transform (FFT) can be used to obtain optimal estimates and error statistics. However, when one strays from these cases, the complexity of optimal solutions for many popular classes of stochastic models can become prohibitive, with computational loads that do not scale well with problem size.

One approach that overcomes many of these problems involves the use of multiscale stochastic models1,2 and associated multiscale estimation algorithms. These algorithms do not require stationarity, regularly shaped domains, or regularly spaced measurements. Moreover, these models have been shown to capture a rich class of random-field statistical behavior, making them useful for a number of important applications.

In this paper we extend the applicability of this multiscale framework by marrying it with two other computationally powerful techniques, namely the expectation-maximization (EM) algorithm and mean-field theory (MFT). The result is a methodology that can be applied to a greatly expanded range of applications, including texture segmentation, simultaneous classification and gain correction of magnetic resonance imaging (MRI) brain scans,3 and a relaxed version of the Mumford-Shah variational approach to image segmentation. A number of these applications are described in some detail in Ref. 4, and Fig. 1 depicts an example of the application to a texture segmentation problem.

In particular, as described in Ref. 5, each of the individual textures shown in Fig. 1 can be well modeled with multiscale models, allowing us to apply efficient multiscale estimation procedures to problems such as restoring noise-corrupted versions of any one of these textures. However, the multiresolution algorithm by itself cannot solve the problem of restoring the image shown in Fig. 1, as it consists of several different textures, nor can it directly solve the problem of segmentation of such an image. However, by employing the EM concept of hidden data (in this case, the segmentation map indicating which of the two wood textures is visible in each pixel), we can indeed employ multiscale estimation as a core component of an algorithm that produced the results (restoration, segmentation, and associated error variances) shown in Fig. 1.

Figure 2 depicts the conceptual structure of our methodology. Viewing the EM algorithm as the central structure of an iterative procedure, we use multiscale estimation to solve the so-called M step of the procedure and MFT to provide an efficient approximation to the E step. In the next section we briefly describe the background topics that form the components of the triad in Fig. 2. In Secs. 3 and 4 we then develop our algorithm in the context of an important


application, namely laser radar (ladar) image processing, target segmentation, and range profiling. We have chosen this application for several reasons. First, as we will see, the employment of our framework in this context involves more than just segmentation, and, as a result, using it as a vehicle illustrates both the breadth of problems to which it can be applied and how it is applied. Secondly, this particular application is of significant current interest, and, indeed, our results represent a substantial extension of previous work on ladar anomaly rejection and background range-plane profiling. Following the development of our approach, we present results in Sec. 5 demonstrating its efficacy in simultaneously rejecting anomalies, profiling nonplanar backgrounds, detecting and segmenting targets, and providing range profiles of those targets. The paper then closes with a brief discussion and conclusion.

    2 Background

As we have indicated, our framework involves the synthesis of three distinct methodologies for statistical inference, and in this section we provide concise summaries of each of these. More complete descriptions of these topics can be found in the references (e.g., Refs. 1, 2, and 5-8).

    2.1 The EM Procedure

The EM procedure is a powerful iterative technique suited for calculating maximum-likelihood estimates (MLEs) in problems where the observation can be viewed as incomplete data. The MLE of X, denoted as X_{ML}, based on the incomplete observed data Y, is defined as

X_{ML} = \arg\max_X \log p(Y \mid X),     (1)

where \log p(Y \mid X) is the log likelihood of Y given X. In many applications, calculating the MLE is difficult because the log-likelihood function is highly nonlinear and not easily maximized. To overcome these difficulties, the EM algorithm introduces an auxiliary function Q (along with some auxiliary random variables) that has the same behavior as the log-likelihood function (in that when the log-likelihood function increases, so does the auxiliary function) but is much easier to maximize.

Central to the EM method is the judicious introduction of an auxiliary random quantity W with log likelihood \log p(W \mid X). The data W is referred to as the complete data because it is more informative than Y. The complete data W is not observed directly, but indirectly through Y via the relationship Y = G(W), where G is a many-to-one mapping function. Those unobserved variables are referred to in the EM formulation as the hidden data and denoted by H. The EM approach calculates X_{ML} through an iterative procedure in which the next iteration's estimate of X is chosen to maximize the expectation of \log p(W \mid X) given the incomplete data Y and the current iteration's estimate of X. The iteration consists of two steps:

The E step computes the auxiliary function

Q(X \mid X^{[k]}) = \langle \log p(W \mid X) \mid Y, X^{[k]} \rangle,     (2)

where \langle \cdot \rangle represents expectation, and X^{[k]} is the estimate of X from the kth iteration. Often, this step reduces to calculating the expected value of H given Y and X^{[k]}.

The M step finds X^{[k+1]} such that

X^{[k+1]} = \arg\max_X Q(X \mid X^{[k]}).

The maximization here is with respect to X, the first argument of the function Q. Intuitively, the M step is designed to use the expected value of the hidden data H found in the E step as if it were measured data in order to obtain the ML estimate of X. The EM algorithm can easily be extended to a maximum a posteriori (MAP) estimator by imposing a prior model, p(X), on the estimated quantity X during this step.

With this modification, the M step finds the X^{[k+1]} such that

X^{[k+1]} = \arg\max_X \{ Q(X \mid X^{[k]}) + \log p(X) \}.     (3)

Fig. 1 Motivational example: texture image segmentation and estimation.

Fig. 2 Conceptual representation of the algorithmic framework.

Since the EM algorithm is iterative, initial conditions must be given to start it. As it is guaranteed to converge only to a local maximum of the likelihood function, choosing initial conditions requires some care. The E and the M steps are evaluated repeatedly until convergence, typically specified as when \|X^{[k+1]} - X^{[k]}\| \le \epsilon for some small \epsilon > 0.
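To make the E/M alternation and the stopping rule above concrete, the following minimal sketch (illustrative only; the e_step and m_step callables are hypothetical stand-ins for the problem-specific computations developed in Secs. 4.1 and 4.2) shows the generic EM loop.

    import numpy as np

    def run_em(Y, X0, e_step, m_step, eps=1e-3, max_iter=100):
        """Generic EM loop: alternate E and M steps until the estimate stabilizes.

        e_step(Y, X) should return the expected hidden data <H | Y, X>,
        m_step(Y, H_mean) should return the updated estimate of X.
        Both are problem-specific and supplied by the caller.
        """
        X = X0
        for k in range(max_iter):
            H_mean = e_step(Y, X)               # E step: expectation of hidden data
            X_new = m_step(Y, H_mean)           # M step: maximize Q (plus log-prior for MAP)
            if np.linalg.norm(X_new - X) <= eps:   # convergence test ||X[k+1] - X[k]|| <= eps
                return X_new, k + 1
            X = X_new
        return X, max_iter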

    2.2 Mean-Field Theory

MFT, which provides a computationally efficient procedure for approximate calculation of expectations of large Markov random fields (MRFs), has its origins in statistical mechanics and is concerned with finding the mean field energy of an MRF. In particular, let L denote a two-dimensional rectangular lattice. Let u = \{u_i, i \in L\} be an MRF taking on the well-known Gibbs distribution (see, e.g., Ref. 8) given by

p(u) = \frac{\exp[-\beta U(u)]}{Z}.

The constant \beta > 0 is used to adjust the strength of the energy function U(u), defined as

U(u) = \sum_{clq} V_{clq}(u),

where the V_{clq}(u) are the clique potentials used to capture the interactions between pixels. The normalization constant Z is defined as

Z = \sum_u \exp[-\beta U(u)].

For simplicity of exposition, pixel interactions are assumed to be at most pairwise. The energy function can now be written as

U(u) = \sum_{i \in L} \left[ V_{clq}(u_i) + \frac{1}{2} \sum_{l \in N_i} V_{clq}(u_i, u_l) \right],     (4)

where the singleton clique potential V_{clq}(\cdot) and the doubleton clique potential V_{clq}(\cdot, \cdot) are the clique potentials for a single site and a pair of neighboring sites (horizontal or vertical), respectively. Finally, as a result of our simplifying assumption, N_i denotes the set of first-order neighbors of pixel i. MFT is used to compute an approximation to the mean

of the field u, i.e., to find

\langle u_i \rangle = \sum_u u_i\, p(u) = \sum_{u_i} u_i \sum_{u_{L\setminus i}} p(u) = \sum_{u_i} u_i\, \frac{\sum_{u_{L\setminus i}} \exp[-\beta U(u)]}{Z}     (5)

for each i \in L (where L \setminus i denotes the set of all pixels excluding i). It is clear from (5) that a large number of configurations associated with the energy function need to be evaluated in order to calculate this expectation, thus making its precise calculation both complex and computationally impractical. To overcome this, MFT approximates the weight on each possible value of u_i in the rightmost expression of Eq. (5). In particular, for each site i and each possible value of u_i, we approximate this weight by replacing the values of each of the u_j, j \ne i, by its mean value. That is, we define the mean field energy at site i as

U(u)\big|_{u_j = \langle u_j \rangle,\, j \ne i} = U_i^{mf}(u_i) + R_i^{mf}(\langle u_{L\setminus i} \rangle),     (6)

where U_i^{mf}(u_i) includes all the clique potentials involving the value at site i and is given by

U_i^{mf}(u_i) = V_{clq}(u_i) + \sum_{l \in N_i} V_{clq}(u_i, \langle u_l \rangle),     (7)

and R_i^{mf}(\langle u_{L\setminus i} \rangle) consists of terms involving only means at sites other than i. Further, we use the mean-field approximation for the mean at each site:

\langle u_j \rangle \approx \frac{\sum_{u_j} u_j \exp[-\beta U_j^{mf}(u_j)]}{Z_j^{mf}},     (8)

Z_j^{mf} = \sum_{u_j} \exp[-\beta U_j^{mf}(u_j)].

The MFT formulation supposes that for a particular pixel i \in L, the influence of the energy field from other pixels can be approximated by that of their statistical means. Random fluctuations from the other pixels are neglected. Note that since U_i^{mf}(u_i) depends on the means \langle u_j \rangle, j \ne i, the approximation to \langle u_i \rangle in Eq. (8) depends on the means at other pixels. Thus the set of equations corresponding to Eq. (8) for all i \in L represents a set of simultaneous algebraic equations for all of the \langle u_i \rangle. Iterative algorithms can then be used to find the solution to this set of equations.
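As a concrete illustration of the fixed-point iteration implied by Eq. (8), the sketch below applies the mean-field update to a toy binary (Ising-like) MRF; the potentials, parameter values, and function name are our own assumptions and are unrelated to the ladar model developed later.

    import numpy as np

    def mean_field_ising(beta=1.0, J=1.0, h=0.1, shape=(32, 32), n_iter=50):
        """Fixed-point iteration of Eq. (8) for a toy binary MRF with
        V_clq(u_i) = -h*u_i and V_clq(u_i, u_l) = -J*u_i*u_l, u_i in {-1, +1}.
        This only illustrates the mean-field update, not the ladar model."""
        m = np.zeros(shape)                       # current mean field <u_i>
        for _ in range(n_iter):
            # sum of neighboring means (first-order neighborhood, zero-padded borders)
            nbr = np.zeros(shape)
            nbr[1:, :] += m[:-1, :]
            nbr[:-1, :] += m[1:, :]
            nbr[:, 1:] += m[:, :-1]
            nbr[:, :-1] += m[:, 1:]
            # U_i^mf(u_i) = -(h + J*nbr)*u_i; summing exp(-beta*U) over u_i = +/-1
            # gives the closed-form update <u_i> = tanh(beta*(h + J*nbr)).
            m = np.tanh(beta * (h + J * nbr))     # Jacobi-style simultaneous update
        return m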

2.3 Multiscale Stochastic Models and Optimal Estimation

In this subsection, we briefly review the multiscale statistical framework for modeling and estimating 2-D random fields.1,2,5 Multiscale stochastic models for image processing are defined on a pyramidal data structure such as the quadtree shown in Fig. 3. Each level of the tree corresponds

    Fig. 3 An example of a multiscale quadtree.


to a different scale of resolution in the representation of the random field of interest. Define the index s to specify the nodes on the tree, s\bar{\gamma} to represent the parent of node s, and s\alpha_i to denote the q offspring of node s, with i = 1, \ldots, q (q = 4 for a quadtree). The scale of node s is denoted as m(s) and is numbered sequentially from coarse to fine resolution. A multiscale model with state X(s) \in R^n is specified by a recursion of the form

X(s) = A(s) X(s\bar{\gamma}) + B(s) W(s)     (9)

with X(0) \sim N(0, P_o), W(s) \sim N(0, I), and W(s) \in R^m. The notation N(\mu, \Lambda) represents a Gaussian random vector x with mean \mu and variance \Lambda. The matrices A(s) and B(s) are the parameters of the tree model, appropriately chosen to represent the random field of interest. The root node of the tree X(0) with prior covariance P_o provides the initial condition for starting the coarse-to-fine recursion. The driving white noise W(s) is independent of the initial condition X(0). The states at a given scale can be thought of as information about the process at that level of the tree. As X(s) evolves from coarse to fine scale, the term A(s)X(s\bar{\gamma}) predicts the states of the process for the next-finer scale, while the term B(s)W(s) adds new details to these states.

The measurement model associated with this multiscale stochastic model is

Y(s) = C(s) X(s) + V(s)     (10)

with V(s) \sim N(0, R(s)), R(s) the covariance of V(s), and C(s) the measurement matrix used to describe the nature of the measurement process. The general multiscale estimation framework allows easy and direct fusion of measurements at all scales (see, for example, Ref. 9). For the application on which we focus here, all measurements will be scalar observations at the finest scale, representing point measurements of the random field of interest.

The algorithm for optimal multiscale estimation consists of two steps: a fine-to-coarse sweep (which resembles the time-series Kalman filtering algorithm) followed by a coarse-to-fine sweep (which is analogous to the Rauch-Tung-Striebel smoothing algorithm). Among the most important properties of this algorithm is that it produces both optimal estimates and the covariance of the errors in these estimates with total complexity that is O(k^3 n), where n is the number of fine-scale pixels and k is the dimension of the state X(s). Thus the computational load for this algorithm grows linearly with problem size. This should be contrasted with the complexity of other estimation approaches (e.g., those based on MRF models), which have complexity that is O(n^\alpha) for values of \alpha > 1 for calculation of the estimates, and typically even greater complexity (often prohibitively high) for the computation of error covariances. Of course, the utility of the multiscale approach also depends on having models with modest state dimension k. Demonstration that such models do exist for a wide variety of random fields can be found in Refs. 5, 10, and 11. Of specific relevance to this paper are the multiscale models developed in Ref. 10 corresponding to so-called thin-plate and thin-membrane random surfaces. A brief summary of these models is presented in Sec. 7.1.
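The coarse-to-fine recursion of Eq. (9) is easy to simulate. The sketch below does so for a scalar state on a quadtree with constant, illustrative values of A and B and a 2^{-m/2} noise scaling; the structured thin-plate and thin-membrane models of Sec. 7.1 replace these with the matrices of Eq. (33).

    import numpy as np

    def simulate_quadtree(n_scales=6, A=1.0, B=1.0, rng=None):
        """Simulate the coarse-to-fine recursion X(s) = A*X(parent) + B*W(s), Eq. (9),
        with a scalar state on a quadtree. A and B are illustrative constants here;
        the surface models of Sec. 7.1 use structured A(s), B(s)."""
        rng = np.random.default_rng(rng)
        x = np.array([[rng.normal()]])                    # root node X(0) ~ N(0, 1)
        for m in range(1, n_scales):
            parent = np.kron(x, np.ones((2, 2)))          # each node has q = 4 children
            w = rng.normal(size=parent.shape)             # driving white noise W(s)
            x = A * parent + B * 2 ** (-m / 2) * w        # noise shrinks with scale
        return x                                          # finest-scale field

    field = simulate_quadtree()
    print(field.shape)   # (32, 32) for 6 scales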

3 Probabilistic Models for Laser Radar Range Profiling and Segmentation

As illustrated in Fig. 4, and as described in detail in Refs. 12 and 13, raster-scanned and range-resolved ladar imaging provides the capability for producing 3-D images of ground-based objects from airborne platforms using a down-looking geometry. Utilization of such images (e.g., for automatic target recognition, ATR) involves several basic functions. First, of course, is dealing with the uncertainties and sources of error in the ladar measurements. As described in Ref. 12, these errors consist of relatively small perturbations due to noise and occasional large anomalies due to deep speckle fades. Since the frequency of anomalies can be significant, taking account of them is a serious issue. Second, accurate localization and profiling of any target present in a ladar range image requires estimation of the unknown and spatially varying range of the ground. Detection and profiling of the target then requires segmenting out the region over which the range measurements stand out from the background, indicating the presence of the target.

As a first-generation preprocessor for an ATR system, Green and Shapiro12 developed an algorithm that estimates the background range profile for a ladar range image corrupted by range anomalies. Only the problem of estimating the background range profile is addressed in their algorithm. The idea is that pixels that do not belong to the background are grouped together as either range anomalies or target range measurements. Their algorithm is based on an idealized single-pixel statistical model for the ladar range measurements together with the assumption that the background range profile is a plane parameterized by the elevation and azimuth range slopes and by the range intercept. The EM procedure is employed in Green and Shapiro's algorithm. The E step localizes and suppresses the range anomalies; the M step finds the ML estimate of the three parameters that characterize the planar background range profile. Simulation studies have confirmed that Green and Shapiro's technique is computationally simple, with good range-anomaly suppression capabilities.

Fig. 4 Down-looking geometry for collecting 3-D images of ground-based scenes.

The application of our EM-multiscale-MFT estimation framework to the ladar range imaging problem extends Green and Shapiro's work in several fundamental and practically significant ways. First, rather than modeling the background surface as planar, we allow smooth spatial variation of the background, using a multiresolution surface model analogous to the so-called thin-membrane surface model (see Sec. 7.1). Employing this multiresolution model allows us to use the very fast estimation procedure described in the previous section. In addition to estimating the background range profile, the approach we describe simultaneously performs two other significant functions, namely target segmentation and target range profile estimation. To accomplish this, we use a second multiscale model, in this case one corresponding to a so-called thin-plate prior, to model the target range profile within the image, together with several important sets of hidden variables. The first two of these sets, namely, the sets of pixels that correspond to anomalous background and target range measurements, are analogous to those used in Ref. 12. The other hidden variable, namely, the set of pixels segmenting the target and background regions and explicitly identifying the line of pixels corresponding to ground attachment (i.e., where the target meets the ground), allows us to perform target segmentation and profiling.

A uniformly spaced raster scan is used to collect a range image consisting of n pixels. We denote the 2-D lattice consisting of these n pixels as L. Let y = \{y_i, i \in L\} denote the measured ladar range values, whose background range profile is denoted as x_{bg} = \{x_{bg_i}, i \in L\} and whose target range profile is denoted as x_{tg} = \{x_{tg_i}, i \in L\}. Note that it is convenient (with no approximation) to think of both the background and the target as being defined over the entire image domain. The background obviously is, although it is occluded by the target over the unknown region in which the target resides. Thus, while the algorithm we describe produces estimates of the background and target fields over the entire image domain, the target field estimate is only meaningful over the estimated target region, while over that same region the background estimate can be thought of as an optimal interpolation of the occluded portion of the background. For notational simplicity, let y, x_{bg}, and x_{tg} denote lexicographically ordered vectors of the measurement, background, and target fields, respectively. We next present the several elements of our models for these quantities, including the specification of the hidden variables employed in our EM-based procedure.

3.1 Prior Models for x_bg and x_tg

The random fields x_{tg} and x_{bg} are modeled as independent, zero-mean Gaussian random fields to represent the smoothness constraints on the random fields. Specifically, x_{bg} is modeled as the finest-scale field in a multiscale thin-membrane model, and x_{tg} as the finest-scale field in a multiscale thin-plate model (see Sec. 7.1 and Ref. 10).

In principle, then, the prior model for x_{bg} has the form p(x_{bg}) = G(x_{bg}; 0, \Lambda_{x_{bg}}), where

G(x; m, \Lambda) = (2\pi)^{-n/2} |\Lambda|^{-1/2} \exp[-\tfrac{1}{2}(x - m)^T \Lambda^{-1} (x - m)],

and where \Lambda_{x_{bg}} is the n \times n covariance matrix for the background range profile x_{bg}. Likewise, the prior model for the target range profile x_{tg} is p(x_{tg}) = G(x_{tg}; 0, \Lambda_{x_{tg}}), where \Lambda_{x_{tg}} is the n \times n covariance matrix for the target range profile x_{tg}. While we never need to explicitly calculate \Lambda_{x_{bg}} and \Lambda_{x_{tg}} or their inverses, it is convenient for our discussion to introduce these explicit models. Also for ease of notation, the two range profiles are stacked together to form the range profile vector X, i.e., X = [x_{bg}^T\ x_{tg}^T]^T. We consider the two range profiles to be statistically independent of one another, so that the prior model for X becomes

p(X) = G(X; 0, \Lambda_X), \qquad \Lambda_X = \begin{bmatrix} \Lambda_{x_{bg}} & 0 \\ 0 & \Lambda_{x_{tg}} \end{bmatrix}.     (11)

3.2 Definition of the Hidden Variables

The hidden variables we introduce partition the observed image domain L into five regions. A convenient way in which to express this partitioning is through the introduction of five sets of indicator functions (taking only values 0 and 1) over L, one for each region type. The simplest of these are the two sets of variables h_{atg} = \{h_{atg_i}, i \in L\} and h_{abg} = \{h_{abg_i}, i \in L\}, indicating the presence of range anomalies over the target and background portions of the image, respectively. The reason for separating these anomalies into two fields is simple: we want to interpolate the target (background) range profile for anomalous pixels within the target (background) region.

The next two sets of hidden variables, h_{bg} = \{h_{bg_i}, i \in L\} and h_{tg} = \{h_{tg_i}, i \in L\}, designate those portions of the observed range measurements that are unambiguously measurements of the background or target, respectively. The final set of hidden variables indicates the line of pixels along which the target and background profiles intersect (i.e., the ground attachment line of the target), specified by h_{ga} = \{h_{ga_i}, i \in L\}. Together, the five sets of hidden variables indicate mutually exclusive sets of pixels partitioning L. That is, if we define

H_i = [h_{bg_i}\ \ h_{tg_i}\ \ h_{ga_i}\ \ h_{abg_i}\ \ h_{atg_i}]^T, \quad i \in L,     (12)

then each H_i takes on only the values e_k, k = 1, \ldots, 5, where e_k is a binary column vector with 1 in the kth component and 0 everywhere else. As an example, H_i = e_2 indicates pixel i as belonging to image class 2, i.e., the set of pixels corresponding unambiguously to target range measurements.
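A minimal helper illustrating the indicator-vector encoding of Eq. (12) (the class ordering follows that equation; the function name is ours):

    import numpy as np

    # Class ordering follows Eq. (12): background, target, ground attachment,
    # background anomaly, target anomaly.
    CLASSES = ("bg", "tg", "ga", "abg", "atg")

    def indicator(label):
        """Return H_i = e_k for the named pixel class, as in Eq. (12)."""
        e = np.zeros(len(CLASSES))
        e[CLASSES.index(label)] = 1.0
        return e

    print(indicator("tg"))   # e_2 -> [0. 1. 0. 0. 0.]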

    3.3 Measurement and Pseudomeasurement Models

The observed ladar image data are noise- and anomaly-corrupted measurements of either the target or background. The explicit dependence of these measurements on X and H is given by the following measurement model:

y_i = \left[ h_{bg_i} + \frac{h_{ga_i}}{2}\quad h_{tg_i} + \frac{h_{ga_i}}{2} \right] \begin{bmatrix} x_{bg_i} \\ x_{tg_i} \end{bmatrix} + \left[ h_{tg_i} + h_{bg_i} + h_{ga_i}\quad h_{abg_i} + h_{atg_i} \right] \begin{bmatrix} v_{R_i} \\ v_{\bar{R}_i} \end{bmatrix},     (13)


where we have adopted some of the notation from Refs. 12 and 13 to describe the noise and anomalous measurement distributions. In particular, v_R \sim N(0, \sigma_R^2 I) represents the noise in a nonanomalous range measurement, with \sigma_R corresponding to the range resolution of the ladar system, while v_{\bar{R}} \sim U(R_{min} + \Delta R/2, [(\Delta R)^2/12] I) corresponds to the uncertainty in an anomalous measurement, modeled as uniformly distributed over the full unambiguous range interval of the ladar, i.e., \Delta R = R_{max} - R_{min}. The notation U(\mu, \Lambda) represents a uniform random vector with mean \mu and variance \Lambda. For the results here, we use the same ladar system parameters and statistics as in Ref. 12, namely, \sigma_R = 2 range bins, R_{max} = 1524 range bins, and R_{min} = 0 range bins, so that \Delta R = 1524 range bins. Each range bin translates to a physical distance of 1.1 m, with lower bin numbers indicating closer range.

Intuitively, the first term on the right of Eq. (13) determines the type of each range measurement y_i, while the second term indicates the type of noise associated with each range measurement. It is instructive to examine several specific examples to understand how the hidden variables influence the model of the observed data. For example, suppose that pixel i corresponds to a background measurement, so that h_{bg_i} = 1 and all other hidden variables are zero. Then

y_i = [1\ \ 0] \begin{bmatrix} x_{bg_i} \\ x_{tg_i} \end{bmatrix} + [1\ \ 0] \begin{bmatrix} v_{R_i} \\ v_{\bar{R}_i} \end{bmatrix} = x_{bg_i} + v_{R_i}.

This equation says that y_i is equal to the background range profile x_{bg_i} plus some measurement noise, which we model as additive Gaussian noise. Suppose, as another example, pixel i corresponds to a range anomaly lying within the target region (h_{atg_i} = 1). Equation (13) then gives

y_i = [0\ \ 0] \begin{bmatrix} x_{bg_i} \\ x_{tg_i} \end{bmatrix} + [0\ \ 1] \begin{bmatrix} v_{R_i} \\ v_{\bar{R}_i} \end{bmatrix} = v_{\bar{R}_i},

which is consistent with our modeling of a range anomaly. As a final example, suppose pixel i corresponds to a ground

attachment measurement (h_{ga_i} = 1). This means that x_{bg_i} and x_{tg_i} are implicitly equal, and from Eq. (13) we have

y_i = \frac{x_{bg_i} + x_{tg_i}}{2} + v_{R_i},

which binds the ground attachment measurement to both x_{bg_i} and x_{tg_i}.

Note that when h_{ga_i} = 1, we would like to ensure that x_{bg_i} and x_{tg_i} are approximately equal, since the target attaches to the ground at these points. There are two equivalent ways in which to accomplish this. The first is as part of the prior model for x_{bg} and x_{tg}, providing a nonlinear coupling between these two fields. The second approach, which leads to a far simpler estimation algorithm, involves incorporating this constraint as an additional set of pseudomeasurements: for those pixels for which h_{ga_i} = 1, we observe a measurement value of 0 for the difference x_{bg_i} - x_{tg_i}, where the additive noise on this measurement has very small variance \epsilon^2. That is, at each pixel we have the additional measurement

0 = h_{ga_i} x_{bg_i} - h_{ga_i} x_{tg_i} + h_{ga_i} v_{\epsilon_i}, \quad i \in L,     (14)

where v_\epsilon \sim N(0, \epsilon^2 I). In this setup, the parameter \epsilon controls the tightness of the coupling between the two MRFs. Note that if h_{ga_i} = 0, this equation correctly indicates that we are not specifying any additional constraint on x_{bg_i} - x_{tg_i}. However, if h_{ga_i} = 1, this pseudomeasurement introduces a strong coupling between the estimates of x_{bg_i} and x_{tg_i}.

Putting this all together, at each pixel i we now have a 2-vector of measurements

Y_i = \begin{bmatrix} y_i \\ 0 \end{bmatrix} = C(H_i) X_i + F(H_i) V_i     (15)

with V_i^T = [v_{R_i}\ \ v_{\epsilon_i}\ \ v_{\bar{R}_i}], and

C(H_i) = \begin{bmatrix} h_{bg_i} + \dfrac{h_{ga_i}}{2} & h_{tg_i} + \dfrac{h_{ga_i}}{2} \\ h_{ga_i} & -h_{ga_i} \end{bmatrix}, \qquad F(H_i) = \begin{bmatrix} h_{bg_i} + h_{tg_i} + h_{ga_i} & 0 & h_{abg_i} + h_{atg_i} \\ 0 & h_{ga_i} & 0 \end{bmatrix}.

Equation (15), then, gives us the second needed part of our probabilistic model, namely,

p(Y \mid X, H) = \prod_{i \in L} p(Y_i \mid X_i, H_i) = \prod_{i \in L} H_i^T \begin{bmatrix} G(y_i; x_{bg_i}, \sigma_R^2) \\ G(y_i; x_{tg_i}, \sigma_R^2) \\ G\!\left(y_i; \dfrac{x_{bg_i} + x_{tg_i}}{2}, \sigma_R^2\right) G(x_{bg_i}; x_{tg_i}, \epsilon^2) \\ 1/\Delta R \\ 1/\Delta R \end{bmatrix}.     (16)

We ignore the digitization that is found in typical ladar systems, and treat Y as a continuous random variable. The role of H_i is simply to select one of the five densities indicated on the right-hand side of Eq. (16).
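The selection of one of the five densities by H_i can be sketched as follows; the helper below evaluates the five class-conditional terms of Eq. (16) for a single pixel, using the parameter values quoted from Ref. 12 (\sigma_R = 2 bins, \Delta R = 1524 bins) and an assumed \epsilon = 0.1. Function names are ours.

    import numpy as np

    def gauss(y, mean, var):
        """Scalar Gaussian density G(y; mean, var)."""
        return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

    def class_likelihoods(y_i, x_bg_i, x_tg_i, sigma_R=2.0, delta_R=1524.0, eps=0.1):
        """Five class-conditional densities of Eq. (16), ordered as in Eq. (12):
        bg, tg, ground attachment, bg anomaly, tg anomaly."""
        return np.array([
            gauss(y_i, x_bg_i, sigma_R ** 2),                                   # background
            gauss(y_i, x_tg_i, sigma_R ** 2),                                   # target
            gauss(y_i, 0.5 * (x_bg_i + x_tg_i), sigma_R ** 2)
                * gauss(x_bg_i, x_tg_i, eps ** 2),                              # ground attachment
            1.0 / delta_R,                                                      # anomaly over background
            1.0 / delta_R,                                                      # anomaly over target
        ])

    # p(Y_i | X_i, H_i) is then H_i^T times this vector, e.g. for H_i = e_2:
    print(class_likelihoods(100.0, 120.0, 100.5)[1])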

    3.4 Probabilistic Model for the Hidden Variables

The remaining part of the probabilistic model is that for the hidden variables, H. In particular, as we see in the next section, the quantity that is needed for the EM procedure is


p(H \mid X). In our model we make the simplifying assumption that H and X are statistically independent, so

p(H \mid X) = p(H).     (17)

Note that for the variables h_{atg} and h_{abg} modeling the presence of ladar anomalies, this independence assumption is completely justified. However, for the others it represents a clear simplification. For example, the condition h_{ga_i} = 1 (ground attachment) implies x_{bg_i} \approx x_{tg_i}, indicating a very strong dependence between H and X. However, as we have already indicated, we can capture this dependence through the pseudomeasurement (14) rather than through the prior. Note that there also is a dependence between h_{bg_i}, h_{tg_i} (indicating observation of background or target in pixel i) and x_{bg_i}, x_{tg_i}: in particular, if the target is present in pixel i (h_{tg_i} = 1), then obviously the target occludes the background, implying that x_{tg_i} \le x_{bg_i}. While we could introduce another pseudomeasurement to capture this relationship as well, for simplicity we have not done so, and thus our model indeed neglects this dependence. In principle this implies that the resulting estimates we produce could have target and background estimates that are inconsistent (h_{tg_i} = 1 but x_{tg_i} > x_{bg_i}), but, as we illustrate using real ladar data, employing this simplification does not in fact introduce such inconsistencies.

We use a Gibbs prior to capture spatial dependences in the hidden variables:

p(H) = \frac{1}{Z} \exp[-\beta U(H)],

where Z is the normalization constant. Defining the energy function U(H) requires specifying the neighborhood structure on the 2-D lattice L. For our purposes it is sufficient to use first-order neighborhoods,14 implying that cliques consist only of individual pixels and pairs of pixels that are either horizontal or vertical nearest neighbors. This implies that U(H) can be written as

U(H) = \sum_{i \in L} \left[ V_{clq}(H_i) + \frac{1}{2} \sum_{l \in N_i} H_i^T W_{il} H_l \right],     (18)

where W_{il} is a 5 \times 5 matrix whose (s,t)th component is V_{clq}(H_i = e_s, H_l = e_t). The elements within each matrix W_{il} are chosen to encourage (discourage) certain types of interactions among the hidden variables by decreasing (increasing) the energy function U(H). The specific choices for these matrices that we have used are as follows:

W_{il} = \begin{bmatrix} 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0.5 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \end{bmatrix},     (19a)

when pixel l is above pixel i,

W_{il} = \begin{bmatrix} 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ 0 & 0.5 & 0.5 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \end{bmatrix},     (19b)

when pixel l is below pixel i, and

W_{il} = \begin{bmatrix} 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \\ 0.5 & 0.5 & 0 & 0 & 0 \end{bmatrix},     (19c)

when pixel l is to the left or right of pixel i. In each of these matrices, the entries in the first row are chosen to encourage continuity of the background region (i.e., if a majority of the neighbors of pixel i are labeled as background, then pixel i is more likely to be background as well). The entries in the second row are chosen to encourage continuity of the target region and, in the case of Eq. (19a), also to discourage target pixels from appearing below the ground attachment pixels. The entries in the third row are chosen to encourage the ground attachment pixels to form a thin line below the target pixels. The entries in the fourth and fifth rows are chosen to help differentiate between an anomalous pixel located in the background region and an anomalous pixel located in the target region.

In order to incorporate p(H_i), the single-pixel stationary prior probability density functions (pdfs) for H_i, into our Gibbs modeling of p(H), we assign the singleton clique potential V_{clq}(\cdot) to be such that

p(H_i) = Z_i^{-1} \exp[-\beta V_{clq}(H_i)],     (20)

Z_i = \sum_{H_i} \exp[-\beta V_{clq}(H_i)].

Based on the work in Ref. 12, the probability of the presence of range measurement anomalies is 0.05, and we distribute this prior probability equally between p(H_i = e_4) and p(H_i = e_5), the probabilities that a pixel measurement is an anomaly situated inside the background region and that it is an anomaly situated inside the target region, respectively. The prior probability that a particular pixel in the image is a ground attachment pixel is chosen to be 0.05. The remaining prior probability for H_i (0.9) is distributed equally between p(H_i = e_1) and p(H_i = e_2), the probabilities that a pixel corresponds to the background region and to the target region, respectively.* To summarize:

*Note that the values we have chosen for p(e_1), p(e_2), and p(e_3) reflect relative values of pixel categories for ladar images in which we expect targets to fill roughly half of the image area. If targets take up significantly less image area, one can capture this by biasing the prior in favor of e_1. However, as the examples in Sec. 5 indicate, the results are not particularly sensitive to these values.


p(H_i) = [0.45\ \ 0.45\ \ 0.05\ \ 0.025\ \ 0.025]\, H_i.

Our full prior model for H then is

p(H) = \frac{1}{\tilde{Z}} \prod_{i \in L} p(H_i)\, Z_i \exp\!\left[-\frac{\beta}{2} \sum_{l \in N_i} H_i^T W_{il} H_l\right].     (21)
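For illustration, the pairwise part of the Gibbs energy in Eq. (18) can be evaluated for a label image as in the sketch below; the W arrays are placeholders for the direction-dependent matrices of Eqs. (19a)-(19c), and the function name is ours.

    import numpy as np

    def pairwise_energy(H, W_above, W_below, W_side):
        """Pairwise part of Eq. (18): (1/2) * sum_i sum_{l in N_i} H_i^T W_il H_l.
        H has shape (rows, cols, 5) with H[i] a one-hot class vector as in Eq. (12).
        W_above / W_below / W_side are the 5x5 matrices of Eqs. (19a)-(19c)
        (placeholder arguments here; substitute the paper's values)."""
        e = 0.0
        rows, cols, _ = H.shape
        for r in range(rows):
            for c in range(cols):
                hi = H[r, c]
                if r > 0:            e += hi @ W_above @ H[r - 1, c]   # neighbor above
                if r < rows - 1:     e += hi @ W_below @ H[r + 1, c]   # neighbor below
                if c > 0:            e += hi @ W_side  @ H[r, c - 1]   # neighbor left
                if c < cols - 1:     e += hi @ W_side  @ H[r, c + 1]   # neighbor right
        return 0.5 * e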

4 The Profile Estimation and Segmentation Algorithm

The problem to be solved is the estimation of X given Y based on the prior model p(X) given by Eq. (11), the measurement model p(Y \mid X, H) given by Eq. (16), and the prior model for the hidden variables p(H \mid X) = p(H) given by Eq. (21). In this section we describe the two steps in the EM procedure required to compute a local maximum of \log p(X \mid Y), which we take as our estimate. This procedure also produces an estimate of H that is of direct use in segmenting out the target, rejecting and compensating for anomalies, etc.

4.1 The E Step

The quantity to be calculated in the E step is

Q(X \mid X^{[k]}) = \langle \log p(Y, H \mid X) \mid Y, X^{[k]} \rangle,

which is a conditional expectation over H given Y and the current estimate X^{[k]} of the range profile. Using Bayes's rule, Q can be rewritten as

Q(X \mid X^{[k]}) = \langle \log p(Y \mid X, H) \mid Y, X^{[k]} \rangle + \langle \log p(H) \mid Y, X^{[k]} \rangle.     (22)

Since the M step is a maximization of Q(X \mid X^{[k]}) over X, we can discard the second term in Eq. (22), since it does not depend on X. Using the same notation for the result, we have that

Q(X \mid X^{[k]}) = \sum_{i \in L} \langle H_i \mid Y, X^{[k]} \rangle^T \begin{bmatrix} \log G(y_i; x_{bg_i}, \sigma_R^2) \\ \log G(y_i; x_{tg_i}, \sigma_R^2) \\ \log G\!\left(y_i; \dfrac{x_{bg_i} + x_{tg_i}}{2}, \sigma_R^2\right) + \log G(x_{bg_i}; x_{tg_i}, \epsilon^2) \\ -\log \Delta R \\ -\log \Delta R \end{bmatrix}.     (23)

Here we have also used the fact that each H_i is an indicator vector in order to take the log of p(Y \mid X, H) in Eq. (16). Thus the core of the E step is the computation of \langle H_i \mid Y, X^{[k]} \rangle. Using the formula for expectations, Bayes's rule, and Eqs. (16), (17), and (21), we find that

\langle H_i \mid Y, X^{[k]} \rangle = \frac{1}{\hat{Z}} \sum_H H_i \prod_{i \in L} p(Y_i \mid X_i^{[k]}, H_i)\, p(H_i) \exp\!\left[-\frac{\beta}{2} \sum_{l \in N_i} H_i^T W_{il} H_l\right],     (24)

where

\hat{Z} = \sum_H \prod_{i \in L} p(Y_i \mid X_i^{[k]}, H_i)\, p(H_i) \exp\!\left[-\frac{\beta}{2} \sum_{l \in N_i} H_i^T W_{il} H_l\right].

From this, it is obvious that a large number of configurations of H need to be evaluated in order to compute Eq. (24). This makes its precise calculation both complex and computationally impractical. By applying MFT, this difficulty is overcome, and an approximation to \langle H_i \mid Y, X^{[k]} \rangle is obtained by evaluating the following:

\langle H_i \mid Y, X^{[k]} \rangle \approx \frac{1}{Z_i^{mf}} \sum_{H_i} H_i\, p(Y_i \mid X_i^{[k]}, H_i)\, p(H_i) \exp\!\left[-\beta \sum_{l \in N_i} H_i^T W_{il} \langle H_l \mid Y, X^{[k]} \rangle\right],     (25)

where

Z_i^{mf} = \sum_{H_i} p(Y_i \mid X_i^{[k]}, H_i)\, p(H_i) \exp\!\left[-\beta \sum_{l \in N_i} H_i^T W_{il} \langle H_l \mid Y, X^{[k]} \rangle\right].

As Eq. (25) indicates, to find the mean field \langle H \mid Y, X^{[k]} \rangle at i, one needs the mean field \langle H \mid Y, X^{[k]} \rangle at the neighbors of i. Thus, Eq. (25) for all i \in L represents a set of simultaneous algebraic equations, which can be solved by embedding it within the E step of the overall EM procedure. In principle we can perform multiple Jacobi iterations to solve these equations within each E step. However, obtaining an accurate estimate of \langle H \mid Y, X^{[k]} \rangle in each E step may not be necessary, since \langle H \mid Y, X^{[k]} \rangle is merely used to guide the estimate of X in each M step. Empirically, we have found that it is sufficient and, moreover, computationally efficient to approximate Eq. (25) by performing only one Jacobi iteration in each E step. The \langle H \mid Y, X^{[k-1]} \rangle calculated from the previous EM cycle serves as the initial guess for that single Jacobi iteration. This approximation scheme has the natural notion of refining the estimate of \langle H \mid Y, X^{[k]} \rangle as the number of EM cycles increases, because successive estimates of \langle H \mid Y, X^{[k]} \rangle used for the Jacobi iterate in the E step are more accurate. Also, since the calculation of \langle H \mid Y, X^{[k]} \rangle can be decomposed into local computations, its solution can be implemented in a parallel fashion to yield a fast and simple approximation.

    4.2 The M Step

The M step calculates the X^{[k+1]} in Eq. (3). It is straightforward to check that the quantity to be maximized in Eq. (3) is quadratic in X and that the maximum is given by

[\Lambda_X^{-1} + M]\, X^{[k+1]} = P y,     (26)


where \Lambda_X is the block-diagonal prior covariance matrix in Eq. (11), and M and P are given by

M = \begin{bmatrix} D_1 & D_3 \\ D_3 & D_2 \end{bmatrix}, \qquad P = \begin{bmatrix} D_4 \\ D_5 \end{bmatrix},     (27)

where D_1, \ldots, D_5 are diagonal matrices given by

[D_1]_{ii} = \frac{\langle h_{bg_i} \mid Y, X^{[k]} \rangle + \langle h_{ga_i} \mid Y, X^{[k]} \rangle / 4}{\sigma_R^2} + \frac{\langle h_{ga_i} \mid Y, X^{[k]} \rangle}{\epsilon^2}, \quad i \in L,     (28a)

[D_2]_{ii} = \frac{\langle h_{tg_i} \mid Y, X^{[k]} \rangle + \langle h_{ga_i} \mid Y, X^{[k]} \rangle / 4}{\sigma_R^2} + \frac{\langle h_{ga_i} \mid Y, X^{[k]} \rangle}{\epsilon^2}, \quad i \in L,     (28b)

[D_3]_{ii} = \langle h_{ga_i} \mid Y, X^{[k]} \rangle \left[ \frac{1}{4 \sigma_R^2} - \frac{1}{\epsilon^2} \right], \quad i \in L,     (28c)

[D_4]_{ii} = \frac{\langle h_{bg_i} \mid Y, X^{[k]} \rangle + \langle h_{ga_i} \mid Y, X^{[k]} \rangle / 2}{\sigma_R^2}, \quad i \in L,     (28d)

[D_5]_{ii} = \frac{\langle h_{tg_i} \mid Y, X^{[k]} \rangle + \langle h_{ga_i} \mid Y, X^{[k]} \rangle / 2}{\sigma_R^2}, \quad i \in L.     (28e)

The quantity X^{[k+1]} can be viewed as the MAP estimate of X based on the prior model specified by p(X) and a measurement model p(data \mid X) specified implicitly by Q(X \mid X^{[k]}). More precisely, \log p(data \mid X) = Q(X \mid X^{[k]}) + (terms not depending on X). The prior on X is reflected in the block-diagonal matrix \Lambda_X^{-1} in Eq. (26), while M and P reflect the measurement model. Specifically, by collecting the measurements y in Eq. (13) and the pseudomeasurements (all with value 0) in Eq. (14) into vectors, we can write

\begin{bmatrix} y \\ 0 \end{bmatrix} = C(H) X + F(H) V,     (29)

where V is a vector containing all of the noise sources v_{R_i}, v_{\bar{R}_i}, v_{\epsilon_i} at all pixels, with covariance matrix R. What the E step in essence does is to replace C(H) and F(H) by expected values (or approximations of these using MFT), yielding an effective measurement model of the form

\begin{bmatrix} y \\ 0 \end{bmatrix} = C^{[k]} X + F^{[k]} V.     (30)

Using this notation,

M = C^{[k]T} [F^{[k]} R F^{[k]T}]^{-1} C^{[k]},     (31a)

P y = C^{[k]T} [F^{[k]} R F^{[k]T}]^{-1} \begin{bmatrix} y \\ 0 \end{bmatrix},     (31b)

so that Eq. (26) is precisely the set of linear equations for the optimal estimate of X based on the measurement model (30). Moreover, the corresponding estimation error covariance is given by

\Lambda_e = [\Lambda_X^{-1} + M]^{-1}.     (32)

While the prior models for target and background are independent (as reflected in the block-diagonal structure of \Lambda_X), their estimates are coupled. For example, the E step replaces the various h's in Eq. (13) by their expectations. Thus while each pixel measurement is either target, background, or anomaly, the model (30) will in general characterize each measurement as a weighted average of these possibilities, using the expected values of the various h's as calculated in the E step. However, this coupling occurs in a pixel-by-pixel fashion, as reflected in the fact that D_1, \ldots, D_5 are diagonal. Indeed, by reordering the elements of X and y, we can view our problem as that of estimating a vector random field, consisting of the 2-D vector of target and background at each pixel, based on pixelwise measurements that are independent from pixel to pixel.

Finally, it is important to realize that despite all of this structure, Eq. (26) represents a huge set of equations to be solved (2n^2 for an n \times n image), and the error covariance (32) is of the same very large size. However, by using the multiresolution thin-membrane and thin-plate priors mentioned previously and summarized in Sec. 7.1, the solution of Eq. (26) and the computation of the diagonal elements of \Lambda_e (i.e., the pixelwise error covariances) can all be accomplished with total per-pixel complexity that is independent of image domain size. This makes the EM procedure not only feasible but, in fact, quite efficient.
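For small images, the M step of Eq. (26) can be checked with a direct sparse solve, as in the sketch below, which assembles D_1-D_5 from Eqs. (28a)-(28e). The paper's algorithm instead obtains the same solution (and the error variances) with the fast multiscale estimator; the prior used in the toy usage lines is an arbitrary assumption.

    import numpy as np
    from scipy.sparse import diags, bmat, identity
    from scipy.sparse.linalg import spsolve

    def m_step(H_mean, y, Lambda_X_inv, sigma_R=2.0, eps=0.1):
        """Solve Eq. (26), [Lambda_X^{-1} + M] X = P y, with D_1..D_5 from Eqs. (28a)-(28e).
        H_mean: (n, 5) E-step expectations <H_i|Y,X[k]> in the order bg, tg, ga, abg, atg.
        Returns the stacked estimate X = [x_bg; x_tg]. Direct sparse solve for illustration;
        the paper uses the multiscale estimator for this step."""
        h_bg, h_tg, h_ga = H_mean[:, 0], H_mean[:, 1], H_mean[:, 2]
        d1 = (h_bg + h_ga / 4) / sigma_R**2 + h_ga / eps**2        # Eq. (28a)
        d2 = (h_tg + h_ga / 4) / sigma_R**2 + h_ga / eps**2        # Eq. (28b)
        d3 = h_ga * (1 / (4 * sigma_R**2) - 1 / eps**2)            # Eq. (28c)
        d4 = (h_bg + h_ga / 2) / sigma_R**2                        # Eq. (28d)
        d5 = (h_tg + h_ga / 2) / sigma_R**2                        # Eq. (28e)
        M = bmat([[diags(d1), diags(d3)], [diags(d3), diags(d2)]])
        Py = np.concatenate([d4 * y, d5 * y])                      # P y, with P from Eq. (27)
        return spsolve((Lambda_X_inv + M).tocsc(), Py)

    # Toy usage: n pixels, with an assumed diagonal prior Lambda_X = 100 * I.
    n = 100
    X_est = m_step(np.tile([0.45, 0.45, 0.05, 0.025, 0.025], (n, 1)),
                   np.random.default_rng(0).normal(50, 2, n),
                   identity(2 * n) / 100.0)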

    5 Experimental Results

In this section, we present two examples to illustrate the performance of the algorithm we have just described. The ladar range images we use are provided by MIT Lincoln Laboratory. These range images are formed by an airborne coherent CO2-ladar system operating in pulsed-imager mode. The images are 43 x 126-pixel gray-scale range images.

In our first example, the range image is of a C50 tank positioned against a sloping featureless terrain, as shown in Fig. 5(a). This range image was taken under conditions of high carrier-to-noise ratio (CNR), so that it has relatively little noise, with almost no anomalies present. We use this image as the ground truth to test the performance of our algorithm. To simulate realistic range images that resemble the ones obtained under battlefield conditions (i.e., low-CNR conditions), a local-range-accuracy Gaussian noise with a standard deviation of \sigma_R = 2 bins is artificially added to each pixel in the image. In addition, a Bernoulli process is

    used to randomly select 5% of the pixels in the image as

anomalous pixels. For each such anomalous pixel, a random range value, chosen from a uniform distribution over


the range uncertainty interval \Delta R = 1524 bins, is assigned. The resulting observed ladar range image, shown in Fig. 5(b), very much resembles the ones obtained on the battlefield. The parameters used to obtain this simulated ladar image came from Ref. 12.

One of the goals of our algorithm is to estimate the background and the target range profiles, i.e., the range profiles of the sloping featureless terrain and the tank silhouette, based on the observed image. Figure 6 shows our estimate of the range profiles and their associated error statistics. The error statistics provide us with a useful measure of the uncertainty of our estimate of the range profiles. From top to bottom, this figure shows the estimate of the background range profile, its estimation error variances, the estimate of the target range profile, and its estimation error variances. The estimate of the background range profile in Fig. 6(a) demonstrates good anomaly suppression capabilities. The estimated profile is smooth and free of the obvious blocky artifacts sometimes associated with the use of a multiscale estimator.5,10 This is directly attributable to the use of the overlapping tree surface models developed in Ref. 10 and summarized in Sec. 7.1. More interestingly, the estimate of the background is obtained over the entire field of view, including parts of the background range profile that are obscured from the field of view by the tank and the range anomalies. Even without background range measurements at these obscured locations, the multiscale estimator still does capture the coarse features of the smooth background profile there. The lack of background range measurements at the locations of the tank is consistent with the information conveyed in Fig. 6(b), which shows high estimation uncertainties around the region obscured by the tank. Figure 6(c) shows the estimate of the tank's range profile. Note that the estimate of the target range profile is relatively flat, which is consistent with our thin-plate modeling. Figure 6(d) shows the estimation error variances of the target range profile at locations where the tank appears.

The other goal of our algorithm is to provide a segmentation map of the ladar range image. By thresholding H, we obtain the segmentation map shown in Fig. 7. Figure 7(a) shows the locations of the target and the ground attachment pixels, while Fig. 7(b) shows the locations of the target and background range anomaly pixels. It is interesting to note that, as designed, the ground attachment class constitutes a thin line corresponding to the line along which the target attaches to the background. Taken together, the target pixels and the target range anomaly pixels form the location map of the target of interest, which in this case is the tank. More importantly, the location map of the tank with its accompanying range profile can yield 3-D information about the target and thus plays an important role in ATR.

Using the estimate of the two range profiles and the information provided by the segmentation map, we can reconstruct an anomaly-free and denoised image from the

Note that in contrast to the estimates in Fig. 6(a), the error variances in Fig. 6(b) do display some of the blocky artifacts associated with multiscale estimates, although even these are greatly reduced by using overlapped models. In fact, as discussed in Ref. 10, they can be reduced further by increasing the so-called degree of overlap, which however also increases the computational complexity.

Fig. 5 Ladar range image of tank: (a) ground truth image, (b) observed image.

Fig. 6 Multiscale tank ladar range profile estimates: (a) estimate of background range profile, (b) estimation error variances of background range profile, (c) estimate of target range profile, (d) estimation error variances of target range profile.


raw noisy data in Fig. 5(b). This reconstructed ladar range image of the tank is shown in Fig. 8. The reconstructed image is remarkably similar to the ground-truth ladar range image in Fig. 5(a).

The results shown in Figs. 6-8 were obtained with \beta = 1 and \epsilon = 0.1, after running our algorithm for 14 EM iterations. We chose 14 iterations as the stopping point of our iterative scheme based on the results shown in Fig. 9. The graphs in the figure show the one-norm distance measure of the estimates of x_{bg} and x_{tg} between two successive iterations. After 14 iterations, the estimates appear to have stabilized, and it is at this point that we decided that the algorithm had converged. Note, however, that the knees in both of these curves occur after only two or three iterations, suggesting that we can obtain nearly as good reconstructions much more quickly if total computational load is a concern.
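The stopping rule can be expressed as a one-line check on the one-norm change between successive estimates, for example (names are ours):

    import numpy as np

    def one_norm_changes(estimates):
        """Given a list of successive estimates (e.g. of x_bg or x_tg, flattened),
        return the one-norm distance between consecutive iterations, as plotted in Fig. 9."""
        return [float(np.sum(np.abs(b - a))) for a, b in zip(estimates[:-1], estimates[1:])]

    # Stop when the change stabilizes, e.g. falls below a chosen tolerance:
    # changes = one_norm_changes(xbg_history); converged = changes[-1] < tol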

As another demonstration of the performance of our algorithm, we use another ladar range image, where the target of interest is a truck. Figures 10-14 convey the same set of information as their counterparts in Figs. 5-9.

    6 Conclusion

The ladar target segmentation and range profiling algorithm that has been described in this paper has two components: estimation of the segmentation map and range profile estimation. Our contribution has been to combine them in an iterative fashion that yields a powerful new estimation procedure for ladar range image enhancement. Building on the previous work done by Green and Shapiro on planar range profiling, we derived a general algorithm for background range profile estimation that is able to deal with more complicated backgrounds. This extension uses a multiscale thin-membrane model for the background range profile; it is much less restrictive than the planar surface model used by Green and Shapiro and is more applicable for estimating the range profiles of natural scenes. As an added feature, we also developed our methodology using a thin-plate prior for target segmentation and estimation, and hence provide fine 3-D range information about the target required, for example, in shape-based ATR. The multiscale methodology provides a computationally efficient and statistically opti-

Fig. 7 Segmentation results of tank ladar range image: (a) target (tg) and ground attachment (ga) locations, (b) range anomaly locations inside (atg) and outside (abg) the target region.

Fig. 8 Anomaly-free and denoised reconstruction of the tank ladar range image.

Fig. 9 Convergence results of tank ladar range profile estimates: (a) change in the estimate of background range profile with each iteration, (b) change in the estimate of target range profile with each iteration.

Fig. 10 Ladar range image of a truck: (a) ground truth image, (b) observed image.


Y(s) potentially defined at every node. In modeling target and background fields as surfaces, the representation of either surface resides at the finest scale of the tree, as do the real ladar range measurements and ground attachment pseudomeasurements described in Sec. 3. Thus the role of the hidden states and some additional pseudomeasurements at coarser scales in the tree is to shape the fine-scale surface statistics to approximate those corresponding to thin-plate and thin-membrane statistical models.

7.1 Multiscale Thin-Plate and Thin-Membrane Models

Thin-plate and thin-membrane models, which are frequently used in computer vision to assert smoothness constraints, correspond to particular variational problems. As discussed in Refs. 5, 10, and 15, these models can also be interpreted statistically, and they correspond to prior models for surfaces with self-similar statistics and, in particular, with power spectral densities of the form 1/f^{\mu}, with \mu depending on whether a thin-plate (\mu = 4) or thin-membrane (\mu = 2) model is used. As discussed first in Ref. 5, such self-similar statistics can also be captured in multiscale models, albeit with slight differences in the prior, but with all of the computational advantages of multiscale models.

The basic thin-plate or thin-membrane multiscale model introduced in Ref. 10 is given by the following:

\begin{bmatrix} z \\ p \\ q \\ z_p \end{bmatrix}(s) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} z \\ p \\ q \\ z_p \end{bmatrix}(s\bar{\gamma}) + \begin{bmatrix} B_s 2^{-m(s)/2} & 0 & 0 \\ 0 & B_g 2^{-m(s)/2} & 0 \\ 0 & 0 & B_g 2^{-m(s)/2} \\ 0 & 0 & 0 \end{bmatrix} W(s),     (33)

where the state at each node s consists of the height z of the surface at the scale and resolution corresponding to s, the x and y components (p and q) of the gradient of the surface at the scale and location of node s, and the surface height at the parent of node s at the next coarser scale [i.e., z_p(s) = z(s\bar{\gamma})].

The thin-membrane model corresponds to Eq. (33) with B_s \ll B_g to diminish the effects of the states p and q and decrease the contribution of the thin-plate penalty, to capture the fact that this 1/f^2-like model for z corresponds to what is in essence a self-similar random walk in scale. The multiscale counterpart of the thin-plate prior corresponds to B_g \ll B_s in Eq. (33). However, as is well known, the components of the surface gradient must satisfy the so-called integrability constraints, whose strict enforcement increases the complexity of surface estimation considerably. An alternative is to relax this hard constraint and replace it by a pseudomeasurement at every node on the tree. Referring to Fig. 15, the pseudomeasurements at the

four nodes s\alpha_i for 1 \le i \le 4 with the same parent s are given by

0 = [2^{m(s\alpha_i) - M}\ \ -a_i\ \ -b_i\ \ -2^{m(s\alpha_i) - M}] \begin{bmatrix} z \\ p \\ q \\ z_p \end{bmatrix}(s\alpha_i) + v(s\alpha_i),     (34)

where, for i = 1, 2, 3, 4, a_i = \pm 1 and b_i = \pm 1, with the signs determined by the diagonal direction associated with offspring i (see Fig. 15). Here 2^{m(s\alpha_i) - M}[z(s\alpha_i) - z_p(s\alpha_i)] represents a first-difference approximation to the directional derivative of the surface in a direction (NW, SW, NE, SE) that depends on the particular value of i, while a_i p + b_i q represents the approximation to the same directional derivative implied by the values of p and q at the resolution and location corresponding to node s\alpha_i. Equation (34) then simply states that the difference v(s\alpha_i) between these two approximations to the directional derivative is a zero-mean white noise with variance 0.1.

The general model (33) allows both B_s and B_g to be nonzero, allowing us freedom in adjusting the relative importance of the thin-membrane and thin-plate smoothness penalties. In particular, to model the background field in our ladar application, we used values B_s = 30 and B_g = 300, while for the target field we set B_s = 4 and B_g = 0.1.
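A minimal sketch of one coarse-to-fine step of Eq. (33), together with the integrability residual of Eq. (34) for one child node, is given below; the (a, b) signs shown are a single illustrative diagonal, and the helper names are ours.

    import numpy as np

    def child_state(parent_state, B_s, B_g, m_child, rng):
        """One coarse-to-fine step of Eq. (33) for a single child node.
        State ordering: [z, p, q, z_p]. Returns the child state."""
        A = np.array([[1., 0., 0., 0.],
                      [0., 1., 0., 0.],
                      [0., 0., 1., 0.],
                      [1., 0., 0., 0.]])            # copies parent z into z_p
        gain = 2.0 ** (-m_child / 2.0)
        B = np.array([[B_s * gain, 0., 0.],
                      [0., B_g * gain, 0.],
                      [0., 0., B_g * gain],
                      [0., 0., 0.]])
        return A @ parent_state + B @ rng.normal(size=3)

    def integrability_residual(state, m_child, M, a=1.0, b=1.0):
        """Residual of the pseudomeasurement, Eq. (34): the mismatch between the
        first-difference derivative 2^(m-M)*(z - z_p) and a*p + b*q.
        (a, b) = (1, 1) is just one illustrative diagonal direction."""
        z, p, q, z_p = state
        return 2.0 ** (m_child - M) * (z - z_p) - (a * p + b * q)

    rng = np.random.default_rng(0)
    root = np.array([0., 0., 0., 0.])
    child = child_state(root, B_s=4.0, B_g=0.1, m_child=1, rng=rng)   # thin-plate-like gains
    print(integrability_residual(child, m_child=1, M=5))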

    7.2 Overlapping Multiscale Trees

The application of this multiscale model based on the conventional multiscale quadtree and the associated estimation algorithm1,2 results in a visually distracting blockiness in the estimates. The characteristic blocky artifact is one of the limitations often associated with the multiscale approach. In random fields with significant levels of regularity or smoothness, high correlations exist between spatially close neighbors. When using a standard multiscale quadtree representation such as that suggested by Fig. 3, there exist nodes on the tree (e.g., the two shaded squares in Fig. 3) that are spatially quite close but are much more distant as measured by the distance along the tree from one of these nodes to the other. As a result, prohibitively high-dimensional states are required to capture the high correlation between such nodes, implying that lower-dimensional multiresolution models, and the estimates such models generate, may not adequately capture the correlation and smoothness of the field across such boundaries.

Fig. 15 Numbering scheme of nodes in the thin-plate and thin-membrane models.

In Ref. 11 a method to overcome this problem was developed using the concept of overlapping trees. Instead of adhering to the convention of the multiscale quadtree framework, where distinct nodes at a given scale of the tree are assigned disjoint portions of the image domain, the overlapping multiscale quadtree allows distinct nodes at a given scale of the tree to correspond to overlapping portions of the image domain. As a result, any pixel that would fall near a major tree boundary in a standard quadtree representation actually has several replicas in an overlapped model, essentially placing versions of that pixel on both sides of the tree boundary in order to capture spatial correlations more easily.

There are several implications of this idea. The first is that at the finest scale of the tree we do indeed have several tree nodes corresponding to a single image pixel. As discussed in Ref. 11, there is then a need to define two operators. The first, namely estimate projection, maps the estimates of the states at this set of tree nodes into the estimate of the single pixel to which they all correspond, by taking a weighted average. The second, namely measurement projection, maps each measurement of an image pixel into a set of independent measurements of each of the tree nodes to which it corresponds. This is accomplished simply by replicating the observed measurement value at each of the tree nodes but modeling each of these tree measurements as proportionately noisier, so that the total information content of this set of tree measurements is the same as that of the actual pixel measurement. For example, if a pixel $x$ corresponds to two tree nodes $x(s_1)$ and $x(s_2)$, then the measurement $y = x + v$, where $v$ has variance $\sigma^2$, is mapped into two measurements $y_1 = x(s_1) + v_1$ and $y_2 = x(s_2) + v_2$, where $v_1$ and $v_2$ are modeled as independent with variance $2\sigma^2$ and where the measurement values used are $y_1 = y_2 = y$.
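As an illustration of these two operators, the following sketch (illustrative Python; the function names and the inverse-variance choice of averaging weights are ours, not necessarily the exact weighting of Ref. 11) replicates a pixel measurement onto its tree nodes and later fuses the per-node estimates back into a single pixel estimate:

```python
import numpy as np

def project_measurement(y, sigma2, n_replicas):
    """Measurement projection: replicate a pixel measurement onto n_replicas
    tree nodes.  Each replica keeps the same measured value but is modeled as
    n_replicas times noisier, so the total information content (sum of inverse
    variances) equals that of the original measurement."""
    values = np.full(n_replicas, y)
    variances = np.full(n_replicas, n_replicas * sigma2)
    return values, variances

def project_estimates(estimates, variances):
    """Estimate projection: fuse the per-node estimates of one pixel by a
    weighted average (here inverse-variance weights, one simple choice)."""
    weights = 1.0 / np.asarray(variances)
    weights /= weights.sum()
    return float(np.dot(weights, estimates))

# A measurement y = 3.2 with noise variance 0.5, mapped onto two tree nodes.
y_vals, y_vars = project_measurement(3.2, 0.5, 2)   # values [3.2, 3.2], variances [1.0, 1.0]
# Later, fuse the two node estimates back into a single pixel estimate.
x_hat = project_estimates([3.0, 3.4], [1.0, 1.0])   # -> 3.2
```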

The second implication of overlapping is that the sizes of the regions corresponding to tree nodes decrease by less than a factor of 2 as we go from scale to scale. One consequence of this is that overlapping trees require more scales in order to reach the same finest partitioning of the image. This obviously increases the computational burden, although that is offset somewhat by the fact that lower-dimensional states can be used to capture the desired correlation structure. A second consequence is that the geometric scaling of the process noise $W(s)$ in Eq. (33) must change in order to reflect the correct self-similar scaling laws implied by the thin-plate and thin-membrane models. In particular, if the linear dimension of the region represented by a node on an overlapped tree decreases by a factor of $d$ rather than a factor of 2, then the noise gain $2^{-m(s)/2}$ on the right-hand side of Eq. (33), and in Eq. (34) as well, should be replaced by $2(1-1/d)\,d^{-m(s)/2}$. In the results presented here, we used models with $d = 2^{7/8}$.
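As a small numerical check (illustrative Python; we assume $m(s)$ is simply the integer scale index of node $s$), the modified gain reduces to the standard quadtree gain when $d = 2$:

```python
import numpy as np

def noise_gain(m, d=2.0):
    """Scale-dependent process-noise gain for an overlapped tree whose regions
    shrink by a linear factor d per scale; d = 2 recovers the standard quadtree
    gain 2**(-m/2).  Here m is the scale index of node s."""
    return 2.0 * (1.0 - 1.0 / d) * d ** (-m / 2.0)

# Sanity check: with d = 2 the gain reduces to the usual 2**(-m/2).
assert np.isclose(noise_gain(3, d=2.0), 2.0 ** (-3 / 2.0))

# Gains for the first few scales with d = 2**(7/8), as used in the paper.
d = 2.0 ** (7.0 / 8.0)
print([round(noise_gain(m, d), 4) for m in range(5)])
```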

    Acknowledgments

The authors would like to thank Dr. P. W. Fieguth, Dr. M. Schneider, Dr. W. M. Wells III, and Dr. W. C. Karl for their contributions to this paper. This work was supported by AFOSR grant F49620-98-1-0349 and by subcontract GC123919NGD from Boston University under the AFOSR Multidisciplinary Research Program.

    References

1. M. Basseville, A. Benveniste, K. Chou, S. Golden, R. Nikoukhah, and A. Willsky, "Modeling and estimation of multiresolution stochastic processes," IEEE Trans. Inf. Theory 38(2), 766–784 (1992).
2. K. Chou, A. Willsky, and A. Benveniste, "Multiscale recursive estimation, data fusion, and regularization," IEEE Trans. Autom. Control 39(3), 464–478 (1994).
3. A. Tsai, J. Zhang, and A. Willsky, "Multiscale methods and mean field theory in EM procedures for image processing," presented at the Eighth IEEE DSP Workshop (1998).
4. A. Tsai, "Curve evolution and estimation-theoretic techniques for image processing, segmentation," PhD Thesis, MIT, Aug. 2000.
5. M. Luettgen, W. Karl, and A. Willsky, "Efficient multiscale regularization with applications to the computation of optical flow," IEEE Trans. Image Process. 3, 41–64 (1994).
6. D. Chandler, Introduction to Modern Statistical Mechanics, Oxford Univ. Press (1987).
7. A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Roy. Statist. Soc. B 39, 1–38 (1977).
8. J. Zhang, "The mean field theory in EM procedures for Markov random fields," IEEE Trans. Signal Process. 40(10), 2570–2583 (1992).
9. M. Daniel and A. Willsky, "A multiresolution methodology for signal-level fusion and data assimilation with applications to remote sensing," Proc. IEEE 85, 164–180 (1997).
10. P. Fieguth, W. Karl, and A. Willsky, "Efficient multiresolution counterparts to variational methods for surface reconstruction," Comput. Vis. Image Underst. 70, 157–176 (1998).
11. W. Irving, P. Fieguth, and A. Willsky, "An overlapping tree approach to multiscale stochastic modeling and estimation," IEEE Trans. Image Process. 6, 1517–1529 (1997).
12. T. Green and J. Shapiro, "Maximum-likelihood laser radar range profiling with the EM algorithm," Opt. Eng. 31(11), 2343–2354 (1992).
13. D. Greer, I. Fung, and J. Shapiro, "Maximum-likelihood multiresolution laser radar range imaging," IEEE Trans. Image Process. 6, 36–46 (1997).
14. S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984).
15. R. Szeliski, Bayesian Modeling of Uncertainty in Low-Level Vision, Kluwer Academic (1989).

Andy Tsai received his BS in electrical engineering from the University of California at San Diego in 1993, and his SM, EE, and PhD in electrical engineering from Massachusetts Institute of Technology in 1995, 1999, and 2000, respectively. He then continued his research as a postdoctoral research associate at the Laboratory for Information and Decision Systems at the Massachusetts Institute of Technology. Currently, he is a third-year medical student at Harvard Medical School. His research interests fall within the fields of computer vision and image processing.

Jun Zhang received his BS in electrical and computer engineering from Harbin Shipbuilding Engineering Institute, Harbin, China, in 1982 and was admitted to the graduate program of the Radio Electronic Department of Tsinghua University. After a brief stay at Tsinghua, he came to the USA for graduate study on a scholarship from the Li Foundation, Glen Cove, New York. He received his MS and PhD, both in electrical engineering, from Rensselaer Polytechnic Institute in 1985 and 1988, respectively. He joined the faculty of the Department of Electrical Engineering and Computer Science at the University of Wisconsin-Milwaukee and currently is a professor. His research interests include image processing and computer vision, signal processing, and digital communications. He has been an associate editor of IEEE Trans. Image Process.


Alan S. Willsky received the SB degree and the PhD degree from Massachusetts Institute of Technology in 1969 and 1973, respectively. He joined the MIT faculty in 1973, and his present position is as the Edwin S. Webster Professor of Electrical Engineering. From 1974 to 1981 Dr. Willsky served as assistant director of the MIT Laboratory for Information and Decision Systems. He is also a founder and member of the board of directors of Alphatech, Inc., and a member of the U.S. Air Force Scientific Advisory Board. In 1975 he received the Donald P. Eckman Award from the American Automatic Control Council. Dr. Willsky has held visiting positions at Imperial College, London; l'Université de Paris-Sud; and the Institut de Recherche en Informatique et Systèmes Aléatoires in Rennes, France. He was program chairman for the 17th IEEE Conference on Decision and Control; has been an associate editor of several journals, including IEEE Transactions on Automatic Control; has served as a member of the Board of Governors and Vice President for Technical Affairs of the IEEE Control Systems Society; was program chairman for the 1981 Bilateral Seminar on Control Systems held in the People's Republic of China; was special guest editor of the 1992 special issue of the IEEE Transactions on Information Theory on wavelet transforms and multiresolution signal analysis; and served as co-chair of the 1998 ONR Working Group on the Role of Probability and Statistics in Command and Control. In 1988 he was made a Distinguished Member of the IEEE Control Systems Society. Dr. Willsky has given several plenary lectures at major scientific meetings, including the 20th IEEE Conference on Decision and Control; the 1991 IEEE International Conference on Systems Engineering; the SIAM Conference on Applied Linear Algebra, 1991; the 1992 Inaugural Workshop for the National Centre for Robust and Adaptive Systems, Canberra, Australia; the 1993 IEEE Symposium on Image and Multidimensional Signal Processing, Cannes, France; and the 1997 SPIE Wavelets and Applications Symposium, San Diego. Dr. Willsky is the author of the research monograph Digital Signal Processing and Control and Estimation Theory and is coauthor of the undergraduate text Signals and Systems. He was awarded the 1979 Alfred Noble Prize by the ASCE and the 1980 Browder J. Thompson Memorial Prize Award by the IEEE for a paper excerpted from his monograph. Dr. Willsky's present research interests are in problems involving multidimensional and multiresolution estimation and imaging, statistical image and signal processing, data fusion and estimation of complex systems, image reconstruction, and computer vision.
