IEICE TRANS. INF. & SYST., VOL.E82–D, NO.3, MARCH 1999

INVITED SURVEY PAPER Surveys on Image Processing Technologies–Algorithms, Sensors and Applications–

Optimization Approaches in Computer Vision and Image Processing

Katsuhiko SAKAUE†, Akira AMANO††, and Naokazu YOKOYA†††, Members

SUMMARY In this paper, the authors present general views of computer vision and image processing based on optimization. Relaxation and regularization in both broad and narrow senses are used in various fields and problems of computer vision and image processing, and they are currently being combined with general-purpose optimization algorithms. The principle and case examples of relaxation and regularization are discussed; the application of optimization to shape description, a particularly important problem in the field, is described; and the use of a genetic algorithm (GA) as a method of optimization is introduced.
key words: computer vision, image processing, optimization, relaxation, regularization, snakes, genetic algorithm

1. Introduction

The field of applied research in computer vision and image processing (hereafter abbreviated as CVIP) is expanding from industrial applications under controlled lighting to vision systems that people use in their everyday environment. What such an expansion requires is a robust method that is resistant to noise and environmental changes. The application of optimization may provide a breakthrough for CVIP; specifically, it is an approach that finds the most probable solution from images.

Conventionally, CVIP problems such as the extraction of curved lines have been solved by fitting appropriate functions, such as polynomials and spline functions, whose parameters are determined by the least squares method or some other means. The method of Otsu [1], presented in 1979, formulates optimal function-fitting problems that include the simplicity of the function to be fitted: the squared norm of the second derivative of the function is used as the "appropriateness" of the fitted function. This method is considered a great step forward from the previous stage of problem setting, where the functions were selected by humans and only the parameter values were determined numerically.

Manuscript received January 2, 1999.
†The author is with Electrotechnical Laboratory, Tsukuba-shi, 305–8568 Japan.
††The author is with Hiroshima City University, Hiroshima-shi, 731–3194 Japan.
†††The author is with the Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma-shi, 630–0101 Japan.

A similar means of problem setting was employed in other fields of engineering at that time [2]. The method of Isomichi [3], [4], presented in the same period, formulates the fitting of more general functions of one-dimensional signals to quantized empirical data. This method defines the problem explicitly: the solution is the function that minimizes the sum of the error between the fitting function and the observed data and a term that evaluates the appropriateness of the fitting function.

Methods that minimize a linear sum of indices with different dimensions in this manner are widely used in the processing of pattern data, where input and output are ambiguous. However, CVIP problems are often incompatible with computer calculation, since the function to be fitted cannot be obtained analytically or is too complicated. Thus, attempts have been made in the field of CVIP to obtain quasi-optimal solutions, which are coherent overall, in the framework of relaxation.

For the numerical analysis of large-scale simultaneous equations, a series of methods called iteration methods or relaxation methods is widely used. With these methods, an approximate solution obtained at the start is corrected by iterated calculations until it converges to the optimal solution in the final stage. Although the methodology of relaxation has not been defined explicitly, its concept is applied to various problems. The definition of relaxation most widely used in the field of CVIP is "the general name of methods with which a local constraint is transmitted to the totality by the iteration of parallel operations to minimize (or maximize) a certain index defined on the totality."

Examples of the application of relaxation in a broad sense include Waltz's filtering [5] and the relaxation labeling [6] of Rosenfeld et al., both of which were presented in the 1970s. Thereafter, attempts were made to solve simultaneous equations by the (original) relaxation method in fields of computer vision dealing with early vision. In all of these attempts, the simultaneous equations are solved in a calculus-of-variations framework by formulating early vision as an optimization problem. This trend has become more pronounced since Poggio et al. [7] explained early vision by regularization theory and proposed an integrated framework.

In the field of neural computing, a neural network called the Boltzmann machine, which is based on an analogy to statistical mechanics, was proven effective as an optimization machine for solving problems based on certain principles of energy minimization [8], [9]. Expanding on this approach, Geman et al. [10] established stochastic relaxation for application to image restoration, which greatly affected subsequent research in this field.

Relaxation and regularization in both broad and narrow senses are used in various fields and problems of CVIP, and they are currently being combined with general-purpose optimization algorithms. In this paper, the principle and case examples of relaxation and regularization are discussed in Sect. 2; the application of optimization to shape description, a particularly important problem in CVIP, is described in Sect. 3; and the use of a genetic algorithm (GA) as a method of optimization is introduced in Sect. 4.

2. Utilization of Relaxation and Regularization in CVIP

2.1 Relaxation and Regularization

2.1.1 Deterministic Relaxation

Relaxation as a means for the numerical solution of simultaneous equations includes, for example, the Gauss-Seidel iteration method, the Jacobi method, and the Successive Over-Relaxation (SOR) method, all of which are used broadly in various fields. In CVIP, relaxation is used as a means of solving a calculus-of-variations problem, i.e., the extremal problem of a functional that represents a certain index (e.g., a kind of energy) to be minimized (or maximized) over the entire image. Such a problem can be solved using either the direct method or the Euler equation. When the Euler equation, which is a necessary condition, is used, the task becomes one of solving the simultaneous equations obtained by discretizing this partial differential equation per pixel (or datum point). Since the coefficient matrix of such simultaneous equations is often large and sparse, relaxation is generally used as a method for numerically solving CVIP problems. In this paper, the framework of "calculus of variations + simultaneous equations + relaxation" in the field of CVIP is called "deterministic relaxation."

Deterministic relaxation is described below using a concrete example (Fig. 1). Consider the problem of interpolating sparse elevation data obtained by observation into a curved surface (e.g., a ground surface) based on the principle of minimum curvature [11]. Here, the curved surface u(x, y) to be obtained is regarded as a thin elastic plate fixed at the observed datum points, and interpolation is conducted by obtaining the curved surface that minimizes the energy of deflection expressed by the formula below.

Fig. 1 Surface interpolation.

C(u) = ∫∫_Ω ( ∂²u/∂x² + ∂²u/∂y² )² dx dy    (1)

This is an extremal problem of a functional with the observed data as the boundary condition, and is thus a variational problem. The Euler equation for formula (1) is the biharmonic equation shown below.

∂⁴u/∂x⁴ + 2 ∂⁴u/∂x²∂y² + ∂⁴u/∂y⁴ = 0    (2)

Incidentally, the Laplacian c_{i,j} = ∂²u/∂x² + ∂²u/∂y² at the grid point (i, j) can be expressed discretely, with grid spacing h, as follows:

c_{i,j} = ( u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1} − 4u_{i,j} ) / h²    (3)

Thus, when formula (1) is replaced by a summation over all grid points, the following formula holds.

C = Σ_{j=1}^{n} Σ_{i=1}^{m} (c_{i,j})²    (4)

When C is the minimum, the following is obtained.

∂C/∂u_{i,j} = 0    (i = 1, · · · , m; j = 1, · · · , n)    (5)

From formulas (4) and (5), C is a quadratic form in u_{i,j}; thus, formula (5) is a system of linear simultaneous equations with m × n unknowns. Furthermore, since u_{i,j} appears in C only in the terms c_{i,j}, c_{i+1,j}, c_{i−1,j}, c_{i,j+1}, and c_{i,j−1}, formula (5) can be rewritten as follows.

u_{i+2,j} + u_{i,j+2} + u_{i−2,j} + u_{i,j−2} + 2( u_{i+1,j+1} + u_{i−1,j+1} + u_{i+1,j−1} + u_{i−1,j−1} ) − 8( u_{i+1,j} + u_{i−1,j} + u_{i,j−1} + u_{i,j+1} ) + 20u_{i,j} = 0    (6)

Briggs [11] rearranged the formula as shown below and solved it by the Jacobi method.

u_{i,j}^{(k+1)} = (1/20) [ 8( u_{i+1,j}^{(k)} + u_{i−1,j}^{(k)} + u_{i,j−1}^{(k)} + u_{i,j+1}^{(k)} ) − 2( u_{i+1,j+1}^{(k)} + u_{i−1,j+1}^{(k)} + u_{i+1,j−1}^{(k)} + u_{i−1,j−1}^{(k)} ) − ( u_{i+2,j}^{(k)} + u_{i,j+2}^{(k)} + u_{i−2,j}^{(k)} + u_{i,j−2}^{(k)} ) ]    (7)

Here, the values u_{i,j} at the observation points are held fixed. The initial values are obtained by linear interpolation from the observation points. Another method of solution is employed in the first half of Ref. [12]. With both of these methods, a solution is obtained by iterating local parallel operations. Formula (7) is in fact a discrete expression of the Euler equation in formula (2); in effect, the Euler equation is solved with the empirical data as the boundary condition.
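As a concrete illustration, the following is a minimal sketch of the Jacobi update in formula (7), assuming NumPy arrays; the edge padding, iteration count, and function name are simplifications introduced here, not part of Briggs' original scheme.

```python
import numpy as np

def jacobi_min_curvature(u0, fixed, n_iter=500):
    """Iterate the update of formula (7) on a regular grid.

    u0    : 2D array, initial surface (e.g., linear interpolation of data)
    fixed : 2D bool array, True where an observation pins the value
    """
    u = u0.copy()
    for _ in range(n_iter):
        p = np.pad(u, 2, mode="edge")  # crude boundary treatment
        new = (8.0 * (p[3:-1, 2:-2] + p[1:-3, 2:-2]
                      + p[2:-2, 3:-1] + p[2:-2, 1:-3])
               - 2.0 * (p[3:-1, 3:-1] + p[1:-3, 3:-1]
                        + p[3:-1, 1:-3] + p[1:-3, 1:-3])
               - (p[4:, 2:-2] + p[:-4, 2:-2]
                  + p[2:-2, 4:] + p[2:-2, :-4])) / 20.0
        u = np.where(fixed, u, new)    # observation points stay fixed
    return u
```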

2.1.2 Stochastic Relaxation

When the energy function is not of quadratic form, the problem becomes one of non-linear simultaneous equations. Although a formula for iterative operations can be derived in this case, convergence to the optimal solution that minimizes the energy is not guaranteed for iterations starting from an arbitrary initial value when several solutions exist (i.e., when the original energy function is non-convex and has local minima). The same applies to the process of energy minimization by the Hopfield-type neural network. Thus, means of selecting an appropriate initial value and of avoiding local minima need to be contrived. To counter such non-convex problems, a multiple-scale method [13] is employed in the framework of deterministic relaxation, with which processing proceeds gradually from coarse scales to fine scales.

On the other hand, simulated annealing [14], [15] uses analogies from statistical thermodynamics to search for the global minimum of non-convex energy functions. This algorithm was proposed as a discrete combinatorial optimization method; it makes the state of a system undergo stochastic transitions by establishing a stochastic process in which a state x (representing a combination of labels, i.e., the state of the system) with energy E(x) appears with the probability expressed by formula (8).

p(x) = (1/Z) exp( −E(x)/T )    (8)

Here, Z is a normalization constant, while T is a parameter corresponding to the "temperature" in statistical mechanics. With simulated annealing, the system converges with probability 1 to a minimum-energy state when the initial value of the temperature parameter T is sufficiently high and T → 0 through a continuous decrease of the temperature.

The basic steps of annealing based on the Metropolis stochastic process, which is often used, are as follows (a code sketch is given after the list):

1. A sufficiently high initial temperature T₀ is determined, as is the initial state of the system. This state is denoted x (its energy is E(x)).

2. One degree of freedom of the system is selected at random; its value is varied at random; and this tentative state is denoted x′ (its energy is E(x′)).

3. The state is shifted from x to x′ with probability min{1, exp(−(E(x′) − E(x))/T_k)}, and the new state is denoted x.

4. Steps 2 and 3 are iterated until the equilibrium state is reached.

5. Steps 2 to 4 are iterated while gradually decreasing the temperature (T_k → T_{k+1}) until the value of the energy becomes almost invariant in the equilibrium state.
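A minimal sketch of these steps, assuming binary labels and a user-supplied local energy difference; the function names, the geometric cooling schedule, and the fixed sweep count are illustrative assumptions, not part of the original formulation.

```python
import math
import random

def metropolis_anneal(labels, local_delta_e, t0=10.0, alpha=0.95, sweeps=100):
    """Metropolis annealing over a list of binary labels.

    local_delta_e(labels, i): energy change caused by flipping labels[i];
    locality (e.g., an MRF neighborhood) keeps this cheap to evaluate.
    """
    t = t0                                         # step 1: initial temperature
    for _ in range(sweeps):
        for _ in range(len(labels)):
            i = random.randrange(len(labels))      # step 2: pick one freedom
            d_e = local_delta_e(labels, i)
            # step 3: accept with probability min{1, exp(-dE/T)}
            if d_e <= 0 or random.random() < math.exp(-d_e / t):
                labels[i] ^= 1
        t *= alpha                                 # step 5: cool down
    return labels
```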

Many stages of CVIP require some kind of labeling operation on pixels, such as edge detection, segmentation, and stereo matching. When a Markov Random Field (MRF) can be hypothesized as the probabilistic model of the image, i.e., when the state of each pixel depends only on its neighborhood, the energy of the entire image is the sum of local energies. Accordingly, the energy variation of the totality, ∆E = E(x) − E(x′), which determines the state transition of the totality in Step 3 above, equals the energy variation in the neighborhood of the pixel in question; thus, an annealing process is realized by local parallel operations.

A CVIP method in which such state transitions are iterated stochastically by local parallel operations is called stochastic relaxation [10], [16]. Stochastic relaxation has hitherto been used for problems of image restoration, segmentation, binocular stereo, etc.

Geman et al. [10] applied stochastic relaxation to problems of image restoration. They prevented discontinuities at region boundaries from becoming excessively smooth by preparing a stochastic variable for the existence of a region boundary, called a line process, in addition to the stochastic variable for the gray levels. This contrivance has been used in other applied problems of CVIP [17]–[20].

Despite their capacity for dealing with various types of problems, stochastic methods have the disadvantage that a huge amount of computation is required to obtain solutions. It is thus desirable to use deterministic relaxation if possible. However, at present there are not many classes (types) of problems that can be solved by deterministic methods, and stochastic methods are the only alternative. Thus, massively parallel computers that can adequately handle such operations should be developed.

2.1.3 Regularization Method

In the field of CVIP, many problems of early vision are formulated as optimization problems [21]–[27].

Problems of early vision can be considered as problems of restoring descriptions of the original three-dimensional world from images created by projecting the three-dimensional world onto a two-dimensional one. However, such problems are often essentially indeterminate, because the three-dimensional data to be restored are contained in the two-dimensional image only in a degenerate form. Such problems are called "ill-posed problems," as described by Hadamard.

A well-posed problem satisfies the following conditions: (1) its solution exists; (2) its solution is unique; and (3) its solution depends continuously on the initial data. Should it fail to meet any of these conditions, it is called an ill-posed problem. A method available for giving a mathematically stable solution to an ill-posed problem is regularization [28]. Poggio et al. [7] proposed that the problems of early vision be discussed uniformly in terms of regularization theory.

Problems that can be solved by regularization are those whose space of possible solutions is much larger than the input. Specifically, regularization is applicable when infinitely many possible solutions exist, as in the problem of restoring a three-dimensional structure from image data projected two-dimensionally. Regularization is a method for obtaining a mathematically "stable" solution from such a solution space.

With standard regularization, when the observed data y result from a linear operation A on an unknown z, as expressed below,

y = Az    (9)

the inverse problem of estimating z from the data y is formulated as an optimization problem that minimizes the following objective function (called the energy):

E = ||Az − y||² + λ||Pz||²    (10)

The first term in formula (10), ||Az − y||², expresses the difference from the observation data and is called the penalty functional. The second term, ||Pz||², is called the stabilizing functional; it expresses general constraint conditions (e.g., smoothness) on solutions in the real world. λ is the regularization parameter, which determines the relative weights of these two functionals. This formulation is designed "to seek a solution that satisfies the constraint conditions fairly well with few contradictions to the observation data." Formula (10) can be minimized in the framework of the calculus of variations. With standard regularization, the norm is of quadratic form and P is a linear operator.
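In a finite-dimensional discretization, minimizing formula (10) reduces to a linear solve; the following is a minimal sketch, assuming dense NumPy matrices, with an illustrative 1D smoothing example whose operators are assumptions introduced here, not taken from the paper.

```python
import numpy as np

def regularized_solve(A, y, P, lam):
    """Minimize ||A z - y||^2 + lam ||P z||^2.

    Setting the gradient to zero gives the normal equations
    (A^T A + lam P^T P) z = A^T y.
    """
    lhs = A.T @ A + lam * (P.T @ P)
    return np.linalg.solve(lhs, A.T @ y)

# Example: smooth a noisy 1D signal with a second-difference stabilizer.
n = 100
A = np.eye(n)                        # observation operator: identity
P = np.diff(np.eye(n), n=2, axis=0)  # discrete second derivative
y = np.sin(np.linspace(0, 3, n)) + 0.1 * np.random.randn(n)
z = regularized_solve(A, y, P, lam=10.0)
```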

How to determine the regularization parameter λ is a matter of great concern. A method to estimate the regularization parameter by taking advantage of the errors in the observed values was proposed by Amano et al. [29], in which a function that minimizes the difference from a sample is used as the stabilizing function. Although some samples need to be prepared in advance, this method makes it possible to exploit the errors in the observed values by presenting several samples, and it estimates the regularization parameter based on these errors.

Several other methods have been proposed for estimating the regularization parameter, in which the parameter is estimated as a distributed constant based on the hypothesis that the errors between the observed and estimated values are dispersed uniformly [30]–[33]. Such methods are effective for processing the neighborhood of a boundary of discontinuity, particularly when dealing with a solution space that contains discontinuities.

2.1.4 Countermeasures for Discontinuity

Methods using a stabilizing functional may not be appropriate when the solution to be obtained is not a smooth function. To counter such problems, various studies have been conducted on methods that use quadratic-form functions as the base [1]–[4]. Other methods under study include those using spline functions [34], [35], GNC and other methods that realize a weak continuity constraint [34], [36]–[39], and methods using multi-valued functions [40]. Despite its great disadvantage of being time-consuming, the stochastic relaxation with the line process mentioned in Sect. 2.1.2 may also be effective, because it is applicable to problems that cannot be solved by deterministic methods.

Terzopoulos introduced a controlled continuity constraint [34], [41]. This constraint is expressed by a generalized spline of several degrees based on continuity control functions. To solve problems of 3D surface reconstruction, a functional expressing a controlled continuity constraint was adopted as follows:

S_{ρτ}(u) = (1/2) ∫∫_Ω ρ(x, y) [ τ(x, y)( u_xx² + 2u_xy² + u_yy² ) + (1 − τ(x, y))( u_x² + u_y² ) ] dx dy    (11)

Here, ρ(x, y) and τ(x, y) are continuity control functions (0 ≤ ρ ≤ 1, 0 ≤ τ ≤ 1). This functional is a weighted linear combination of the energy functional of a thin plate and that of a thin membrane, and the smoothness properties of the constraint can be controlled by adjusting the control functions at arbitrary points.

The basic control operations are as follows: i) ρ(x, y) and τ(x, y) are set to be non-zero at all continuous points (x, y); ii) the value of τ(x, y) is set close to 0 where a discontinuity of direction needs to be established; and iii) the value of ρ(x, y) is set close to 0 at discontinuous points of depth.

If the values of ρ(x, y) and τ(x, y) are known in advance, formula (11) is of quadratic form and can be solved in the framework of standard regularization. To detect the discontinuities in advance, so-called edge-detection methods can be used. During the operations of 3D surface reconstruction, however, Terzopoulos [41] detected the discontinuity of depth at points where the moment of deflection of the thin plate is 0, and the discontinuity of direction at its maximum points. A similar approach is described in Ref. [35]. If ρ(x, y) and τ(x, y) are incorporated into the calculus of variations as unknowns, the problem becomes quite troublesome because dual minimization processes appear.

Grimson et al. [42] proposed a method for estimating the discontinuities in problems of restoring a curved surface from discrete depth data. In this restoration method, the boundary of discontinuity is estimated by making the error distribution uniform, using the distribution of errors between the given depth data and the fitted curved surface. This concept is the same as the guideline for estimating the regularization parameter.

Based on the idea that partial discontinuity is plausible but should impose a penalty, Blake [43], [44] gave the name "weak constraint" to a constraint condition that minimizes the sum of the variations in continuous sections and the penalties in discontinuous sections. This problem is non-convex, and GNC (the graduated non-convexity algorithm) was proposed as a method to solve it. With this method, a sequence of functions falling between the original objective function F (non-convex) and its convex envelope F* (convex) is created, and locally optimal solutions are obtained sequentially, starting from F*, by hill climbing, so as to gradually approach the solution of F.
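The continuation idea can be sketched as follows. This is only a schematic illustration: a user-supplied convex surrogate stands in for the true convex envelope F*, and the linear blending schedule and gradient-descent inner loop are assumptions, not Blake and Zisserman's exact construction.

```python
import numpy as np

def graduated_descent(f_grad, fstar_grad, x0, stages=20, lr=0.01, iters=200):
    """Schematic graduated non-convexity: blend from a convex surrogate
    F* toward the non-convex objective F, descending from each previous
    minimizer so the solution is tracked across the sequence.
    """
    x = np.asarray(x0, dtype=float)
    for p in np.linspace(0.0, 1.0, stages):   # p = 0: F*, p = 1: F
        for _ in range(iters):
            g = (1.0 - p) * fstar_grad(x) + p * f_grad(x)
            x = x - lr * g
    return x
```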

Shizawa [40] succeeded in using two functions as a constraint by extending standard regularization to multi-valued functions. This method is applicable to the problem of restoring several overlapped surfaces, called transparency. Regularization that constrains a curved surface y = f(x) is expressed as the minimization problem of the following functional.

E^{(1)}[f] = Σ_{i=1}^{N} ( y^{(i)} − f(x^{(i)}) )² + λ||Sf(x)||²    (12)

When f₁(x) and f₂(x) are the constraint functions, the constraint condition can be expressed as follows:

(y − f₁(x))(y − f₂(x)) = y² − (f₁(x) + f₂(x))y + f₁(x)f₂(x) = 0    (13)

If F(x) = f₁(x)f₂(x) and G(x) = −(f₁(x) + f₂(x)), regularization with the two functions as constraint conditions becomes the minimization problem of the following functional.

E^{(2)}[F, G] = Σ_{i=1}^{N} ( F(x^{(i)}) + G(x^{(i)})y^{(i)} + (y^{(i)})² )² + λ_F||S_F F(x)||² + λ_G||S_G G(x)||²    (14)

The advantage of this method is that there is no need to explicitly state or estimate which of the two curved surfaces each empirical datum belongs to.

2.2 Examples of the Application of Regularization

2.2.1 Reconstruction of a Three-Dimensional Curved Surface

Here, a curved-surface reconstruction problem is posed [12], [41]: estimate the original curved surface u, given data d that result from applying a sampling operation S to u. This is identical to the problem of densely obtaining elevations from sparse data in a digital terrain map (DTM). Although the values of the empirical data were preserved in the interpolation problem introduced in Sect. 2.1.1, in actual operations a tolerance around the observed values must be allowed, as shown in Fig. 2, due to the inaccuracy of real empirical data. With m × n pixels, the so-called quadratic variation is often used as the smoothness constraint. As with formula (1), this is a thin-plate spline, which imposes continuity on the normal vector. Formula (10) can then be written as follows:

E = Σ_{j=1}^{n} Σ_{i=1}^{m} S_{i,j}( u_{i,j} − d_{i,j} )² + λ ∫∫_Ω [ ( ∂²u/∂x² )² + 2( ∂²u/∂x∂y )² + ( ∂²u/∂y² )² ] dx dy    (15)

To obtain a curved surface u_{i,j} (1 ≤ i ≤ m, 1 ≤ j ≤ n) that minimizes the above formula, the simultaneous equations with m × n unknowns shown below are solved by the calculus of variations:

∂E/∂u_{i,j} = 0    (1 ≤ i ≤ m, 1 ≤ j ≤ n)    (16)

If the partial derivatives in formula (15), such as ∂²u/∂x², are replaced by differences as follows (h is the grid spacing):

∂²u/∂x² = ( u_{i+1,j} − 2u_{i,j} + u_{i−1,j} ) / h²    (17)

then formula (16) becomes a system of linear simultaneous equations in the elevation values at all points of the image Ω, shown below, which can easily be solved [12].

Fig. 2 Surface reconstruction.

∂E/∂u_{i,j} = 2S_{i,j}( u_{i,j} − d_{i,j} ) + λ [ (25/h⁴) u_{i,j} − (8/h⁴)( u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1} ) + (3/(2h⁴))( u_{i+2,j} + u_{i−2,j} + u_{i,j+2} + u_{i,j−2} ) + (1/(4h⁴))( u_{i+2,j+2} + u_{i+2,j−2} + u_{i−2,j+2} + u_{i−2,j−2} ) ] = 0    (18)
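Formula (18) can be solved pointwise by a relaxation sweep: at each grid point the stationarity condition is linear in u_{i,j} given its neighbors. A minimal sketch, assuming NumPy arrays; the edge padding, fixed iteration count, and function name are simplifications introduced here.

```python
import numpy as np

def reconstruct_surface(d, S, lam, h=1.0, n_iter=2000):
    """Jacobi-style iteration on formula (18).

    d : observed elevations (2D array); S : 0/1 sampling mask.
    Each sweep solves 2S(u-d) + lam*(c0*u - c1*n4 + c2*a2 + c3*g2) = 0
    for u at every point, taking neighbor sums from the previous sweep.
    """
    u = d.copy()
    c0, c1 = 25.0 / h**4, 8.0 / h**4
    c2, c3 = 3.0 / (2.0 * h**4), 1.0 / (4.0 * h**4)
    for _ in range(n_iter):
        p = np.pad(u, 2, mode="edge")
        n4 = p[3:-1, 2:-2] + p[1:-3, 2:-2] + p[2:-2, 3:-1] + p[2:-2, 1:-3]
        a2 = p[4:, 2:-2] + p[:-4, 2:-2] + p[2:-2, 4:] + p[2:-2, :-4]
        g2 = p[4:, 4:] + p[4:, :-4] + p[:-4, 4:] + p[:-4, :-4]
        u = (2.0 * S * d + lam * (c1 * n4 - c2 * a2 - c3 * g2)) \
            / (2.0 * S + lam * c0)
    return u
```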

Although this is nothing more than obtaining a spline surface, the method has the great advantage that the outlook for adding other constraint conditions is quite favorable, because the energy minimization is explicitly expressed by a formula in the framework of regularization. Muraki et al. [45] obtained a favorable result (Fig. 3) by adding to formula (15) the constraint that the variation of elevation becomes 0 along the contour lines of a map, as expressed below:

∫∫_Ω [ (∂u/∂x)∆x(x, y) + (∂u/∂y)∆y(x, y) ]² dx dy    (19)

Here, it is assumed that contour lines are partially obtained from a map, and (∆x(x, y), ∆y(x, y)) is the direction cosine of the contour line at the point (x, y) in the image.

2.2.2 Binocular Stereo

Fig. 3 3D surface from contours [45]. Topographic map and reconstructed surface.

The standard stereo camera system consisting of cameras with parallel lines of sight is examined. With this system, points in three-dimensional space always appear on the same horizontal scanning line (epipolar line) in both the left and right images. It is assumed that the brightness of corresponding points is similar and that the depth of the target scene varies continuously. When the brightness of the left and right images is expressed as L(x, y) and R(x, y), respectively, the stereo matching problem, which obtains the point (x + d(x, y), y) in the left image corresponding to each point (x, y) in the right image, is formulated in the framework of regularization as the problem of obtaining a disparity function d(x, y) that minimizes the following functional.

E = ∫∫_Ω [ { L(x + d(x, y), y) − R(x, y) }² + λ{ d_x(x, y)² + d_y(x, y)² } ] dx dy    (20)

Since this functional is not of quadratic form in the unknown function d(x, y), the problem is how to avoid local minima. Two methods are available to counter this problem: a method using deterministic relaxation over multiple scales based on the Euler equation [46],

{ L(x + d(x, y), y) − R(x, y) } · L_x(x + d(x, y), y) − λ{ d_xx(x, y) + d_yy(x, y) } = 0    (21)

and a method using stochastic relaxation by annealing [20].
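To make the deterministic route concrete, the following is a minimal single-scale sketch that descends the energy (20) following the sign of the Euler equation (21); the nearest-neighbor warping, the step size, and the function name are assumptions for illustration, and the coarse-to-fine scheduling of [46] is omitted.

```python
import numpy as np

def stereo_gradient_descent(L, R, lam=100.0, tau=0.05, n_iter=500):
    """Single-scale gradient descent on the stereo energy (20)."""
    d = np.zeros_like(R)
    xs = np.arange(L.shape[1])[None, :].astype(float)
    Lx = np.gradient(L, axis=1)
    for _ in range(n_iter):
        # Sample L and L_x at the warped positions x + d(x, y)
        # (nearest-neighbor sampling for simplicity).
        xw = np.clip(np.rint(xs + d).astype(int), 0, L.shape[1] - 1)
        Lw = np.take_along_axis(L, xw, axis=1)
        Lxw = np.take_along_axis(Lx, xw, axis=1)
        lap = (np.roll(d, 1, 0) + np.roll(d, -1, 0)
               + np.roll(d, 1, 1) + np.roll(d, -1, 1) - 4.0 * d)
        d -= tau * ((Lw - R) * Lxw - lam * lap)  # descend the energy (20)
    return d
```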

When the discontinuity regularization proposed by Terzopoulos [41] is applied naively to a highly textured image, boundaries of discontinuity are detected too easily, as in Refs. [47] and [48]. Boult et al. [49] proposed a method that roughly fits a curved surface to the extracted characteristic points by function fitting and interpolates the intervals between them. The scale of the function to be fitted here is discussed in Ref. [50].

2.2.3 Optical Flow

When the apparent velocity field in the image is U, its relationship with the image gray level I is expressed as follows:

I_t + Uᵀ∇I = 0    (22)

In optical flow estimation by optimization, the solution is the U that minimizes the sum of the above formula and a function expressing the smoothness of U. This approach is called the gradient-based method [21].
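A minimal Horn-Schunck-style sketch of the gradient-based method [21], assuming two NumPy frames; the shift-based neighborhood averaging, the simple temporal difference, and the parameter values are illustrative assumptions.

```python
import numpy as np

def gradient_based_flow(I1, I2, alpha=10.0, n_iter=200):
    """Estimate flow (u, v) minimizing the brightness-constraint error
    I_t + u I_x + v I_y = 0 plus a smoothness term weighted by alpha^2.
    """
    Ix = np.gradient(I1, axis=1)
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(n_iter):
        # 4-neighbor averages via shifts (image borders handled crudely).
        ub = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
              + np.roll(u, 1, 1) + np.roll(u, -1, 1)) / 4.0
        vb = (np.roll(v, 1, 0) + np.roll(v, -1, 0)
              + np.roll(v, 1, 1) + np.roll(v, -1, 1)) / 4.0
        c = (Ix * ub + Iy * vb + It) / (alpha**2 + Ix**2 + Iy**2)
        u = ub - Ix * c
        v = vb - Iy * c
    return u, v
```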

Regarding stabilizing functionals whose terms are quadratic in either the first or second derivatives of brightness, in optical flow estimation problems based on the gradient-based method, Snyder [51], [52] showed that the functionals satisfying two physical conditions, i.e., 1) rotation invariance and 2) independence of the two components of the optical flow, are expressed by a linear combination of two basic functions, ∇I∇ᵀI and ∇∇ᵀI.

Research has also been conducted on problems dealing with optical flow of discontinuous distribution. Black [53] proposed a method to successively estimate the discontinuities simultaneously with the flow itself by introducing a process similar to the line process. There has also been an attempt to formulate, in the line process, the shape of the region where the optical flow changes continuously, using the shape and length of the boundary of discontinuity [54].

Considering that the optical flow in general scenes, such as outdoor views, is piecewise continuous, Schnorr fitted piecewise smooth functions by dividing the regions where the distribution of the initially obtained optical flow is continuous [55], [56].

2.2.4 Segmentation

Geiger et al. [57] proposed a framework for dealing with segmentation problems uniformly when the stochastic properties of the discontinuities are identified. They proved that methods dealing with region boundaries by means of the line process have generality, and adopted a single-line process. This is more advantageous than using line processes along both the x-axis and the y-axis, since the single-line process removes both the anisotropy and the tendency of the boundary of discontinuity to follow the axes. They then adopted an energy function based on the Markov field as the penalty function.

Boult et al. [58] proposed a method that fits a function describing the data to sets of pixels that are allowed to overlap, without conducting segmentation explicitly. This is based on the observation that, with conventional segmentation methods, estimating the region boundary and estimating the data within the region inevitably become a chicken-and-egg problem. This method shows that segmentation essentially reduces to a function-fitting problem.

2.2.5 Edge Detection

Chen et al. [59] conducted edge detection by regularization using the cubic B-spline. With this method, problems are solved as one-dimensional regularization problems by taking the projection onto a given line in a small region. Gokmen et al. [60], on the other hand, conducted a formulation similar to the line process by using a distributed constant as the regularization parameter. They also employed a method of successively estimating the regularization parameters so that the distribution of errors between the observed and estimated values becomes uniform.

3. Extraction of Shape Descriptions by Optimization

3.1 Parametric Model Fitting

Operations that fit functions to sets of datum points are conducted in various stages of CVIP, such as the extraction of target objects from gray-scale images and/or depth images. Basically, the fitting error between the datum points and the function is defined, and then the function parameters that minimize the error are determined. The simplest example is the least squares method for fitting a line to two-dimensional datum points.

Flexible shape models, such as superquadrics, which are created by extending quadrics, and blobby models have recently been used as shape-expression models for computer graphics and object recognition. Several attempts to automatically fit such function models to depth images obtained by a laser range finder have already been reported [61], [62]. When the functions are complicated, additional constraint conditions require some contrivance, because the fitting error often has numerous local minima.

3.2 Fitting of Non-parametric Models

3.2.1 Snakes

One approach formulates shape extraction from images as an analogy to an energy minimization problem in dynamics. The behavior (deformation) of a deformable shape model is expressed as an energy that is a linear combination of the tendency of the model itself and the constraints from outside, and the target object is extracted from the image by finding a stable state at a local minimum of the energy.

Kass et al. [63] proposed a method called the active contour model ("Snakes"), with which contour lines are detected by dynamically moving the contour lines themselves. This is a contour extraction method that takes advantage of the global properties of contour lines, and it is still widely used [64]–[66].

Snakes is formulated as the minimization problem of an energy function that is a linear sum of the image energy and the internal energy, defined on a contour v(s) (= (x(s), y(s)); 0 ≤ s ≤ 1) in the image.

E_snakes(v(s)) = ∫₀¹ { E_int(v(s)) + w_image E_image(v(s)) } ds    (23)

Here, E_int is the internal energy expressing the smoothness of the contour, also called the shape energy, for which the sum of energies concerning the first derivative v_s(s) and the second derivative v_ss(s) of the contour v(s) is often used.

E_int(v(s)) = (1/2){ α|v_s(s)|² + β|v_ss(s)|² }    (24)

α and β are weight coefficients. The value of E_int decreases as the contour becomes rounder and shorter. E_image is the image energy, which is expressed by the first derivative of the image gray level I(v(s)) as follows:

E_image(v(s)) = −|∇I(v(s))|²    (25)

The value of this function decreases as the sum of the first derivatives of the image gray level at the pixels on the contour increases. Accordingly, Snakes converges to a smooth contour with large gray-scale variations in the image. In general, an initial contour is given by some means, and operations for obtaining a local minimum are conducted starting from this initial value by, for example, the steepest descent method.
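A minimal greedy-style sketch of such a discretized minimization of energies (23)-(25) (in the spirit of the greedy algorithm discussed later in Sect. 3.2.2): each control point moves to the 8-neighbor that lowers its local energy. The finite-difference discretization of the derivatives and all names here are illustrative assumptions.

```python
import numpy as np

def greedy_snake(points, grad_mag2, alpha=1.0, beta=1.0, w_image=2.0,
                 n_iter=100):
    """Greedy minimization of a discretized snake energy.

    points    : (N, 2) int array of (row, col) contour points (closed)
    grad_mag2 : |grad I|^2 image, so E_image = -grad_mag2 at a pixel
    """
    h, w = grad_mag2.shape
    moves = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    n = len(points)
    for _ in range(n_iter):
        for i in range(n):
            prev_p, next_p = points[i - 1], points[(i + 1) % n]
            best, best_e = points[i], np.inf
            for dy, dx in moves:
                p = points[i] + np.array([dy, dx])
                if not (0 <= p[0] < h and 0 <= p[1] < w):
                    continue
                e_int = (alpha * np.sum((next_p - p) ** 2)           # |v_s|^2
                         + beta * np.sum((prev_p - 2 * p + next_p) ** 2))
                e = 0.5 * e_int - w_image * grad_mag2[p[0], p[1]]
                if e < best_e:
                    best, best_e = p, e
            points[i] = best
    return points
```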

Aggressive attempts are being made to extend this framework, including the "Active Net," a model extended to a two-dimensional net [67], and the split-and-merge type Snakes [68] (Fig. 4).

3.2.2 Improvement of Snakes

Although Snakes can provide satisfactory results when the quality of the target images is good, it has the following problems:

• The solution varies depending on the position of the initial contour.
• Introduction of a priori knowledge is difficult.
• Operations are time-consuming.
• The solution varies depending on the parameters α and β.
• It cannot follow variations of the topology.

Several approaches have been taken to counter these problems.

(1) Analysis of the solution space

With Snakes and other methods using optimization operations, a local minimum is usually obtained by iterative operations starting from an appropriate initial value. In this case, the monotonicity of the solution space is important. When the solution space has complicated non-linearity, the local minimum obtained by calculation depends greatly on the initial value.

To counter this problem, Davatzikos et al. [69] identified by analysis what properties the functions used for minimization should have in order to be convex. This method is currently not practical, since it requires the values and partial derivatives of the minimized functions at all positions. However, continued research on this approach should clarify the scope of applicable problems.

Fig. 4 Split-and-merge contour model [68]. Tracking of temporarily overlapping persons; (a) initialization at 1st frame, (b) result at 1st frame, (c) 19th, (d) 32nd, (e) 35th, (f) 43rd, (g) 47th and (h) 60th frames.

(2) Introduction of a priori knowledge

It is often the case that the shape of the target object is known a priori when detecting its contour in a given closed region of the image. Based on such a priori knowledge, research has been conducted to develop stable contour detection methods that are resistant to noise.

Many of the methods that introduce a priori knowledge of shape use the curvature at the respective points on the contour, and assume that the shape energy is minimized when the contour's curvature becomes identical to that of the model [29], [70] (Fig. 5). The problem with such methods is how to guarantee rotation invariance of the contour shapes.

Other relevant methods include a method of high-speed and stable contour detection using a physical model [71], a method that describes the shape according to grammar rules and detects contours agreeing with the grammar [72], and a method that detects contours following variations of the topology [73].

Fig. 5 Sample contour model (M-Snake [29]). MRI image sequence and a sample contour; contour tracking by the conventional Snake; contour tracking by M-Snake.

(3) High-speed calculation methods

Snakes is extremely costly if its minimum-search problem in the multidimensional space is calculated without modification.

Amini et al. [74] proposed a method using dynamic programming [75] to calculate the discretized energy functions of Snakes at high speed. With this method, the points to be moved are determined sequentially by dynamic programming, based on the variations of the energy function when the points constituting the contour move to their 8-neighbors.

Williams et al. [76] proposed the greedy algorithm, which is faster still. As with the method of Amini et al., this method determines the movements of the movable points constituting the contour based on the decrease of the energy function. At present, this method is widely used for high-speed calculation.

(4) Snakes based on the geometric model

Snakes based on the geometric model began to attract attention after being proposed independently by Caselles [77] and Malladi [78]. Unlike conventional Snakes, which modifies contours by minimizing an energy function defined by the contour shape and the image, Snakes based on the geometric model detects contours by obtaining the moving velocity of the contours based on a spatial measure defined using the image.

This method is characterized by the fact that the calculations are not hindered by topological variations, such as splitting into several objects, which may occur in the course of convergence, since the calculations themselves do not depend on the topology. The method has been extended to three dimensions [79], to minimum-value calculations [80], and to multi-dimensional images [81].

4. Utilization of the Genetic Algorithm

As mentioned previously, optimization by simple iterative numerical solution often falls into local minima. To avoid this problem, attempts have been made to adopt the coarse-to-fine strategy based on multiple scales [13] and to apply simulated annealing [15].

However, the search space cannot be expanded indefinitely by these methods. To deal with CVIP problems in the real world, it may be necessary to incorporate a method capable of searching a wider space.

A candidate for such a search method is the genetic algorithm (GA) [82], which is attracting attention as a new method for CVIP. The GA is an algorithm for learning, adaptation, and optimum search, realized by simulating the mechanism of biological evolution. One field aims to create artificial life by focusing on learning and adaptation; on the other hand, there is the view that GA is simply one of several optimum search methods.

4.1 What is the Genetic Algorithm?

With GA, the respective solutions are initially regarded as "individuals" constituting a "population," which is a set of solutions; each solution is coded as a "chromosome." A new solution is then created by the "cross-over" of two arbitrary solutions. Depending on the situation, a solution may be renewed by "mutation." Iterative "selection" between the new and old solutions is then conducted based on their "fitness." Specifically, this method makes the entire population "evolve" toward the optimum solution in the final stage by eliminating individuals with low fitness through selection. Cross-over, mutation, selection, etc., are called genetic operations, and they are generally conducted stochastically based on random numbers.

The advantage of GA is its capability of finding the optimum solution in a wide search space, since the solutions are treated as a population and the search proceeds in parallel while maintaining the diversity of many initial solutions. Initial solutions are generated first, followed by various intermediate solutions. Analogically, it is as if many people start to climb a mountain from various points; children born by cross-over also start climbing from unexpected places; and after such a process is repeated, somebody eventually reaches the mountain top. This property distinguishes GA from simulated annealing, which is famous as a global optimum search method.
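A minimal generational GA over bit-string chromosomes, to make the genetic operations concrete; the tournament selection, one-point cross-over, and parameter values are common textbook choices assumed here, not prescriptions from the surveyed work.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=50, generations=200,
                      p_cross=0.8, p_mut=0.01):
    """Evolve bit-string chromosomes toward high fitness."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # Selection: the fitter of two random individuals, twice.
            a, b = random.sample(range(pop_size), 2)
            p1 = pop[a] if scores[a] >= scores[b] else pop[b]
            a, b = random.sample(range(pop_size), 2)
            p2 = pop[a] if scores[a] >= scores[b] else pop[b]
            child = p1[:]
            if random.random() < p_cross:          # one-point cross-over
                cut = random.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            for i in range(n_bits):                # mutation
                if random.random() < p_mut:
                    child[i] ^= 1
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)
```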

4.2 Examples of Applying GA to CVIP

4.2.1 Automatic Generation of the Image-Processing Module Sequence

The first example of applying GA to CVIP may be the automatic generation of an image-processing module sequence, an optimum search problem that cannot be solved numerically. The use of GA was conceived from the observations that images can be processed to a fairly satisfactory level by combining limited numbers and types of image-processing modules, that constraints on data types apply when connecting the modules, and that replacing and recombining parts of a sequence can intuitively be understood as genetic operations.

Shortly after 1975, when Holland of the University of Michigan proposed GA, the Environmental Research Institute of Michigan (ERIM) reported an attempt to use GA to determine the contents of processing in the respective stages of the Cytocomputer (a pipeline image processor) that was under development at the institute at the time [83]. This is a search problem over morphological operation sequences for extracting target shapes. With random search, about 1.3 × 10¹¹ trials would be needed; it was reported, however, that a proper program was discovered by GA after approximately 5,000 trials.

References [84] and [85] report extensions of the image-processing modules to general gray-scale image processing. Other relevant attempts include narrowing the search space by introducing hierarchy [86].

4.2.2 Search in the Parameter Space

If the target object to be detected in an image can be expressed by parameters, it can be extracted by searching the parameter space. For instance, if a straight line is expressed as ρ = x cos θ + y sin θ, detection of the straight line reduces to a search for the parameters (ρ, θ). Although the Hough transform is commonly used, the framework of GA can also be applied if the parameters (ρ, θ) are coded as chromosomes. Reference [87] reports line detection by GA using (ρ, θ). The fitness is evaluated based on the number of points on the straight line expressed by the parameters.
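Such a fitness function might look as follows; this sketch, with its tolerance parameter and edge-point representation, is an assumption for illustration rather than the exact evaluation of [87].

```python
import numpy as np

def line_fitness(rho, theta, edge_points, tol=1.0):
    """Count edge points lying within tol of the line
    rho = x cos(theta) + y sin(theta).

    edge_points : (N, 2) array of (x, y) edge coordinates.
    """
    x, y = edge_points[:, 0], edge_points[:, 1]
    dist = np.abs(x * np.cos(theta) + y * np.sin(theta) - rho)
    return int(np.sum(dist < tol))
```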

In the pattern matching of binary shapes [88], GA is applied to the problem of finding, in the image, a shape identical to a given model figure. With this method, the chromosomes express classes of translation, rotation, and scale factors; the model is geometrically transformed by the parameters; and the fitness is evaluated by whether or not the same shape is found at the position resulting from the transformation.

Superquadrics is attracting attention as a model that expresses a three-dimensional shape with a comparatively small number of parameters. To fit this model to a 3D depth image, it suffices to solve a problem that minimizes the fitting errors of the depth and the normal vector between the observed data and the model. However, a strategy for obtaining a globally minimum solution needs to be combined with it, since the commonly used deterministic non-linear least squares method yields only a local minimum. Interestingly enough, favorable results have been obtained [87] using the same data that were used when applying simulated annealing [61], as discussed in Sect. 3.1.

4.2.3 Shape Model Fitting

With shape model fitting, a model expressing the shape of a target object is fitted to features in an image to recognize the position of the target object, and GA can be used as the search strategy. With this approach, some information regarding the positions of image features in real space is expressed as chromosomes, and the coding is devised for each problem, unlike the method of Sect. 4.2.2, which treats classes of parameters as chromosomes.

In the extraction of a straight line [89], for example, the respective pixels have chromosomes consisting of the directions and coordinates of edges. Pixels that inherit the traits of their parents are generated at points internally dividing the parents (two edge points) by the cross-over of neighboring pixels. In the primitive extraction of shapes [90], chromosomes are constructed from the coordinate values of three points in the case of circle detection. The idea is that a more global circle can be detected by crossing over two circles, one made from three points in the image and the other from three different points. In the polygonal approximation of a closed curve [91], whether or not each of the points dividing the closed curve into N equal sections is a vertex of the approximating polygon is expressed by 1 or 0, respectively, and coded as an N-bit chromosome.

4.2.4 Image Reconstruction

Several attempts have been made to use GA as the optimization method in applications of regularization to CVIP.

In the stereo matching of binocular stereoscopic vision, the disparity distribution for which the gray levels of corresponding points on the left and right become identical is obtained. Here, the two-dimensional active contour model Active Net [67], which is capable of local optimization, is combined with GA (Fig. 6) [92]. In the three-dimensional shape reconstruction from gray-scale images [93], GA is used to minimize the error between the shadow image of the estimated three-dimensional shape, under the given lighting and picturing conditions, and the shadow image obtained by observation.

Fig. 6 Stereo matching using GA and Active Net [92]. Left and right images, iterations 0, 10, 20, 40, 80, 120.

In both cases, the estimated two-dimensional data (image, disparity distribution, or depth distribution) are expressed as chromosomes without coding, and cross-over is conducted by reshuffling two-dimensional regions. The problem can be solved as long as the two-dimensional data generated by the genetic operations satisfy the constraint conditions particular to the problem and can be evaluated quantitatively. In other words, the reconstruction is solved as a forward problem, not as an inverse problem. This may be the greatest advantage of using GA for image reconstruction.
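Cross-over by two-dimensional region reshuffling can be sketched as a swap of a random rectangular block between two array-valued chromosomes; the rectangle sampling below is one illustrative choice, not the exact operator of [92], [93].

```python
import numpy as np

def region_crossover(a, b, rng=np.random):
    """Swap a random rectangular block between two 2D chromosomes
    (e.g., disparity or depth maps), producing two children.
    """
    h, w = a.shape
    y0, x0 = rng.randint(0, h), rng.randint(0, w)
    y1, x1 = rng.randint(y0 + 1, h + 1), rng.randint(x0 + 1, w + 1)
    c1, c2 = a.copy(), b.copy()
    c1[y0:y1, x0:x1], c2[y0:y1, x0:x1] = b[y0:y1, x0:x1], a[y0:y1, x0:x1]
    return c1, c2
```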

5. Conclusion

In this paper, the authors presented general views of image processing and computer vision based on optimization. It should be noted that diverse approaches, too numerous to cover in this paper, have been taken. Nonetheless, a big question remains to be answered: why can the visual function of "finding the most probable solution from an image" be performed by humans quickly, accurately, and naturally, yet is so difficult for a machine?

References

[1] N. Otsu, "On a family of spline functions for smooth curve fitting," IECE Technical Report, PRL79-47, 1979 (in Japanese).

[2] G. Wahba and J. Wendelberger, "Some new mathematical methods for variational objective analysis using splines and cross validation," Monthly Weather Review, vol.108, pp.1122–1143, 1980.

[3] Y. Isomichi, "Inverse-quantization method for digital signals and images—Area-approximation type," IECE Trans., vol.J64-A, no.4, pp.285–292, April 1981.

[4] Y. Isomichi, "Inverse-quantization method for digital signals and images—Point-approximation type," IECE Trans., vol.J63-A, no.11, pp.815–821, Nov. 1980 (in Japanese).

[5] D.L. Waltz, "Understanding line drawings of scenes with shadows," in The Psychology of Computer Vision, P.H. Winston, ed., McGraw-Hill, New York, pp.19–91, 1975.

[6] A. Rosenfeld, R.A. Hummel, and S.W. Zucker, "Scene labeling by relaxation operations," IEEE Trans. SMC, vol.6, no.6, pp.420–433, 1976.

[7] T. Poggio, V. Torre, and C. Koch, "Computational vision and regularization theory," Nature, vol.317, no.6035, pp.314–319, 1985.

[8] S.E. Fahlman, G.E. Hinton, and T.J. Sejnowski, "Massively parallel architectures for AI: NETL, Thistle, and Boltzmann machines," Proc. AAAI-83, pp.109–113, 1983.

[9] G.E. Hinton and T.J. Sejnowski, "Optimal perceptual inference," Proc. IEEE CVPR '83, pp.448–453, 1983.

[10] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images," IEEE Trans. PAMI, vol.6, no.6, pp.721–741, 1984.

[11] I.C. Briggs, "Machine contouring using minimum curvature," Geophysics, vol.39, no.1, pp.39–48, 1974.

[12] W.E.L. Grimson, "An implementation of a computational theory of visual surface interpolation," CVGIP, vol.22, pp.39–69, 1983.

[13] A.P. Witkin, D. Terzopoulos, and M. Kass, "Signal matching through scale space," Int. Journal Computer Vision, vol.1, no.2, pp.133–144, 1987.

[14] S. Kirkpatrick, C.D. Gelatt, Jr., and M.P. Vecchi, "Optimization by simulated annealing," Science, vol.220, no.4598, pp.671–680, 1983.

[15] V. Granville, M. Krivanek, and J.P. Rasson, "Simulated annealing: A proof of convergence," IEEE Trans. PAMI, vol.16, pp.652–656, 1994.

[16] M.S. Landy, "A brief survey of knowledge aggregation methods," Proc. 8th ICPR, pp.248–252, 1986.

[17] D. Lee and G.W. Wasilkowski, "Discontinuity detection and thresholding—A stochastic approach," Proc. IEEE CVPR '91, pp.208–214, 1991.

[18] D. Keren and M. Werman, "Probabilistic analysis of regularization," IEEE Trans. PAMI, vol.15, no.10, pp.982–995, 1993.

[19] D. Keren and M. Werman, "A Bayesian framework for regularization," ICPR '94, pp.72–76, 1994.

[20] S.T. Barnard, "Stochastic stereo matching over scale," Int. Journal Computer Vision, vol.3, no.1, pp.17–32, 1989.

[21] B.K.P. Horn and B.G. Schunck, "Determining optical flow," AI, vol.17, pp.185–203, 1981.

[22] K. Ikeuchi and B.K.P. Horn, "Numerical shape from shading and occluding boundaries," AI, vol.17, pp.141–184, 1981.

[23] D. Marr and T. Poggio, "A computational theory of human stereo vision," Proc. Royal Soc. London Ser. B 204, pp.301–328, 1979.

[24] A. Blake, "Relaxation labeling—The principle of least disturbance," Pattern Recognition Letters, vol.1, nos.5–6, pp.385–391, 1983.

[25] D. Terzopoulos, "Multilevel computational processes for visual surface reconstruction," CVGIP, vol.24, pp.52–96, 1983.

[26] T. Pajdla and V. Hlavac, "Surface discontinuities in range images," 4th ICCV, pp.524–528, 1993.

[27] T. Poggio and C. Koch, "Ill-posed problems in early vision: From computational theory to analogue networks," Proc. Royal Soc. London Ser. B 226, pp.303–323, 1985.

[28] A.N. Tikhonov and V.Y. Arsenin, “Solutions of Ill-PosedProblems,” Winston, Washington DC, 1977.

[29] A. Amano, Y. Sakaguchi, M. Minoh, and K. Ikeda, “Snakesusing a sample contour model,” Proc. 1st ACCV, pp.538–541, 1993.

[30] R. Hummel and R. Moniot, “Solving ill-conditioned problems by minimizing equation error,” 1st ICCV, pp.527–533, 1987.

[31] D. Keren and M. Werman, “Variations on regularization,” ICPR ’90, pp.93–98, 1990.

[32] A.M. Thompson, J.C. Brown, J.W. Kay, and D.M. Titterington, “A study of methods of choosing the smoothing parameter in image restoration by regularization,” IEEE Trans. PAMI, vol.13, no.4, pp.326–339, 1991.

[33] N. Mukawa, “Vector field reconstruction using variance-based resolution control and its application to optic flow estimation,” IEICE Trans., vol.J78-D-II, no.7, pp.1028–1038, July 1995 (in Japanese) (published in English in Systems and Computers in Japan, vol.27, no.13, 1996).

[34] D. Terzopoulos, “Regularization of inverse visual problems involving discontinuities,” IEEE Trans. PAMI, vol.8, no.4, pp.413–424, 1986.

[35] D. Lee and T. Pavlidis, “One-dimensional regularization with discontinuities,” IEEE Trans. PAMI, vol.10, no.6, pp.822–829, 1988.

[36] A. Blake and A. Zisserman, “Some properties of weak continuity constraints and the GNC algorithm,” Proc. IEEE CVPR ’86, pp.656–667, 1986.

[37] S.Z. Li, “Reconstruction without discontinuities,” 3rd ICCV, pp.709–712, 1990.

[38] R.L. Stevenson, B.E. Schmitz, and E.J. Delp, “Discontinuity preserving regularization of inverse visual problems,” IEEE Trans. Syst., Man & Cybern., vol.24, no.3, pp.455–469, 1994.

[39] C. Schnorr, “Unique reconstruction of piecewise smooth images by minimizing strictly convex nonquadratic functionals,” Journal Mathematical Imaging and Vision, vol.4, pp.189–198, 1994.

[40] M. Shizawa, “Extension of standard regularization theory into multi-valued functions—Reconstruction of smooth multiple surfaces,” IEICE Trans., vol.J76-D-II, no.6, pp.1146–1156, 1994 (in Japanese).

[41] D. Terzopoulos, “The computation of visible-surface representations,” IEEE Trans. PAMI, vol.10, no.4, pp.417–438, 1988.

[42] W.E.L. Grimson and T. Pavlidis, “Discontinuity detection for visual surface reconstruction,” CVGIP, vol.30, pp.316–330, 1985.

[43] A. Blake, “The least-disturbance principle and weak constraints,” Pattern Recognition Letters, vol.1, nos.5–6, pp.393–399, 1983.

[44] A. Blake, “Comparison of the efficiency of deterministic and stochastic algorithms for visual reconstruction,” IEEE Trans. PAMI, vol.11, no.1, pp.2–12, 1989.

[45] S. Muraki, N. Yokoya, and K. Yamamoto, “3D surface reconstruction from contour line image by a regularization method,” Proc. SPIE, vol.1395, Close-Range Photogrammetry Meets Machine Vision, pp.226–233, Sept. 1990.

[46] N. Yokoya, “Surface reconstruction directly from binocular stereo images by multiscale-multistage regularization,” ICPR ’92, I, pp.642–646, 1992.

[47] R. March, “A regularization model for stereo vision with controlled continuity,” Pattern Recognition Letters, vol.10, pp.259–263, 1989.

[48] R. March, “Computation of stereo disparity using regularization,” Pattern Recognition Letters, vol.8, pp.181–187, 1988.

[49] T.E. Boult and L. Chen, “Analysis of two new stereo algorithms,” Proc. IEEE CVPR ’88, pp.177–182, 1988.

[50] T.E. Boult, “What is regular in regularization?,” 3rd ICCV, pp.644–651, 1990.

[51] M.A. Snyder, “The mathematical foundations of smoothness constraints: A new class of coupled constraints,” IUW, pp.154–161, 1990.

[52] M.A. Snyder, “On the mathematical foundations of smoothness constraints for the determination of optical flow and for surface reconstruction,” IEEE Trans. PAMI, vol.13, no.11, pp.1105–1114, 1991.

[53] M.J. Black and P. Anandan, “A framework for the robust estimation of optical flow,” 4th ICCV, pp.231–236, 1993.

[54] S. Raghavan, N. Gupta, and L. Kanal, “Computing discontinuity-preserved image flow,” ICPR ’92, A, pp.764–767, 1992.

[55] C. Schnorr, “Computation of discontinuous optical flow by domain decomposition and shape optimization,” Int. Journal Computer Vision, vol.8, no.2, pp.153–165, 1992.

[56] C. Schnorr, “Determining optical flow for irregular domains by minimizing quadratic functionals of a certain class,” Int. Journal Computer Vision, vol.6, no.1, pp.25–38, 1991.

[57] D. Geiger and A. Yuille, “A common framework for image segmentation,” Int. Journal Computer Vision, vol.6, no.3, pp.227–243, 1991.

[58] T.E. Boult and M. Lerner, “Energy-based segmentation of very sparse range surfaces,” IUW, pp.565–572, 1990.

[59] G. Chen and H. Yang, “Edge detection by regularized cubic B-spline fitting,” IEEE Trans. Syst., Man & Cybern., vol.25, no.4, pp.636–643, 1995.

[60] M. Gokmen and C.C. Li, “Edge detection and surface reconstruction using refined regularization,” IEEE Trans. PAMI, vol.15, pp.492–499, 1993.

[61] N. Yokoya, M. Kaneta, and K. Yamamoto, “Recovery of superquadric primitives from a range image using simulated annealing,” ICPR ’92, I, pp.168–172, 1992.

[62] S. Muraki, “Volumetric shape description of range data using blobby model,” ACM Computer Graphics, vol.25, no.4, pp.227–235, 1991.

[63] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” Int. Journal Computer Vision, vol.1, no.3, pp.321–331, 1988.

[64] A.L. Yuille, P.W. Hallinan, and D.S. Cohen, “Feature extraction from faces using deformable templates,” Int. Journal Computer Vision, vol.8, no.2, pp.99–111, 1992.

[65] S.R. Gunn and M.S. Nixon, “Snake head boundary extraction using global and local energy minimisation,” ICPR ’96, pp.581–585, 1996.

[66] R. Chung and C. Ho, “Using 2D active contour models for 3D reconstruction from serial sections,” ICPR ’96, pp.849–853, 1996.

[67] K. Sakaue and K. Yamamoto, “Active net model and its application to region extraction,” The Journal of the Institute of Television Engineers of Japan, vol.45, no.10, pp.1155–1163, 1991 (in Japanese).

[68] S. Araki, N. Yokoya, and H. Takemura, “Tracking of multiple moving objects using split-and-merge contour models based on crossing detection,” Proc. Vision Interface ’97, pp.65–72, Kelowna, British Columbia, Canada, 1997.

[69] C. Davatzikos and J.L. Prince, “Convexity analysis of active contour problems,” Proc. IEEE CVPR ’96, pp.674–679, 1996.

[70] N. Ueda and K. Mase, “Tracking moving contours using energy-minimizing elastic contour models,” ECCV ’92, pp.453–457, 1992.

[71] M. Hashimoto, H. Kinoshita, and Y. Sakai, “An object extraction method using sampled active contour model,” IEICE Trans., vol.J77-D-II, no.11, pp.2171–2178, 1994 (in Japanese).

[72] B. Olstad and A.H. Torp, “Encoding of a priori information in active contour models,” IEEE Trans. PAMI, vol.18, no.9, pp.863–872, 1996.

[73] T. McInerney and D. Terzopoulos, “Topologically adaptable snakes,” 5th ICCV, pp.840–845, 1995.

[74] A.A. Amini, S. Tehrani, and T.E. Weymouth, “Using dynamic programming for minimizing the energy of active contours in the presence of hard constraints,” 2nd ICCV, pp.95–99, 1988.

[75] A.A. Amini, T. Weymouth, and R. Jain, “Using dynamic programming for solving variational problems in vision,” IEEE Trans. PAMI, vol.12, no.9, pp.855–867, 1990.

[76] D.J. Williams and M. Shah, “A fast algorithm for active contours and curvature estimation,” CVGIP, vol.55, no.1, pp.14–26, 1992.

[77] V. Caselles, R. Kimmel, and G. Sapiro, “Geodesic active contours,” 5th ICCV, pp.694–699, 1995.

[78] R. Malladi, J.A. Sethian, and B.C. Vemuri, “Shape modeling with front propagation: A level set approach,” IEEE Trans. PAMI, vol.17, pp.158–175, 1995.

[79] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, and A. Yezzi, “Gradient flows and geometric active contour models,” 5th ICCV, pp.810–815, 1995.

[80] L.D. Cohen and R. Kimmel, “Global minimum for active contour models: A minimal path approach,” Proc. IEEE CVPR ’96, pp.666–673, 1996.

[81] G. Sapiro, “Vector-valued active contours,” Proc. IEEE CVPR ’96, pp.680–685, 1996.

[82] D.E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning,” Addison-Wesley, 1989.

[83] A.M. Gillies, “An image processing computer which learns by example,” SPIE, vol.155, pp.120–126, 1978.

[84] K. Yamamoto, K. Sakaue, H. Matsubara, and K. Yamagishi, “Miracle-IV: Multiple image recognition system aiming concept learning—Intelligent vision,” ICPR ’88, pp.818–821, 1988.

[85] H. Matsubara, K. Sakaue, and K. Yamamoto, “Learning a visual model and an image processing strategy from a series of silhouette images on MIRACLE-IV,” Symbolic Visual Learning, K. Ikeuchi and M. Veloso, eds., Oxford, pp.171–191, 1997.

[86] I. Yoda, “Automatic generation of directional erosion and dilation sequence by genetic algorithms,” Proc. ACCV ’95, II, pp.103–107, 1995.

[87] H. Iba, H. de Garis, and T. Sato, “A bug-based search strategy for problem solving,” ETL-TR92-24, Electrotechnical Laboratory, 1992.

[88] T. Nagao, T. Agui, and H. Nagahashi, “Pattern matching of binary shapes using a genetic method,” IEICE Trans., vol.J76-D-II, no.3, pp.557–565, March 1993 (in Japanese).

[89] T. Nagao, T. Takeshi, and H. Nagahashi, “Extraction of straight lines using a genetic algorithm,” IEICE Trans., vol.J75-D-II, no.4, pp.832–834, April 1992 (in Japanese).

[90] G. Roth and M.D. Levine, “A genetic algorithm for primitive extraction,” Proc. of the 4th International Conference on Genetic Algorithms, pp.487–494, 1991.

[91] T. Yamagishi and T. Tomikawa, “Polygonal approximation of closed curve by GA,” IEICE Trans., vol.J76-D-II, no.4, pp.917–919, April 1993 (in Japanese).

[92] K. Sakaue, “Stereo matching by the combination of genetic algorithm and active net,” IEICE Trans., vol.J77-D-II, no.11, pp.2239–2246, Nov. 1994 (in Japanese) (published in English in Systems and Computers in Japan, vol.27, no.1, pp.40–48, Scripta Technica, Inc., 1996).

[93] H. Saito and N. Tsunashima, “Estimation of 3-D parametric models from shading image using genetic algorithms,” ICPR ’94, A, pp.668–670, 1994.

Katsuhiko Sakaue received the B.E., M.E., and Ph.D. degrees, all in electronic engineering, from the University of Tokyo, Tokyo, Japan, in 1976, 1978, and 1981, respectively. In 1981, he joined the Electrotechnical Laboratory, Ministry of International Trade and Industry, where he has been engaged in research on image processing and computer vision. He received the Encouragement Prize in 1979 from IEICE and the Paper Award in 1985 from the Information Processing Society of Japan. He is a member of IEEE, IPSJ, and ITE.

Akira Amano received the Ph.D. degree in computer science from Kyoto University, Kyoto, Japan, in 1995. From 1993 to 1995 he was a Research Associate at Kyoto University. Since 1995, he has been an Associate Professor at the Department of Intelligent Systems, Faculty of Information Science, Hiroshima City University. His research interests are computer vision and communication systems. He is a member of the Information Processing Society of Japan.

Naokazu Yokoya was born in Kochi, Japan, on August 12, 1951. He received the B.E., M.E., and Ph.D. degrees, all in information and computer sciences, from Osaka University, Osaka, Japan, in 1974, 1976, and 1979, respectively. In 1979, he joined the Electrotechnical Laboratory, Japan. From 1986 to 1987 he was a visiting professor at the McGill Research Centre for Intelligent Machines, McGill University, Canada. Since 1993 he has been a professor at the Nara Institute of Science and Technology (NAIST), and since 1998 he has also been the director of the Information Technology Center, NAIST. Dr. Yokoya’s research interests include image processing, computer vision, and virtual/augmented reality. He received the Best Paper Award in 1990 from IPSJ.