Th e Univ ers it y of W e st er n A ustralia 95/4 · Im age F e a t ure s rom Ph as e Con gruency P...

The University of Western Australia

Department of Computer Science

Technical Report 95/4March 1995Revised June 1995Image Features FromPhase CongruencyPeter KovesiRobotics and Vision Research GroupAbstractImage features such as step edges, lines and Mach bands all give rise to points where theFourier components of the image are maximally in phase. The use of phase congruency formarking features has signi�cant advantages over gradient based methods. It is a dimension-less quantity that is invariant to changes in image brightness or contrast, hence it providesan absolute measure of the signi�cance of feature points. This allows the use of universalthreshold values that can be applied over wide classes of images. This paper presents a newway of calculating phase congruency through the use of wavelets. The existing theory thathas been developed for 1D signals is extended to allow the calculation of phase congruencyin 2D images. It is shown that for good localization it is important to consider the spreadof frequencies present at a point of phase congruency. An e�ective method for identifying,and compensating for, the level of noise in an image is presented. Finally, it is argued thathigh-pass �ltering should be used to obtain image information at di�erent scales. With thisapproach the choice of scale only a�ects the relative signi�cance of features without degradingtheir localization.Keywordscomputer vision, feature detection, phase congruencyCR categoriesI.4

Image Features From Phase CongruencyPeter KovesiDepartment of Computer ScienceThe University of Western AustraliaNedlands, W.A. 6009email: [email protected] features such as step edges, lines and Mach bands all give rise to pointswhere the Fourier components of the image are maximally in phase. The use of phasecongruency for marking features has signi�cant advantages over gradient based methods.It is a dimensionless quantity that is invariant to changes in image brightness or contrast,hence it provides an absolute measure of the signi�cance of feature points. This allowsthe use of universal threshold values that can be applied over wide classes of images.This paper presents a new way of calculating phase congruency through the use ofwavelets. The existing theory that has been developed for 1D signals is extended toallow the calculation of phase congruency in 2D images. It is shown that for goodlocalization it is important to consider the spread of frequencies present at a point ofphase congruency. An e�ective method for identifying, and compensating for, the levelof noise in an image is presented. Finally, it is argued that high-pass �ltering should beused to obtain image information at di�erent scales. With this approach the choice ofscale only a�ects the relative signi�cance of features without degrading their localization.1 IntroductionIn searching for parameters to describe the signi�cance of image features, such as edges, weshould be looking for dimensionless quantities, in particular, measures that are invariant withrespect to image illumination and magni�cation. Such quantities would provide an absolutemeasure of the signi�cance of feature points that could be applied universally to any imageirrespective of image illumination and magni�cation.Gradient based edge detection methods such as those developed by Sobel (Pringle 1969),Marr and Hildreth (1980), Canny (1983, 1986) and others are sensitive to image illuminationvariations. Intensity gradient has units of lumens/radian (pixel coordinates represent viewingdirection and hence have angular units). Intensity gradients in images depend on manyfactors, including illumination, blurring and magni�cation. For example, doubling the sizeof an image while leaving its intensity values unchanged will halve all the gradients in theimage. Any gradient based edge detection process will need to use a threshold modi�ed1

appropriately. However, in general, one does not know in advance the overall brightness ormagni�cation of an image. The image gradient values that correspond to signi�cant edgesare usually determined empirically. A limited number of e�orts have been made to determinethreshold values automatically. In his thesis, Canny (1983) sets his thresholds on the basis oflocal estimates of image noise obtained via Weiner �ltering. However, the details of settingthresholds on this basis, and the e�ectiveness of this approach are not reported. Cannyalso introduced the idea of thresholding hysteresis which has proved to be a useful heuristic.Kundu and Pal (1986) devised a method of thresholding based on human psychophysicaldata where contrast sensitivity varies with overall illumination levels. However, it is hardto provide any concrete guide to the �tting of a model of contrast sensitivity relative to adigitized grey scale of 0{255. More recently Fleck (1992a, 1992b) suggests setting thresholdsat some multiple (typically 3 to 5) of the expected standard deviation of the operator outputwhen applied to camera noise. This approach of course, requires detailed a priori knowledgeof the noise characteristics of any camera used to take an image.A new model of feature perception called the Local Energy Model has been developed byMorrone et al. (1986) and Morrone and Owens (1987). This model is not based on the use oflocal intensity gradients for feature detection. Other work on this model of feature perceptioncan be found in Morrone and Burr (1988), Owens et al. (1989), Venkatesh and Owens (1990),Kovesi (1991, 1993), Robbins and Owens (1994) and Owens (1994). This model postulatesthat features are perceived at points in an image where the Fourier components are maximallyin phase. A wide range of feature types give rise to points of high phase congruency. Theseinclude step edges, line and roof edges, and Mach bands. Morrone and Burr (1988) showthat this model successfully explains a number of psychophysical e�ects in human featureperception. Additionally, of particular interest, it is possible to construct a dimensionlessmeasure of phase congruency at any point in an image. Values of phase congruency vary froma maximum of 1, indicating a very signi�cant feature, down to 0 indicating no signi�cance.This allows one to specify a threshold to pick out features before an image is seen.This paper describes a new way of calculating phase congruency using Gabor wavelets,and is organized as follows: The theory behind the calculation of phase congruency in onedimensional signals is introduced. It is then shown how phase congruency can be calculatedfrom Gabor wavelets - geometrically scaled spatial �lters in quadrature. The paper thenmoves on to consider the e�ect of noise in the calculation of phase congruency and developsan e�ective method for identifying, and compensating for, these e�ects. This is followed bya section covering the issues involved in extending this theory to 2D images. It is shownthat for good localization it is important to consider the spread of frequencies present at apoint of phase congruency. The issue of analysis at di�erent scales is then considered andit is argued that high-pass �ltering should be used to obtain image information at di�erentscales instead of the more usually applied low-pass �ltering. Finally, some results and theconclusion are presented. An appendix containing implementational details is also included.2 Local Energy and Phase CongruencyThe local energy model of feature detection (Morrone et al. 1986, 1987) postulates thatfeatures are perceived at points of maximum phase congruency in an image. For example,2

when one looks at the Fourier series that makes up a square wave all the Fourier componentsare sine waves that are exactly in phase at the point of the step at an angle of 0 or 180 degreesdepending on whether the step is upward or downward. At all other points in the squarewave phase congruency is low. Similarly one �nds that phase congruency is a maximum atthe peaks of a triangular wave (at an angle of 90 or 270 degrees). Indeed an interestingpoint about using phase congruency to mark features of interest is that one is not makingany assumption about the shape of the waveform at all. One is simply looking for points inthe image where there is a high degree of order in the Fourier domain.We shall �rst consider one dimensional signals. The phase congruency function is developedfrom the Fourier series expansion of a signalF (x) =Xn An cos(n!x+ �n) ; (1)where ! is a constant (usually 2�) sine terms in the series are represented via the phaseo�set �n. The phase congruency function is de�ned asPC(x) = max�2[0;2�]PnAn cos(n!x+ �n � �)PnAn : (2)The value of � that maximizes equation 2 is the weighted mean phase angle of all the Four-ier terms at the point being considered. Taking the cosine of the di�erence between theactual phase angle of a frequency component and this weighted mean, �, generates a quant-ity approximately equal to one minus half this di�erence squared (the Taylor expansion ofcos(x) � 1 � x2=2 for small x). Thus �nding where phase congruency is a maximum isapproximately equivalent to �nding where the weighted standard deviation of phase anglesis a minimum (see Figure 1).An

ωx + φn n

weighted mean of

Fourier components

θ

Figure 1: Polar diagram of the components of a Fourier series at a point in a signal.As it stands phase congruency is a rather awkward quantity to calculate. As an alternativeto this Venkatesh and Owens (1989) show that points of maximum phase congruency can be3

calculated equivalently by searching for peaks in the local energy function. The local energyfunction is de�ned for a one dimensional luminance pro�le, F (x), asE(x) = qF 2(x) +H2(x) ; (3)where H(x) is the Hilbert transform of F (x) (a 90 degree phase shift of F (x)).Venkatesh and Owens show that energy is equal to phase congruency scaled by the sum ofthe Fourier amplitudes, that is E(x) = PC(x)Xn An: (4)Thus the local energy function is directly proportional to the phase congruency function, sopeaks in local energy will correspond to peaks in phase congruency.Rather than compute local energy via the Hilbert transform of the original luminance pro�leone can calculate a measure of local energy by convolving the signal with a pair of �lters inquadrature. The signal is �rst convolved with a �lter designed to remove the DC componentfrom the image. This result is saved and the image is then convolved with a second �lter thatis in quadrature with the �rst (the Hilbert transform of the �rst). This gives us two signals,each being a band passed version of the original, and one being a 90 degree phase shift ofthe other. The results of the two convolutions are then squared and summed.The calculation of energy from spatial �lters in quadrature pairs has been central to manymodels of human visual perception, for example those proposed by Heeger (1987, 1988,1992), Adelson and Bergen (1985) and Watson and Ahumada (1985) to name just a few. Thesigni�cance of Venkatesh and Owens' work is that they provide another explanation for theperceptual importance of energy: Peaks in the energy function correspond to points wherephase congruency is a maximum. Other researchers who have studied the use of local energyfor feature detection are Perona and Malik (1990), and Freeman (1992).While the use of the local energy function to �nd peaks in phase congruency is computa-tionally convenient it does not provide a dimensionless measure of feature signi�cance asit is weighted by the sum of the Fourier component amplitudes, which have units lumens.However, if we normalize the local energy function by dividing by the sum of the Fourieramplitudes we obtain the phase congruency function which is dimensionless. It takes onvalues between 0 and 1, and provides an illumination and spatial magni�cation invariantmeasure of feature signi�cance.An initial approach to calculating phase congruency might be to take a signal, remove its DCcomponent, (it is removed because a 90 degree phase shift of a zero frequency does not haveany meaning) calculate the Hilbert transform (via the Fourier transform), square and sumthe Hilbert transform and the AC component of the signal, and �nally normalize the result bydividing by the sum of the Fourier amplitudes. However, there are some problems with thisapproach. The Fourier transform is not good for determining local frequency information.Windowing has to be used as the mechanism to control the scale over which phase congruencyis determined. The second di�culty is that this de�nition of phase congruency does not take4

into account the spread of frequencies that are congruent at a point. For example, a signalcontaining only one frequency component, say a sine wave, will be in perfect congruence withitself and hence have phase congruency of 1 everywhere. These issues are addressed in thefollowing sections which describe how phase congruency can be calculated using wavelets.3 Using Wavelets for Local Frequency AnalysisRecently the Wavelet Transform has become one of the methods of choice for obtaining localfrequency information. The use of wavelets for local frequency analysis was originally de-veloped by Morlet (1982) for geophysical signal analysis. It has been subsequently developedby Grossmann (1984), Meyer (1987), Daubechies (1990), Mallat (1989) and others. Thisapproach involves using a bank of �lters to analyze the signal. The �lters are all createdfrom rescalings of the one wave shape, each scaling designed to pick out particular frequen-cies of the signal being analyzed. An important feature is that the scales of the �lters varygeometrically, giving rise to a logarithmic frequency scale.We are interested in calculating local frequency and phase information in signals. To preservephase information symmetric/anti-symmetric wavelets must be used. In this paper we followthe approach of Morlet using wavelets based on complex valued Gabor functions - sine andcosine waves, each modulated by a Gaussian. Using two �lters in quadrature enables one tocalculate the amplitude and phase of the signal for a particular frequency at a given spatiallocation. It should be noted that these wavelets are not orthogonal; some conditions mustapply in order to achieve reasonable signal reconstruction after decomposition. We onlyrequire approximate reconstruction up to a scale factor over a band of frequencies or waveletscales. This can be readily achieved and will be discussed later.Odd WaveletEven WaveletOdd WaveletEven WaveletFigure 2: Gabor wavelet: a sine and cosine wave modulated by a Gaussian.Analysis of a signal is done by convolving the signal with each of the wavelets. If we letI denote the signal and M en and Mon denote the even and odd wavelets at a scale n, theamplitude of the transform at a given wavelet scale is given byAn(x) = q(I(x) �M en)2 + (I(x) �Mon)2 (5)and the phase is given by 5

-0.8-0.6-0.4-0.2

00.20.40.60.8

0 100 200 300 400 500 600

even-symmetric wavelets

0

1

2

3

4

5

6

0 50 100 150 200 250 300

spectra

0123456789

0 50 100 150 200 250 300

sum of spectra

-0.8-0.6-0.4-0.2

00.20.40.60.8

0 100 200 300 400 500 600

odd-symmetric wavelets

0

1

2

3

4

5

6

0 1 2 3 4 5 6 7 8

spectra (log w)

0123456789

0 1 2 3 4 5 6 7 8

sum of spectra (log w)Figure 3: Five wavelets and their respective Fourier Transforms indicating which sections ofthe spectrum each wavelet responds to. Collectively the wavelets provide a wide coverageof the spectrum, though with some overlap. Note that on a logarithmic frequency scale thespectra are identical. �n(x) = atan2(I(x) �M en; I(x) �Mon): (6)Note that from now on n will be used to refer to wavelet scale (previously n denoted frequencyin the Fourier series of a signal).The results of convolving a signal with a bank of wavelets can be displayed graphically via ascalogram (Figure 4). Each row of the scalogram is the result of convolving the signal with aquadrature pair of wavelets at a certain scale. Phase is plotted by mapping 0{360 degrees to0{255 grey levels. The vertical axis of the scalogram is a logarithmic frequency scale, withthe lowest frequency at the bottom. Each column of the scalogram can be considered to bea local Fourier spectrum for each point.The phase plot of the scalogram is of particular interest because it enables one to actuallysee the points of high phase congruency. At locations in the signal where there are large stepchanges one can see a vertical line of constant grey value in the phase diagram indicating aconstant phase angle over all frequencies at that point in the signal.6

Signal to be AnalyzedMagnitude of Scalogram

* * * *Phase of ScalogramFigure 4: A one dimensional signal and its amplitude and phase scalograms. The horizontalaxes of the scalograms correspond directly with the signal's horizontal axis. The vertical axesof the scalograms correspond to a logarithmic frequency scale with low frequencies at thebottom. The asterisks mark vertical lines of constant phase that occur at the step transitionsin the signal. These are points of phase congruency.4 Calculating Phase Congruency Via WaveletsTo calculate phase congruency we need to construct the following quantities: Firstly we needto remove the DC component from our signal while at the same time retaining as many of theother frequency components as possible. We will denote this signal F (x). Note, we wish toretain a broad range of frequencies in our signal because phase congruency is only of interestif it occurs over a wide range of frequencies. Secondly we have to construct H(x), the Hilberttransform of F (x), and �nally we need to calculate PAn(x), the sum of the amplitudes ofthe frequency components in F (x). Note that the sum of amplitudes becomes a function of7

x as our frequency analysis of the signal is now spatially localized.If we design our bank of wavelet �lters so that the transfer function of each �lter overlapssu�ciently with its neighbours in such a way that the sum of all the transfer functionsforms a relatively uniform coverage of the spectrum, we are in a position to reconstruct thedecomposed signal over a band of frequencies up to a scale factor. (If the transfer functionsare scaled so that their sum is always one, exact reconstruction will be achieved.) Referring toFigure 3 one can see that the sum of the spectra of the �ve wavelets produces a relatively idealband-pass �lter, especially when viewed on the log frequency scale. Design of the waveletbank ends up being a compromise between wishing to form a smooth sum of spectra whileat the same time minimizing the number of �lters used so as to minimize the computationrequirements.If we sum the results of convolving our signal with the bank of even wavelets we will recon-struct a band passed version of our signal, ampli�ed according to the scaling and overlap ofthe transfer functions of our �lters. We can use this result for F (x). H(x) can be constructedfrom the sum of the convolutions of the signal with the odd wavelets; this is a signal coveringthe same bandwidth of the original signal and ampli�ed in the same way as F (x), but shiftedin phase by 90o. Thus F (x) = Xn I(x) �M en ; and (7)H(x) = Xn I(x) �Mon: (8)The sum of the amplitudes of the frequency components in F (x) is given byXn An(x) =Xn q(I(x) �M en)2 + (I(x) �Mon)2: (9)Thus with these three components we are able to calculate phase congruency. At this pointwe slightly modify our expression for phase congruency by adding a small positive constantto the denominator. This small constant, ", prevents the expression from becoming unstablewhen PnAn(x) (and hence E(x)) becomes very small. ThusPC(x) = E(x)"+PnAn(x) (10)where E(x) = qF (x)2 +H(x)2. The appropriate value of " depends on the precision withwhich we are able to perform convolutions and other operations on our signal; it does notdepend on the signal itself. Large values of " can be used to suppress the in uence of noise,but as we shall see in the next section there is a much better way to compensate for noise.An " value of 0.01 has been used for all the results presented in this paper.Note the ampli�cation of the signal due to the scaling and overlap in the �lter transferfunctions is canceled out in the calculation of phase congruency. The use of wavelets allows us8

to obtain high spatial localization in the calculation of phase congruency. The frequency rangeover which phase congruency is calculated is easily controlled through the number of scalesused in the wavelet bank. Figure 5 illustrates all the intermediate steps in the calculation ofphase congruency on a one-dimensional signal. Note that this �gure incorporates a measureof frequency spread in the calculation of phase congruency. Frequency spread is discussed insection 7.5 NoiseA di�culty with phase congruency is its response to noise. Figure 6 illustrates the phasecongruency of a step function with and without noise. In this example the signal-to-noise ratiois 80 (here SNR is de�ned as step size=�noise). In the vicinity of the step, phase congruencyis only high at the point of the step. However, away from the step the uctuations due tonoise are considered to be signi�cant relative to the surrounding signal (which is noise). Thiswill occur no matter how small the noise level is. This is the price one pays for using anormalized measure such as phase congruency.We generally have a strong intuitive feeling as to what is, or is not, noise in a signal. Sowhat is it that makes noise noise?Trying to recognize noise by its spectral characteristics is of no use to us. The amplitudespectrum of noise is typically at, e�ectively indistinguishable from the spectral character-istics of a line (delta) feature in an image. However, we can identify the level of noise in animage if we make use of the following two observations:1. Noise is everywhere in an image and its level is generally constant.2. Features, such as edges, occur at isolated locations in an image.From this we can deduce that the response of the smallest scale �lter pair in our waveletbank will be almost entirely due to noise. Only at feature points will the response di�er fromthe background noise response, and the regions where it will be responding to features willbe small due to the small spatial extent of the wavelet. Thus the distribution of responseamplitudes from the smallest scale �lter pair across the whole image will be asymmetric. Thepeak of the distribution, representing response to noise, will be strongly skewed to the left anda long tail will extend at the upper end of the distribution as a result of responses to features.A simple mean of such a distribution will not provide us with a good estimate of the averagenoise response because the response to features will strongly bias the result. The approachthat has been taken is to use the exponent of the mean of the log of the response amplitudesas an estimate of the average response to noise. Taking the log of the response amplitudesvery e�ectively discounts the in uence of the tail at the upper end of the distribution. Thusour estimate of the mean noise response of the smallest scale �lter pair over the signal isgiven by A00 = elogA0(x) : (11)If the noise distribution is uniform the typical maximum amplitude response to noise willbe given by twice this estimated average noise response as the minimum possible response9

calculate magnitude

calculate magnitude

calculate magnitude

0

50

100

150

200

0 50 100 150 200 250 300

signal

-150-100

-500

50100150

0 50 100 150 200 250 300

convolution with even filter at scale 1

-150-100-50

050

100150

0 50 100 150 200 250 300

convolution with odd filter at scale 1

0

50

100

150

0 50 100 150 200 250 300

amplitude at scale 1

-150-100

-500

50100150

0 50 100 150 200 250 300


-150-100-50

050

100150

0 50 100 150 200 250 300


-150-100

-500

50100150

0 50 100 150 200 250 300


-150-100-50

050

100150

0 50 100 150 200 250 300


0

50

100

150

0 50 100 150 200 250 300


-600-400-200

0200400600

0 50 100 150 200 250 300

sum of convolutions with even filters

-600-400-200

0200400600

0 50 100 150 200 250 300

sum of convolutions with odd filters

0

200

400

600

800

1000

0 50 100 150 200 250 300

sum of amplitudes

0

50

100

150

0 50 100 150 200 250 300


0

50

100

150

200

0 50 100 150 200 250 300

signal

0

200

400

600

800

1000

0 50 100 150 200 250 300

local energy

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruency

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

frequency spread

sum sum sum

calculate magnitude

normalize energy by the sum of amplitudesand weight by frequency spread

calculate frequency spread

Figure 5: Steps in the calculation of phase congruency. Note: Not all intermediate convolu-tions are displayed. 10

-1

-0.5

0

0.5

1

0 50 100 150 200 250 300

step

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruency

-1

-0.5

0

0.5

1

0 50 100 150 200 250 300

noisy step

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruencyFigure 6: Phase congruency of a step function with and without noise.level will always be 0. If the noise distribution is Gaussian the typical maximum amplituderesponse will be approximately three times the estimated average noise response (the meanamplitude being an estimate of the standard deviation). It remains for us to estimate theresponse of the wavelets at other scales in our �lter bank to the noise in the image, and thenfrom this deduce the in uence of noise on our measure of phase congruency.If we assume that the frequency spectrum of the noise is at we can estimate the noiseresponse of other wavelets relative to the response of the smallest scale wavelet pair from therelative bandwidths of the wavelets. The square of a �lter pair's response amplitude in thespatial domain will be related to the power spectrum of the signal via Parseval's theorem.If we assume the noise power level is constant with frequency then the square of a �lter'sresponse amplitude will be proportional to its bandwidth. Thus, the amplitude of a �lter'sresponse to noise will be proportional to the square root of its bandwidth (and hence thesquare root of its centre frequency).wavelet 1frequency band

wavelet 3frequency band

wavelet 2frequency band

log ωω W1 W2 W3

A2

A

wavelet amplitude responsenoise power spectrum

Figure 7: The response of a 1D wavelet to noise is proportional to the square root of itsbandwidth (and hence the square root of its centre frequency) if the noise has a uniformpower spectrum.If the estimate of the mean noise response of smallest scale wavelet pair over the whole imageis given by A00, then the sum of the estimated noise responses for all the wavelet scales isobtained by summing the relative square roots of the �lter bandwidths, and scaling by A00.11

Thus the total noise in uence is estimated byT = k A00 N�1Xn=0 1pmn ; (12)whereN is the number of wavelet scales used,m is the scaling factor between successive �ltersand k is a scaling factor used to estimate the maximum amplitude of the noise response fromthe mode amplitude (typically k � 2:5).The sum of the estimated noise responses over all the wavelet scales, T , will give us an upperbound on the e�ect of noise on the sum of the wavelet response amplitudes, PnAn(x). Itis an upper bound because in general the noise responses will not all be in phase; some willcancel. This bound on the e�ect of noise on PnAn(x) is in turn an upper bound of its e�ecton the value of local energy, E(x), as PnAn(x) is always greater than, or equal to, E(x) bya triangle inequality. If we subtract this estimated noise e�ect from the local energy beforenormalizing it by the sum of the wavelet response amplitudes we will eliminate spuriousresponses to noise. Thus, we modify the expression for phase congruency to the following:PC(x) = (E(x)� T )+"+PnAn(x); (13)where ()+ denotes that the di�erence between the functions is not permitted to becomenegative.The phase congruency of a legitimate feature will be reduced according to the magnitude ofthe noise's local energy relative to the feature. Thus we end up with a measure of con�dencethat the feature is signi�cant relative to the level of noise. This approach proves to be highlye�ective. Figure 8 shows the results of processing two noisy step pro�les. In both cases a kvalue of 2 was used to estimate the maximum in uence of noise on local energy.-1.5

-1-0.5

00.5

11.5

22.5

0 50 100 150 200 250 300

-1.5-1

-0.50

0.51

1.52

2.5

0 50 100 150 200 250 300

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruency

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruencyFigure 8: Noise compensated phase congruency of two step pro�les; SNR 13.3 and 5.3respectively. 12

6 Extension to two dimensionsSo far our discussion has been limited to signals in one dimension. Calculation of phasecongruency requires the formation of a 90o phase shift of the signal which we have done usingodd-symmetric �lters. As one cannot construct rotationally symmetric odd-symmetric �ltersone is forced to analyze a two dimensional image by applying our one dimensional analysisover several orientations and combining the results in some way. There are four issues to beresolved: The shape of the �lters in two dimensions, the numbers of orientations to analyze,and the way in which the results from each orientation are combined, and the changes fornoise compensation that are required in two dimensions.6.1 2D �lter designThe one dimensional �lters described previously can be extended into two dimensions bysimply applying some spreading function across the �lter perpendicular to its orientation.The obvious spreading function to use is the Gaussian and there are good reasons for choosingit. Consider a 2D wavelet being convolved with a step edge feature that is not aligned withthe oriention of the �lter, as shown in Figure 9. If the �lter is separable the convolution canbe accomplished by a 1D convolution in the vertical direction with the spreading function,followed by a 1D convolution horizontally with the wavelet function. Since we are interestedin the phase information, the important thing to ensure is that convolution with the spreadingfunction does not corrupt the phase data in the image.+ --Figure 9: Convolution of a 2D wavelet with an angled edge.In this example, the result of convolving the image vertically with a Gaussian spreadingfunction will be to blur the edge so that on the subsequent convolution with the waveletfunction we encounter a Gaussian smoothed step. Looking in the frequency domain, anyfunction smoothed with a Gaussian su�ers amplitude modulation of its components, butphase is una�ected (the transfer function of a Gaussian is a Gaussian). On the other handif we, say, use a rectangular spreading function some phase angles in the signal would bereversed because the transfer function (a sinc function) has negative values. The step edgewould be changed to a ramp and we would perceive Mach bands along the ramp edge.13

transfer function

w

transfer function

ww

rectangular spreading function

x

Gaussian spreading function

xx

(a) (b)Figure 10:(a) A rectangular spreading function has a transfer function that reverses some phase com-ponents.(b) A Gaussian spreading function only modulates amplitudes.6.2 Filter orientationsTo detect features at all orientations our bank of �lters must be designed so that they tilethe frequency plane uniformly. In the frequency plane the �lters appear as 2D Gaussianssymmetrically or anti-symmetrically placed around the origin, depending on the spatial sym-metry of the �lters. The length to width ratio of the 2D wavelets controls their directionalselectivity. This ratio can be varied in conjunction with the number of �lter orientationsused in order to achieve an even coverage of the 2D spectrum. A length to width ratio ofone and a �lter orientation spacing of 30o has been found to be satisfactory. The �nal resultis a `rosette' of overlapping 2D Gaussians in the frequency plane (�gure 11). Simoncelli etal. (1992) describe a systematic �lter design technique for achieving uniform coverage of thefrequency plane that could be applied here.Figure 11: Tiling of the frequency plane with oriented �lters at di�erent scales.14

6.3 Noise compensation in two dimensionsWhen compensating for noise in two dimensions one has to consider the spatial width ofthe �lters in addition to their bandwidths. The spatial width of a �lter a�ects its noiseresponse. In the design of a �lter rosette, the spatial width of �lters are varied with theircentre frequency in order to maintain constant directional sensitivity. Viewed in the frequencyplane the angular `width' of the �lters is proportional to their bandwidth. Thus, in 2D, thesquare of the amplitude response in the spatial domain will be proportional to the square ofthe bandwidth multiplied by the power level of the noise. That is, the relative amplitudes ofthe �lter responses to noise will be directly proportional to their bandwidth. Accordingly thenoise compensation term, T now becomesT = k A00 N�1Xn=0 1mn= k A00 1� ( 1m)N1 � 1m ; (14)where A00 = elogA0(x;y) forms our estimate of the mean noise response of the smallest scale 2D�lter pair over the image.6.4 Combining data over several orientationsThe important issue here is to ensure that features at all possible orientations are treatedequally, and all possible conjunctions of features, such as corners and `T' junctions, aretreated uniformly. Indeed, it is important to avoid making assumptions about the 2D formof features that one may encounter.It is important that the normalization of energy to form phase congruency is done aftersumming energies over all orientations. We want the �nal result to be a weighted normalizedvalue, with the result from each orientation contributing to the �nal result in proportion toits energy. If energy was normalized in each orientation prior to being summed, then thecontribution from each orientation would be independent of its energy, clearly an undesirablesituation.The approach that has been adopted is as follows: At each location in the image we calculateenergy, E(x) in each orientation, compensate for the in uence of noise, and then form thesum over all orientations. This sum of energies is then normalized by dividing by the sumover all orientations and scales of the amplitudes of the individual wavelet responses at thatlocation in the image. This produces the following equation for 2D phase congruency:PC(x) = Po(Eo(x)� To)+"+PoPnAno(x) (15)where o denotes the index over orientations.On a straight edge segment, only �lters with orientations roughly perpendicular to the edgewill have signi�cant output; responses from other orientations will be negligible and the15

equation above will reduce to the 1D expression for phase congruency developed earlier. Atmore complex feature types such a corners and `T' junctions, several orientations will haveenergy at signi�cant levels. However, the sum of the amplitudes of the individual waveletresponses will also increase, maintaining the appropriate normalization.Notice in the equation above that noise compensation is performed in each orientation in-dependently. This has been found to give signi�cantly better results. The perceived noisecontent as deduced from the average response of the smallest scale wavelet pair can varysigni�cantly with orientation. This is due to the correlation in noise along scan lines that isoften observed, and other processes that can occur in the digitization of an image.7 The importance of frequency spreadClearly a point of phase congruency is only signi�cant if it occurs over a wide range offrequencies. As an extreme example of a signal with a narrow frequency spread consider apure sine wave: its Hilbert transform will be a cosine, the sum of the squares of these twosignals will be one everywhere and thus phase congruency will be one everywhere. A morecommon situation is where a feature has undergone Gaussian smoothing, either intentionallyor through image degradation. For these smoothed functions phase congruency can be highover extended regions of the signal resulting in poor localization. The Gaussian smoothingreduces the high frequency components in the signal and accordingly reduces the frequencyspread. In the extreme, the frequency spread is reduced so much that locally we approachthe situation that arises with pure sine functions.0

0.250.5

0.751

0 50 100 150 200 250 300

Gaussian smoothed step

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruency (including low frequencies)

0

0.25

0.5

0.75

1

0 50 100 150 200 250 300

phase congruency

(a) (b) (c)Figure 12: Phase congruency of a Gaussian smoothed step pro�le.(a) Smoothed step pro�le.(b) Phase congruency considering wavelengths up to 65 units. The undulations in phasecongruency are due to numerical e�ects.(c) Phase congruency considering wavelengths up to 150 units.From Figure 12 we can see that to obtain good localization it is important to incorporate lowfrequency components in the calculation of phase congruency on smoothed pro�les. Theselow frequency components are the least a�ected by the smoothing of the signal.Thus we are concerned with two conditions: one is where there is a very narrow range offrequencies present causing the values of energy and PnAn to become equal, making phasecongruency one. The other is where only a limited range of frequencies are considered asin Figure 12(b) and one encounters no signi�cant frequency content at all. In this situation16

both energy and PnAn become very small, making the calculation of phase congruencypoorly conditioned. This second situation is handled through the use of the parameter ", inthe denominator of the expression for phase congruency.Thus, as a measure of feature signi�cance phase congruency should be weighted by somemeasure of the spread of frequencies present. What then is a signi�cant distribution offrequencies? If we consider some common feature types such as the square waveform (stepedge), the triangular waveform (roof edge) and the delta function (line feature) as some ofthe `edgiest' waveforms imaginable we can use their frequency distributions as a guide to theideal.The power spectrum of a square wave is of the form 1=!2. Each of the wavelets that weuse to analyze the signal gathers power from geometrically increasing bands of the spectrum.On a spectrum of this shape each wavelet will gather an amount of power that is inverselyproportional to its bandwidth. (The power level is inversely proportional to the centre fre-quency squared, but the bandwidth over which the power is gathered is proportional to centrefrequency). In a manner analogous to our analysis of �lter amplitude responses to noise wecan now deduce the expected distribution of �lter amplitude responses to a square waveform.However, unlike the situation with noise, a �lter's response to a step feature is localized inspace. The spatial width of the response being dictated by the spatial width of the �lter,which is inversely proportional to its bandwidth. So for each �lter, while the power it cap-tures falls with bandwidth, the spatial width over which that power is expressed in terms ofthe square of the amplitude response, also falls with bandwidth. Therefore, to preserve theexpected power gathered by each �lter the magnitudes of the squared amplitude responsehave to remain constant, independent of �lter centre frequency. Hence, the expected distri-bution of amplitude responses to a step feature will be a uniform one. This is illustrated in�gure 13. Field (1987) points out that in many cases images of natural scenes have overallspectral distributions that fall o� inversely proportional to frequency, and for this reason healso advocates the use of geometrically scaled �lter banks. Under these conditions �lters atall scales will, on average, be responding with equal magnitudes. This is likely to maximizethe precision of any computation (numerical or neural) that we make with the �lter outputs.It should be noted that the Gaussian smoothing of a feature that arises whenever a 2D �lterhaving a Gaussian spreading function encounters a feature at some non-orthogonal angle (seeFigure 9) does not signi�cantly change the frequency distribution relative to what would beexpected from an unsmoothed feature. The reason for this is that the degree of Gaussiansmoothing, as seen by each �lter, varies directly in proportion with the scale of each �lter dueto its �xed length/width ratio. Consider the convolution process in the frequency domain:A �lter at some given centre frequency will see a feature spectrum that has been modulatedby a Gaussian of some spread. A second �lter tuned to a frequency of twice the �rst willsee a feature spectrum that has been modulated by a Gaussian of twice the spread (spatially,its spreading function is a Gaussian of half the width of the �rst). The relative modulationof the feature spectrum as seen by each individual �lter over its bandwidth, will thereforeremain roughly constant. Thus, the overall distribution of �lter responses, while modulated,will remain largely unchanged.The other important feature types we must consider are the delta function (corresponding toline features) and roof edges. The power spectrum of a delta function is uniform. Followingan analysis similar to that done for the square waveform one can show that for a delta17

step function

10 0.125 0.25 0.375 0.5

power spectrum of step

0

0.5

1

1.5

2

2.5amplitude response at scale 1

1

0 0.125 0.25 0.375 0.5

transfer function at scale 1

0

0.5

1

1.5

2


1

0 0.125 0.25 0.375 0.5

transfer function at scale 2

0

0.5

1

1.5

2


1

0 0.125 0.25 0.375 0.5

transfer function at scale 3Figure 13: On a step feature, wavelet pairs at all scales will respond with the same amplitude.feature the amplitude of the wavelet �lter responses will be proportional to their bandwidths,and hence their centre frequencies. This will give a distribution of �lter responses stronglyskewed to the high frequency end. On the other hand, for a triangular waveform where all thefeatures are roof edges, the power spectrum falls o� at 1=!4. When considered in terms of theamplitude of �lter responses on a logarithmic frequency scale this will produce a distributionstrongly skewed to the low frequency end.Thus, the di�culty we face here is that there is no one ideal distribution of �lter responses.Given the very di�erent types of distributions expected for delta and roof edges, we cannoteven say the ideal distribution should be a function of the phase angle at which the congruencyoccurs. All we can say is that the distribution of �lter responses should not be too narrow insome general sense. We can also say that a uniform distribution is of particular signi�canceas step discontinuities are highly common in images.Accordingly we can construct a weighting function that devalues phase congruency at loc-ations where the spread of �lter responses is narrow. A measure of �lter response spreadcan be generated by taking the sum of the amplitudes of the responses and dividing by thehighest individual response to obtain some notional `width' of the distribution. If this is thennormalized by the number of scales being used, we obtain a fractional measure of spread that18

varies between 0 and 1. This spread is given bys(x) = 1N PnAn(x)"+Amax(x)! ; (16)where N is the total number of scales being considered, Amax(x) is the amplitude of the �lterpair having maximum response at x, and " is used to avoid division by zero and to discountthe result should both PAn(x) and Amax(x) be very small.A phase congruency weighting function can then be constructed by applying a sigmoid func-tion to the �lter response spread value, namelyW (x) = 11 + eg(c�s(x)) ; (17)where c is the `cut-o�' value of �lter response spread below which phase congruency valuesbecome penalized, and g is a gain factor that controls the sharpness of the cut-o�. Typicalvalues for c and g are 0.4 and 10 respectively. Note the sigmoid function has been merelychosen for its simplicity and ease of manipulation.In 2D, the weighting of phase congruency by frequency spread at each point in an imagehas to be done in each orientation independently. Thus the weighting factor is applied tothe energy in each orientation prior to summing over all orientations and normalizing by thetotal sum of the �lter response amplitudes. We modify equation 15 to produce the following:PC(x) = Po(Wo(x)(Eo(x)� To)+)"+PoPnAno(x) : (18)Figure 5 provides an example of how the frequency spread weighting function varies across a1D signal. Weighting by frequency spread, as well as reducing spurious responses where thefrequency spread is low, has the additional bene�t of sharpening the localization of features,especially those that have been smoothed. Referring to Figure 5, it is of interest to notethat peaks in E(x), PnAn(x), and to a lesser extent, W (x) are all strong indicators of thepresence of a feature, though it is phase congruency that provides a dimensionless quantitythat is invariant to contrast.8 Scale via high-pass �lteringThe traditional approach to analyzing an image at di�erent scales is to consider various low-pass or band-passed versions of the image. Versions of the image having only low frequenciesleft are considered to contain the `broad scale' features. This approach is inspired from thepresence of receptive �elds in the visual cortex that act as band-pass �lters (Marr 1982).While this approach is intuitive, the justi�cation for assuming the brain uses these band-passed versions of the image directly for multi-scale analysis is perhaps somewhat circular.On being presented with a low-pass version of an image one is asked \What features do you19

see?" Of course you see the `broad scale' features - they are the only things left in the imageto be seen.A major problem with the use of low or band-pass �ltering for multi-scale analysis is that thenumber of features present in an image, and their locations, vary with the scale used. It seemsvery unsatisfactory for the location of a feature to depend on the scale at which it is analyzed.This is a problem for di�erential based feature detection schemes that require smoothing tosuppress the in uence of noise. It also creates di�culties for applications that make use ofcoarse-to-�ne strategies for feature matching. Feature positions drift with scale, localizationat broad scales is poor and the number of features proliferates as scale is reduced. Coarse-to-�ne strategies would be greatly facilitated if only the relative signi�cance of features variedwith scale, with the number of features and their locations remaining stable.The use of phase congruency to measure feature signi�cance allows one to consider an al-ternative interpretation of feature scale. Phase congruency at some point in a signal dependson how the feature is built up from the local frequency components. Depending on the sizeof the analysis window, features some distance from the point of interest may contribute tothe local frequency components considered to be present. Thus features are not consideredin isolation but in context with their surrounding features.Therefore, as far as phase congruency is concerned, the natural scale parameter to vary is thesize of the window in the image over which we perform the local frequency analysis. In thecontext of our use of wavelets to calculate phase congruency the scale of analysis is speci�edby the spatial extent of the largest �lter in the wavelet bank. (Filters in the wavelet bankrange from some minimum spatial scale, say corresponding to the Nyquist frequency, up toa scale matching the window size). With this approach we are using high-pass �ltering tospecify the analysis scale. We cut out low frequency components (those having wavelengthslarger than the window size) while leaving the high frequency components intact.If we use a small analysis window, each feature will be treated with a great degree of inde-pendence from other features in the image. We will only be comparing each feature to a smallnumber of other features that are nearby, and hence each feature is likely to be perceivedmore important locally. At the largest scale (window size equal to image size) each featureis considered in relation to all other features, and we obtain a sense of global signi�cance foreach feature. Something that may have high phase congruency/signi�cance when analyzedover a small window may end up having a low measure of phase congruency if we look at itwithin a larger analysis window and consider the phase of the lower frequencies present. Afeature having high phase congruency when analyzed over a wide range of spatial scales fromsmall to large is clearly of greater signi�cance than one having congruency over a limitednumber of small spatial scales.It should be noted that our original ideal that the signi�cance of image features should beinvariant to image magni�cation is not really attainable. In practice we have to compute phasecongruency using a �nite number of spatial �lters that cover a limited range of the spectrum.Changing the magni�cation of an image may alter the relative responses of individual �ltersand hence change the perceived phase congruency. However, the changes in measured phasecongruency will, in general, be much smaller than the corresponding changes in intensitygradient. 20

In summary, it is proposed that multi-scale analysis be done by considering phase congruencyof di�ering high-passed versions of an image. The high-pass images are constructed from thesum of band-passed images, with the sum ranging from the highest frequency band down tosome cut-o� frequency. With this approach, no matter what scale we consider, all featuresare localized precisely and in a stable manner. There is no `drift' of features that occurs withlow-pass �ltering. All that changes with analysis at di�erent scales is the relative signi�canceof features. This provides an ideal environment for the use of coarse-to-�ne strategies. Figure14 illustrates a one-dimensional signal at three di�erent scales of band-pass and high-pass�ltering, along with phase congruency at the three high-pass scales.

0

0.2

0.4

0.6

0.8

1

0

0.2

0.4

0.6

0.8

1

0

0.2

0.4

0.6

0.8

1

signal signal signal

(a) band-pass (b) high-pass (c) phase congruencyFigure 14:(a) Three di�erent band-passed versions of a 1D signal. Note that the number and locationsof features varies.(b) Three di�erent high-pass versions of the signal.(c) Phase congruency of the signal at the three scales of high-pass �ltering shown in (b). Notethat the number and location of features remains constant and only their relative signi�cancevaries.9 Experimental ResultsA problem in discussing the performance of a feature detector is devising a sensible form ofevaluation. Performance criteria have been used by a number of researchers to design edgeoperators, notably Canny (1983, 1986), Spacek (1985) and Deriche (1987). These criteria21

generally measure the ability of a detector to produce a distinct local maximum at the pointof a step discontinuity in the presence of noise. However these criteria are limited in theirusefulness. They are only concerned with speci�c feature types, usually step discontinuities,and they are not concerned with the absolute value of the resulting maxima in the detector'soutput. They provide no guide as to one's ability to set general thresholds. A feature detectoris of limited use if one does not know in advance what level of response corresponds to asigni�cant feature.One of the primary motivations for using phase congruency to detect image features is that itprovides an absolute measure of the signi�cance of features. This allows one to set thresholdsthat are applicable across wide classes of images. The other motivation for detecting featureson the basis of phase congruency is that we are not required to make any assumptions aboutthe luminance pro�le of the feature; we are simply looking for points where there is order inthe frequency domain. Step discontinuities, lines and roof edges are all detected.Figure 15 illustrates some results on a test image containing line features and step discon-tinuities of various magnitudes and orientations. For comparison, the output of the Cannydetector is also presented (Canny 1983). The implementation of the Canny detector used herefollows the modi�cations suggested by Fleck (1992a). The raw, gradient magnitude imageis displayed so that comparison can be made without having to consider any artifacts thatmay be introduced by non-maximal suppression and thresholding processes. The purpose ofproviding this comparison is to illustrate some of the qualitative di�erences in performancebetween the two detectors. Quantitative comparisons are not appropriate as the design ob-jectives of the two detectors are completely di�erent. One is seeking to localize step edgesand the other is seeking to identify points of phase congruency.The main qualitative di�erence between the two detectors is the wide range of response valuesfrom the Canny detector. For example, with the Canny detector, the low contrast square inthe circle at the top right of the image almost disappears, whereas under phase congruencyit is marked prominently. This wide range of responses from the Canny detector makesthreshold selection di�cult. The other obvious di�erence is that the Canny detector producesresponses on each side of line features, whereas the phase congruency detector produces aresponse centred on the line (this problem was recognized by Canny and he designed aseparate operator to detect line features). One problem with the phase congruency detector(or at least this implementation of it) is its behaviour at junctions of features having greatlydi�erent magnitudes. Notice how the horizontal edges in the grey scale on the left hand side ofthe image fade as they meet the strong vertical edge of the grey scale. At the junction betweenthe low magnitude horizontal edges and the high magnitude vertical edge the normalizingcomponent of phase congruency, PnAn, is dominated by the magnitude of the vertical edge.Thus, at the junction, the signi�cance of the horizontal edges relative to the vertical oneis downgraded. Possibly this problem could be overcome through a di�erent approach tocombining phase congruency information over several orientations. Freeman (1992), in hisuse of a normalized form of energy for feature detection, encountered this same problemat junctions of varying contrast. His solution was to normalize energy in each orientationindependently using only energy responses from �lters in the same orientation. The energyvalues used for the normalization were blurred spatially in a direction perpendicular to theorientation being considered. A variation of this approach might usefully be applied here.Robbins and Owens (1994) also provide a detailed study of the detection of 2D features, such22

as junctions and corners, using local energy.(a) (b)(c) (d)Figure 15: Results on a test image.(a) Test image.(b) Raw output of phase congruency.(c) Raw output of Canny detector (standard deviation of smoothing �lter = 1).(d) Phase congruency after non-maximal suppression and hysteresis thresholding betweenphase congruency values of 0.5 to 0.3.Figure 16 illustrates the output on two real images. Notice how, on the face image, the noseand the cheek features have been detected. These features are e�ectively broad roof edges.The use of low frequency components in the calculation of phase congruency contributegreatly to the detection of such features.In all these examples a wavelet bank over 6 scales was used, the smallest �lters having aperiod of 3 pixels, and the scaling factor between successive wavelets being 1.5. Filters were23

Figure 16: Features detected via phase congruency on two real images.constructed directly in the frequency domain as 2D Gaussians in the frequency plane. Foreach �lter the ratio between the centre frequency and the Gaussian standard deviation wasset at 3 (see the appendix for implementation details). The wavelets had a length to widthratio of one and their orientational spacing was 30o. The �nal output was non-maximallysuppressed and hysteresis thresholding applied between phase congruency values of 0.5 to0.3. A noise compensation k value of 2.5 was used, and " was set at 0.01. For the frequencyspread weighting function the cut-o� fraction c, was set at 0.4 and the gain parameter g, wasset at 10. These parameters have been found to give reasonable results over a wide range ofimages. 24

10 ConclusionPhase congruency is a dimensionless measure of feature signi�cance. It provides an absolutemeasure of the signi�cance of feature points in an image that allows constant threshold valuesto be applied across wide classes of images. Thresholds can be speci�ed in advance; they donot have to be determined empirically for individual images.This paper presents a practical implementation for the calculation of phase congruency intwo-dimensional images using wavelets. It is shown that for a normalized measure of featuresigni�cance, such as phase congruency, it is crucial to be able to recognize the level of noisein an image and to compensate for it. A highly e�ective method of compensation is presentedthat only requires that noise be roughly constant across the image and that its power spectrumbe approximately constant.Also presented is the importance of weighting phase congruency by some measure of thespread of the frequencies that are present at each point in an image. This prevents falsepositives being marked where the frequency spread is very narrow. It also improves thelocalization of features. While it is not possible to specify one ideal distribution of �lterresponse amplitudes with frequency, it is shown that when geometrically scaled �lters areused, a uniform distribution of responses is a particularly signi�cant one. This distributionmatches typical spectral statistics of images and corresponds to the distribution that arisesat step discontinuities.Another contribution of this work is to o�er a new approach to the concept of scale in imageanalysis. The natural scale parameter to vary in the calculation of phase congruency is thesize of the analysis window over which to calculate local frequency information. Thus, underthese conditions, scale is varied using high-pass �ltering rather than low-pass or band-pass�ltering. The signi�cant advantage of this approach is that feature locations remain constantover scale, and only their signi�cance relative to each other varies.While the motivation for detecting phase congruency in images is inspired from psychophys-ical results (Morrone et al 1986, Morrone and Burr 1988) one can only speculate on thebiological plausibility of the method of calculation presented here. Despite the use of geo-metrically scaled �lters in quadrature the implementation presented here is not biologicallyplausible in its present form as it requires intermediate convolution results to represent ar-bitrarily large positive and negative values. Physical cells eventually saturate in their output,and are generally considered only to be able to represent some kind of recti�ed response.Despite this, it is of interest to compare the general form of the expression for phase con-gruency (equation 18) with the form of some of the normalized contrast sensitivity modelsthat have been recently developed to model psychophysical data. For example, some of thesimilarities between this work and the models proposed by Heeger (1992a, 1992b) are verystriking. Further comparison and analysis of these models may prove to be very productive.AcknowledgmentThe author would like to acknowledge many useful discussions with John Ross, James Trev-elyan, Robyn Owens, Ben Robbins, Chris Pudney and Adrian Baddeley. Ben Robbins also25

pointed out the e�ciencies that can be made in the Fourier convolution of an image with aquadrature pair of �lters. The Robotics and Vision Research Group in the Department ofComputer Science at The University of Western Australia acknowledge the support receivedfrom Digital through their External Research Programme.References[1] E. H. Adelson and J. R. Bergen. Spatiotemporal energy models for the perception ofmotion. Journal of the Optical Society of America A, 2(2):284{299, 1985.[2] J. F. Canny. Finding edges and lines in images. Master's thesis, MIT. AI Lab. TR-720,1983.[3] J. F. Canny. A computational approach to edge detection. IEEE Trans. Pattern Analysisand Machine Intelligence, 8:679{698, 1986.[4] I. Daubechies. The wavelet transform, time-frequency localization and signal analysis.IEEE Transactions on Information Theory, 36(5):961{1005, September 1990.[5] Rachid Deriche. Using Canny's criteria to derive an optimal edge detector recursivelyimplemented. The International Journal of Computer Vision, 1:167{187, April 1987.[6] D. J. Field. Relations between the statistics of natural images and the response prop-erties of cortical cells. Journal of The Optical Society of America A, 4(12):2379{2394,December 1987.[7] Margaret M. Fleck. Multiple widths yield reliable �nite di�erences. IEEE T-PAMI,14(4):412{429, April 1992.[8] Margaret M. Fleck. Some defects in �nite-di�erence edge �nders. IEEE T-PAMI,14(3):337{345, March 1992.[9] W. T. Freeman. Steerable Filters and Local Analysis of Image Structure. PhD thesis,MIT Media Lab. TR-190, June 1992.[10] A. Grossmann and J. Morlet. Decomposition of hardy functions into square integrablewavelets of constant shape. SIAM Journal of Mathematical Analysis, 15(4):723{736,July 1984.[11] D. J. Heeger. Optical ow from spatiotemporal �lters. In Proceedings, 1st InternationalConference on Computer Vision, pages 181{190, June 1987. London.[12] D. J. Heeger. Optical ow using spatiotemporal �lters. International Journal of Com-puter Vision, 1:279{302, 1988.[13] D. J. Heeger. Half-squaring in responses of cat striate cortex. Visual Neuroscience,9:427{443, 1992.[14] D. J. Heeger. Normalization of cell responses in cat striate cortex. Visual Neuroscience,9:181{197, 1992. 26

[15] P. D. Kovesi. A dimensionless measure of edge signi�cance. In The Australian PatternRecognition Society, Conference on Digital Image Computing: Techniques and Applic-ations, pages 281{288, 4-6 December 1991. Melbourne.[16] P. D. Kovesi. A dimensionless measure of edge signi�cance from phase congruency cal-culated via wavelets. In First New Zealand Conference on Image and Vision Computing,pages 87{94, August 1993. Auckland.[17] M. K. Kundu and S. K. Pal. Thresholding for edge detection using human psychovisualphenomena. Pattern Recognition Letters, 4:433{411, 1986.[18] S. Mallat. A theory for multiresolution signal decomposition: The wavelet representa-tion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):674{693,July 1989.[19] D. Marr. Vision. Freeman, 1982.[20] D. Marr and E. C. Hildreth. Theory of edge detection. Proceedings of the Royal Society,London B, 207:187{217, 1980.[21] Y. Meyer. Orthonormal wavelets. In Wavelets, Time-Frequency Methods and PhaseSpace, pages 21{37. Springer Verlag 1989, December 1987. Marseille.[22] J. Morlet, G. Arens, E. Fourgeau, and D. Giard. Wave propagation and sampling theory- Part II: Sampling theory and complex waves. Geophysics, 47(2):222{236, February1982.[23] M. C. Morrone and D. C. Burr. Feature detection in human vision: A phase-dependentenergy model. Proc. R. Soc. Lond. B, 235:221{245, 1988.[24] M. C. Morrone and R. A. Owens. Feature detection from local energy. Pattern Recog-nition Letters, 6:303{313, 1987.[25] M. C. Morrone, J. R. Ross, D. C. Burr, and R. A. Owens. Mach bands are phasedependent. Nature, 324(6094):250{253, November 1986.[26] R. A. Owens. Feature-free images. Pattern Recognition Letters, 15:35{44, 1994.[27] R. A. Owens, S. Venkatesh, and J. Ross. Edge detection is a projection. PatternRecognition Letters, 9:223{244, 1989.[28] P. Perona and J. Malik. Detecting and localizing edges composed of steps, peaks androofs. In Proceedings of 3rd International Conference on Computer Vision, pages 52{57,1990. Osaka.[29] K. K. Pringle. Visual perception by a computer. In A. Grasselli, editor, AutomaticInterpretation and Classi�cation of Images, pages 277{284. Academic Press, New York,1969.[30] Ben Robbins and Robyn Owens. The 2D local energy model. Technical Report 94/5,Department of Computer Science, The University of Western Australia, August 1994.27

[31] Eero P. Simoncelli, William T. Freeman, Edward H. Adelson, and David J. Heeger.Shiftable multiscale transforms. IEEE Transactions on Information Theory, 38(2):587{607, March 1992.[32] Libor A. Spacek. The Detection of Contours and their Visual Motion. PhD thesis,University of Essex at Colchester, December 1985.[33] S. Venkatesh and R. Owens. On the classi�cation of image features. Pattern RecognitionLetters, 11:339{349, 1990.[34] S. Venkatesh and R. A. Owens. An energy feature detection scheme. In The InternationalConference on Image Processing, pages 553{557, 1989. Singapore.[35] A. B. Watson and A. J. Ahumada. Model of human visual-motion sensing. Journal ofthe Optical Society of America A, 2(2):322{341, 1985.

28

A Implementation detailsThe calculation of phase congruency in an image is computationally expensive. This sectiondescribes some techniques that reduce the computation requirements signi�cantly. A numberof other details that are important in the implementation are also covered.The main computational load in calculating phase congruency is the need to perform con-volutions with quadrature pairs of �lters over many scales and orientations. In particular,convolutions with �lters of very large spatial extent are required. For these reasons convolu-tion via the FFT is perhaps the only practical approach. Filters are constructed directly infrequency space as 2D Gaussian ellipses in the frequency plane. The orientational selectivityof the �lter is speci�ed by the ratio of the standard deviations of the �lter in the radial andtangential directions. In the results presented in this paper a ratio of 1 has been used. Tosatisfy the wavelet requirement that �lters have a constant shape ratio, the ratio between thecentre frequency, f and the radial standard deviation, � of the Gaussian ellipses are keptconstant. Successive values of centre frequency are scaled geometrically. It is very importantthat the �lters have absolutely no DC component. Two measures are used to ensure this:�rstly the ratio of f=� is kept at 3 or larger, and secondly, after constructing the �lter themagnitude of the residual zero frequency component is checked and a 2D Gaussian of thatamplitude, centred over the zero frequency point is subtracted from the �lter. The prac-tical range of �lter frequencies that can be constructed is limited. When constructed in thefrequency domain low frequency �lters su�er from quantization e�ects, and high frequency�lters su�er from truncation at the Nyquist frequency. In practice, the maximum frequenciesthat can be used e�ectively correspond to �lters having wavelengths of no less than about 3or 4 pixels.A major gain in computational e�ciency can be made by combining the otherwise twoseparate convolutions between the image and a quadrature pair �lters, into the one mul-tiplication in the frequency domain. Spatially, a quadrature pair of �lters are real/even-symmetric and real/odd-symmetric. In the frequency domain they are real/even-symmetricand imaginary/odd-symmetric respectively. If the imaginary/odd-symmetric �lter is mul-tiplied by i so that it becomes real and odd-symmetric, its inverse transform will becomeimaginary and odd-symmetric. Exploiting the linear properties of the Fourier transform wecan add the real/even-symmetric and real/odd-symmetric �lters together to produce one �l-ter and then multiply this by the Fourier Transform of the image. After taking the inversetransform the individual results of the two convolutions are easily extracted due to the evenand odd-symmetric properties of the �lters. The convolution with the even-symmetric �lterwill be found in the real component of the result, and the imaginary component will containthe convolution with the odd-symmetric �lter. A further, signi�cant, e�ciency can be gainedby only performing complex multiplications between the Fourier transform of the image andthe �lter within, say, 4 standard deviations from the centre frequency of the �lter. Note alsothat in adding the real/even-symmetric and real/odd-symmetric �lters together half of the�lter cancels out.Non-maximal suppression of the phase congruency image to produce a feature map also de-serves some comment. Before we can apply a non-maximal suppression process to the imagewe have to determine the appropriate orientation, at each point in the image, in which totest for local maxima. For gradient based edge detectors this orientation is supplied by the29

direction of the local intensity gradient. However, in a phase congruency image we have noequivalent quantity that we can use for this purpose. Instead, the �lter orientation havingthe maximum local energy at each point in the image is used. This produces a somewhatquantized orientation map. A more accurate map can be obtained by applying parabolicinterpolation to the energy values across the orientation that has maximum energy and itstwo neighbouring orientations, and then �nding the orientation that maximizes energy on the�tted parabola. In practice this extra precision does not change the results signi�cantly. Thenon-maximal suppression process is done by taking each pixel in the image and calculatingideal pixel locations at a radius of 1 each side of the pixel in the required direction. Thephase congruency values at each of these ideal locations are then estimated from the phasecongruency values at their four surrounding, integer pixel locations, using bilinear interpol-ation. These values are then tested against the centre pixel value to decide whether it is alocal maximum. While this approach to non-maximal suppression produces reasonable res-ults there is room for considerable improvement. It is probably fair to say that the problemof non-maximal suppression deserves more attention than it has received so far.

30

Th e Univ ers it y of W e st er n A ustralia 95/4 · Im age F e a t ure s rom Ph as e Con gruency P...

Documents

Transcript of Th e Univ ers it y of W e st er n A ustralia 95/4 · Im age F e a t ure s rom Ph as e Con gruency P...