
MESH MODELING, RECONSTRUCTION AND SPATIO-TEMPORAL

PROCESSING OF MEDICAL IMAGES

BY

JOVAN G. BRANKOV

Submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy of Electrical Engineering in the Graduate College of the Illinois Institute of Technology

Approved

Adviser

Co-adviser

Chicago, Illinois

December 2002


COPYRIGHT BY

JOVAN G. BRANKOV

2002


ACKNOWLEDGMENT

I want to thank my adviser Dr. Miles Wernick for introducing me to this field, for all

his help, and for believing in me.

My sincere thanks go to my co-advisor Dr. Yang Yongyi and also to Dr. Nikolas

Galatsanos for valuable discussions.

I would like to dedicate special thanks to my parents for their full support throughout

my education and to my sister Elvira for her encouragement and advice.

My thanks go to my teachers Mrs. Olas Gisela, Mrs. Jasmina Kozomora and Dr.

Zoran Dobrosavljevic, key people in my early academic growth, and thanks to all my

friends for encouraging me and sharing with me both the sadness and the joy.

I also wish to acknowledge the National Institutes of Health (HL65245) and the

Whitaker Foundation for their financial support of the project.

J. G. B


TABLE OF CONTENTS Page

ACKNOWLEDGMENT.................................................................................................... iii

LIST OF FIGURES ........................................................................................................... vi

LIST OF TABLES...............................................................................................................x

CHAPTER

1. INTRODUCTION .................................................................................................. 1

1.1. The Need for Noise Reduction in Nuclear Medicine ........ 2

1.2. Tomography by SPECT and PET ........ 4

1.3. PET and SPECT Studies based on Image Sequences ........ 7

1.4. Content-Adaptive Mesh Modeling ........ 9

1.5. Novel Tomography Image Reconstruction ........ 10

1.6. Similar Component Analysis ........ 16

1.7. CAMM Extensions ........ 17

2. CONTENT ADAPTIVE MESH MODELING .................................................... 18

2.1. Scalar 2D Function Mesh Modeling ........ 18

2.2. Scalar Volumetric (3D) Function Mesh Modeling ........ 44

2.3. Vector Valued 2D Function Mesh Modeling ........ 56

3. MESH MODELING FOR TOMOGRAPHIC IMAGE RECONSTRUCTION ........ 66

3.1. CAMM based Reconstruction ........ 66

3.2. Mesh Modeling Framework for Image Reconstruction ........ 68

3.3. Image Reconstruction Algorithms using Mesh Modeling ........ 71

3.4. Slice (2D) CAMM Reconstruction ........ 75

3.5. Volumetric (3D) CAMM Reconstruction ........ 91

4. DUAL MODALITY TOMOGRAPHIC IMAGE RECONSTRUCTION............ 97

4.1. Dual Modality Mesh Modeling ........ 97

4.2. Introduction ........ 97

4.3. Dual-Modality Mesh Generation ........ 99

4.4. Statistical Image Reconstruction ........ 100

4.5. Experimental Results ........ 101

4.6. Discussion ........ 104

5. SPATIAL-TEMPORAL IMAGE SEQUENCE PROCESSING........................ 107

5.1. Introduction ........ 107

5.2. Deformable Mesh Modeling Approach ........ 109


5.3. Deformable Mesh Modeling for Partial Volume Affected Myocardium................................................................... 125

5.4. Similarity Clustering Analysis....................................................... 131

6. EXTENSION TO OTHER IMAGE PROCESSING PROBLEMS.................... 145

6.1. Image Denoising by Mesh Model Filtering ........ 145

6.2. Geometric Watermarking ........ 148

7. CONCLUSION................................................................................................... 162

APPENDIX

A. MESH MODEL ERROR BOUND FOR SCALAR 2D FUNCTION REPRESENTATION ........ 164

B. INTERPOLATION ERROR AND ITS GRADIENT WITH RESPECT TO NODAL POSITION ........ 168

C. MESH MODEL ERROR BOUND FOR SCALAR 3D FUNCTION REPRESENTATION ........ 174

D. MODIFICATION ON ERROR DIFFUSION KERNEL .......................... 178

E. NUMERICAL EVALUATION OVER MESH ELEMENTS................... 182

F. MATCHING ERROR GRADIENT CALCULATION............................. 185

G. MODIFIED MATCHING CRITERIA GRADIENT CALCULATION ........ 189

H. ML PARAMETER ESTIMATION FOR SIMILARITY CLUSTERING ........ 191

REFERENCES ................................................................................................................192


LIST OF FIGURES

Figure Page

1. Single Photon Emission Computed Tomography....................................................5

2. Coincidence Detection. ............................................................................................6

3. Gating by Electrocardiography................................................................................9

4. Classification of Tomographic Image Reconstruction Algorithms. ......................12

5. Proposed Mesh-Generation Procedure. .................................................................22

6. Detail of The Image processed by Error Diffusion................................................26

7. Example of Delaunay Triangulation....................................................................28

8. Transformation of The Interpolation over The Irregular Element.........................30

9. Quadtree Mesh Generation Procedure. ..................................................................35

10. A 128×128 Section of Original Image “Lena”.....................................................39

11. Mesh Structure .......................................................................................................40

12. Cardiac Image ........................................................................................................41

13. Mesh Obtained using Quadtree Method ................................................................42

14. Histogram Plot of The Peak Error .........................................................................43

15. Error Diffusion.......................................................................................................48

16. gMCAT D1.01 Phantom in Vicinity of The Heart ................................................52

17. 3D Mesh Model .....................................................................................................53

18. Results of Interpolation..........................................................................................54

19. Results Obtained by Least Square Fit....................................................................55

20. Vector-Valued Mesh ..........................................................................................63

21. Vector-Valued Mesh by The Quadtree Method ............................................64

22. Comparison of Considered Mesh Modeling Algorithms.......................................65


23. Mesh Modeling of an Image. .................................................................................85

24. Illustration of a Pixel Model (left) and a Mesh Based Model (right) For The Case

of SPECT Imaging........................................................................................85

25. The Sum of 16 Image Frames................................................................................86

26. Plot of the MDL Function vs. the Number of Mesh Nodes...................................86

27. Illustration of The 16 Cho Input Channels ............................................................87

28. Simulated perfusion defects (indicated by arrows) introduced in the gMCAT

phantom (slice #35). .....................................................................................87

29. The area under the ROC curve...............................................................................88

30. Images reconstructed by the MESH-EM algorithms .............................................89

31. PSNR vs. the number of iterations for different reconstruction methods..............90

32. The computation time for various reconstruction methods. ..................................90

33. 3D mesh model ......................................................................................................92

34. Representative slices of frame #1 reconstructed from different methods..............96

35. Mesh structure (8,887 nodes) obtained from segmented MR image. ....................98

36. Sampling strategy ..................................................................................................99

37. Original MRI image and simulated PET phantom ..............................................103

38. Images obtained by three reconstruction procedures...........................................105

39. Bias-variance curves ............................................................................................106

40. Inter frame displacement estimation using deformable mesh model...................112

41. Mesh structure for representing the displacement-vector field describing motion

within a slice of the torso, including the heart............................................120


42. As the mesh deforms from frame to frame, the nodes trace out curved trajectories

through the space-time coordinate system..................................................120

43. Example displacement-vector fields in the vicinity of the heart..........................121

44. Steps of mesh-generation method........................................................................122

45. Frames from image sequences processed by various methods............................123

46. Time activity curves (TACs) for a small region in the left ventricular wall vs. the

frame number..............................................................................................124

47. Mesh structure used in our experiment................................................................130

48. Rician probability density function (PDF) of the phase distribution,

parameterized by the SNR. .........................................................................133

49. Approximation error vs. the SNR and the angle..................................................134

50. Six brain regions used in our simulation. ............................................................137

51. Class labels obtained by different algorithms applied to the simulation data set.144

52. Class labels obtained by different algorithms. The number of assumed class for

all methods was four. ..................................................................................144

53. A 128×128 section of noisy Lena image............................................................147

54. Rectangular grid and grid after attack by random bending..................................150

55. Inter frame displacement estimation using deformable mesh model...................152

56. Mesh model based watermarking system. ...........................................................157

57. Images to demonstrate the watermarking process ...............................................158

58. Images to demonstrate the watermarking process cont. ......................................159

59. The estimated distortion field. .............................................................................160

60. BER vs. random bending strength. ......................................................................161


61. BER vs. watermark strength. ...............................................................................161

62. Triangle $T$ with vertices $p_i$, $i = 0, 1, 2$ ...................................................................165

63. Mapping from arbitrary element $D_m$ to the master element $\tilde{D}$. .........................170

64. Tetrahedron $T$ with vertices $p_i$, $i = 0, 1, 2, 3$. .......................................................175

65. Master element mapping......................................................................................187


LIST OF TABLES

Table Page

1. List of results for various parameters.......................................................................38

2. List of results for various parameters and algorithms..............................................38

3. TAC cross correlation ............................................................................................130

4. Regions/Thalamus Ratio........................................................................................138

5. Percentage of correct classification. ......................................................................143


ABSTRACT

Most of today’s medical imaging modalities are based on tomography, the process of

obtaining slice or volumetric images of the body by way of a computation known as

image reconstruction. Tomographic image reconstruction is an ill-posed inverse problem;

thus, the results can be greatly affected by noise present in the data. Steps must be taken

in the computations to reduce the noise effect and thus improve the quality of the

reconstructed images.

The focus of this thesis is the development of new techniques for image

reconstruction based on content-adaptive mesh modeling (CAMM) of the image.

Specifically, we replace the pixel basis, which is conventionally used to describe images,

with a mesh description that is tailored to the specific image.

The CAMM approach involves partitioning of the image domain into a collection of

non-overlapping patches, called mesh elements, then describing the intensity over each

element through interpolation from the model parameters. The CAMM is content-

adaptive, meaning that the mesh is generated so that dense image samples (i.e., fine mesh

elements) are placed in regions of the image containing high-frequency features while

sparse samples (i.e., coarse mesh elements) are placed in regions containing

predominantly low-frequency features.

Approaches based on CAMM have several potential advantages over pixel-based

methods: 1) the CAMM provides a natural smoothing effect; 2) the CAMM is a more

compact description of the image than a uniform pixel grid and may therefore require less

memory and computation time; 3) the CAMM reduces the number of parameters to be

estimated, making the discrete inversion overdetermined and thus improving image

quality; and 4) the CAMM provides a natural framework for estimating motion for

reconstruction of moving image sequences.

In this thesis, the basics of the CAMM are presented and later extended to

tomographic image reconstruction. It is shown that the use of a CAMM in image

reconstruction can achieve good image quality at low computational cost. Further, the

CAMM is shown to successfully track the heart wall, which exhibits motion, for use in

reconstruction and post-reconstruction processing of the image sequence.

In addition, a new method for estimating distinct time-sequence basis functions in an

image sequence is presented. This method improves on existing cluster component

analysis of image sequences by better modeling their statistical properties. This approach

may be useful for new temporal basis function approaches for reconstruction.

Finally, a new method of watermarking (embedding a packet of additional digital

data into an image) is proposed. The method is based on deformable mesh modeling

and is robust to attack by geometrical deformation.


CHAPTER I

1. INTRODUCTION

In this work, we propose new methods for image processing, with a focus on image

reconstruction in nuclear medicine. The main theme of the work is the development of

new techniques for image representation and reconstruction based on content-adaptive

mesh modeling (CAMM) of the image. Specifically, we replace the pixel basis, which is

conventionally used to describe images, with a new mesh description that is tailored to

the specific image.

The CAMM approach involves partitioning of the image domain into a collection of

non-overlapping patches, called mesh elements, then describing the intensity over each

element through interpolation from the model parameters. The CAMM is content-

adaptive, meaning that the mesh is generated so that dense image samples (i.e., fine mesh

elements) are placed in regions of the image containing high-frequency features while

sparse samples (i.e., coarse mesh elements) are placed in regions containing

predominantly low-frequency features.
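The idea of describing the intensity over one mesh element by interpolation from model parameters can be illustrated with a small sketch. It assumes a triangular element with one parameter (a nodal value) per vertex and linear interpolation through barycentric weights; the function names and the sample triangle are hypothetical and chosen only for illustration.

```python
import numpy as np

def barycentric_weights(p, tri):
    """Return the three barycentric weights of point p in triangle tri (3x2 array)."""
    a, b, c = tri
    # Solve p = a + u*(b - a) + v*(c - a) for (u, v).
    m = np.column_stack((b - a, c - a))
    u, v = np.linalg.solve(m, p - a)
    return np.array([1.0 - u - v, u, v])

def interpolate(p, tri, nodal_values):
    """Linearly interpolate the nodal values at point p of one triangular element."""
    return float(barycentric_weights(p, tri) @ nodal_values)

# Hypothetical element: vertex positions and the nodal values attached to them.
tri = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
vals = np.array([10.0, 20.0, 30.0])

center = tri.mean(axis=0)
value_at_center = interpolate(center, tri, vals)
```

At a vertex the interpolated value equals that vertex's nodal value, and at the centroid it equals the average of the three nodal values, so the whole element is described by three numbers regardless of how many pixels it covers.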

Approaches based on CAMM have several potential advantages over pixel-based

methods: 1) the CAMM is a more compact description of the image and may therefore

require less memory and computation time; 2) because the CAMM reduces the number of

parameters to be estimated, making the discrete inversion overdetermined, it helps reduce

noise in the reconstruction problem; and 3) the CAMM provides a natural framework for

estimating motion for reconstruction of moving image sequences.

In this work, the basics of the CAMM approach are presented and later extended to

tomographic image reconstruction. It is shown that the use of a CAMM in image


reconstruction can achieve good image quality at low computational cost. Further, the

CAMM is shown to successfully track the heart wall, which exhibits motion, for use in

motion-compensated reconstruction and post-reconstruction processing of the image

sequence.

The principal novel contributions of this work are: 1) a new method for generation of

content-adaptive meshes for image processing; 2) the use of CAMM for tomographic

image reconstruction; 3) new methods of processing time-sequences of images by using

spatio-temporal smoothing in the mesh domain; 4) new multivariate-statistical methods

for analysis of temporal behavior in image sequences; and 5) a new watermarking

method that is robust to geometric attack.

1.1. The Need for Noise Reduction in Nuclear Medicine

The main application contemplated in this thesis is nuclear medicine, which refers to

a collection of medical imaging methods that can capture functional, as well as structural,

information about the body. More specifically, our principal goal is to improve the

quality of gated (time-sequence) images of the heart obtained by single-photon emission

computed tomography (SPECT), which is one type of nuclear medicine imaging.

A nuclear medicine imaging study begins by administering to the patient a small

amount of a radioactive material, called a radiotracer, either through injection or

inhalation. The radiotracer is an analog of a biologically active substance of known

physiological properties, which is labeled with a radioactive isotope. Its behavior in the

body is the same as, or similar to, that of its naturally occurring counterpart, but it can be imaged

because, as the radioisotope decays, it produces gamma-ray emissions that can be

measured by detectors placed outside the body. The measured emission data are then


transformed mathematically, in a process known as image reconstruction, to obtain

images of the spatial and temporal distribution of the radiotracer concentration in the

body. Nuclear medicine images convey valuable information about the physiological as

well as structural properties of the imaged organ.

The image quality obtained by nuclear-medicine imaging is limited by the finite

number of the acquired gamma-ray photons. This is a particular problem in time-

sequence imaging studies, where, because the counts are divided into time intervals to

obtain the image sequence, the number of counts in each time interval is lower than in an

equivalent static study. The aim of the work described in this thesis is to reduce the effect

of noise in the imaging process. To understand why noise is difficult to reduce by other

means, let us consider other factors affecting the noise level (defined by the number of

acquired counts).

a) By increasing the dose of the radiotracer to the patient, one would increase the

number of counts, but that is not acceptable since the allowed dose is limited by

safety considerations.

b) An increase in the acquisition time would increase the number of counts, but this is

limited by the potential for patient motion, patient comfort, and cost.

c) One can consider decreasing the number of time intervals, but this would lead to

increased organ motion blur (in cardiac imaging) and/or reduced temporal resolution

(e.g., in brain imaging).

d) In SPECT, one can gain counts by decreasing spatial resolution of the hardware

(i.e., the collimator), but this loss is unacceptable, because it reduces the ability to

visualize small structures such as cardiac defects.


e) Tl-201, a radiotracer that has better physiological properties than Tc-99m, produces

fewer counts. Therefore, it would be preferable to improve noise properties through

better image reconstruction than to use the physiologically less desirable tracer.

For these reasons, improved image reconstruction can provide an important benefit by

reducing the effect of noise.
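The link between the number of counts and the noise level can be made concrete with a short simulation. The sketch below assumes that detected counts follow Poisson statistics, so that the signal-to-noise ratio of a measurement with mean N is approximately the square root of N; the count level and the number of frames are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_snr(mean_counts, trials=200_000):
    """Estimate mean/std of Poisson-distributed counts by simulation."""
    samples = rng.poisson(mean_counts, size=trials)
    return samples.mean() / samples.std()

total_counts = 1600.0   # hypothetical counts in one pixel of a static study
n_frames = 16           # hypothetical number of frames sharing those counts

snr_static = empirical_snr(total_counts)
snr_gated = empirical_snr(total_counts / n_frames)
ratio = snr_static / snr_gated  # theory: sqrt(n_frames)
```

Under this assumption, splitting the same acquisition into 16 frames lowers the per-frame signal-to-noise ratio by a factor of about four, which is the penalty that time-sequence studies have to recover by other means.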

Before describing the proposed image-processing methods, we begin with a brief

review of nuclear medicine imaging, specifically single photon emission tomography

(SPECT) and positron emission tomography (PET), which are the focus of this thesis.

1.2. Tomography by SPECT and PET

In the context of nuclear medicine, tomography is a method of obtaining images that

reflect the concentration of radiotracer at each point in the body, as distinguished from

planar imaging which only produces projections (similar to line integrals) of this

distribution.

SPECT imaging uses radiotracers that emit single photons; PET uses radiotracers that

emit a positron, which undergoes mutual annihilation with a neighboring electron, and

produces two photons.

1.2.1. Single Photon Emission Tomography (SPECT).

In SPECT, a gamma camera is used to detect the emitted photons (gamma rays) (see

Figure 1). Localization of the photon source is made possible by using a collimator,

which is a thick perforated metal sheet placed in front of the detectors. The aim of the

collimator is to allow only parallel rays to reach the detectors. In practice, the collimation

is not perfect and a cone of rays is allowed to pass (this is viewed as a form of blur). With

the collimator in place, the gamma camera measures (neglecting the blur) a set of line

integrals of the radiotracer distribution along lines parallel to the collimator holes


(perpendicular to the camera face). Multiple images are obtained by the gamma camera

(from different angles) from which the two-dimensional (2D) or three-dimensional (3D)

radiotracer concentration can be reconstructed.
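As a rough illustration of what the gamma camera measures, the sketch below computes idealized parallel projections of a small 2D activity map for two camera positions 90 degrees apart. It ignores collimator blur, attenuation, and scatter, and the activity map is a hypothetical example.

```python
import numpy as np

def parallel_projections(image):
    """Return line sums of a 2D image for two camera positions 90 degrees apart."""
    view_0 = image.sum(axis=0)   # camera above the image: sum down each column
    view_90 = image.sum(axis=1)  # camera at the side: sum along each row
    return view_0, view_90

# Hypothetical activity map with two point sources in the same column.
activity = np.zeros((5, 5))
activity[1, 3] = 2.0
activity[3, 3] = 1.0

view_0, view_90 = parallel_projections(activity)
```

In the first view the two sources fall on the same line and cannot be separated, while the second view resolves them; this is why projections from several angles are needed before the distribution can be reconstructed.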

A significant drawback of SPECT is its low sensitivity, which arises because the

collimator passes only parallel rays and discards all others. Since the amount of

radiotracer that can be injected into a patient is limited, the images obtained by SPECT

suffer from a low signal-to-noise ratio. Nevertheless, SPECT is widely used because it is inexpensive

and does not require a cyclotron to produce the radiotracer.

Figure 1. Single Photon Emission Computed Tomography (SPECT) uses a Gamma Camera with a Collimator to Measure the Radiotracer Distribution

Because of the finite size of the collimator holes, each detected photon can be

projected back to a cone of possible origins. This makes the point spread function (PSF)

of the gamma camera depth-dependent (the width of the cone increases with distance

from the camera). Image quality in SPECT is also limited by scatter and by the difficulty

of correcting for the non-uniform attenuation of gamma rays by the body.


1.2.2. Positron Emission Tomography (PET).

PET differs from SPECT primarily in the way that the direction of the gamma rays is

determined. In SPECT, the collimator acts as a physical sieve that aims to allow through

only rays traveling in a certain direction. In PET, an electronic collimation scheme is

used that leads to better sensitivity. The PET system (Figure 2) consists of a ring of

detectors. In PET, because a positron-emitting radiotracer is used, pairs of gamma rays

are emitted, which travel in nearly opposite directions. If two different detectors sense the

two photons at roughly the same time, the photons are assumed to have been created by a

positron annihilation event somewhere along the line connecting the two participating

detectors.
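The electronic collimation described above can be sketched as a simple coincidence sorter. The example assumes a list of single events, each with a detection time and a detector index, and pairs consecutive events that fall within a timing window; the event list, the window width, and the pairing rule are hypothetical simplifications and not a description of any particular scanner.

```python
def find_coincidences(events, window):
    """Pair consecutive single events whose detection times differ by at most `window`.

    events: list of (time, detector_id) tuples, in any order.
    Returns a list of (detector_a, detector_b) pairs, each defining one line
    along which an annihilation is assumed to have occurred.
    """
    ordered = sorted(events)
    pairs = []
    i = 0
    while i < len(ordered) - 1:
        (t0, d0), (t1, d1) = ordered[i], ordered[i + 1]
        if t1 - t0 <= window and d0 != d1:
            pairs.append((d0, d1))
            i += 2  # both singles are consumed by this coincidence
        else:
            i += 1  # unpaired single: discard it and move on
    return pairs

# Hypothetical singles list: times in nanoseconds, detector indices on the ring.
singles = [(100.0, 3), (104.0, 17), (250.0, 8), (400.0, 5), (403.0, 21), (900.0, 2)]
lors = find_coincidences(singles, window=10.0)
```

Because the rule relies on timing alone, two unrelated photons that happen to arrive within the window would also be accepted as a pair.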

Figure 2. Coincidence Detection. True coincidence (solid line), singles and random coincidence (dashed line), and scatter coincidence (broken line) [1]¹.

The causes of blur in PET are different from those in SPECT. The principal blurring

effects are: 1) finite detector size, 2) positron range (the positron travels some distance

before emitting the photons that reveal its position), 3) angulation error (the photon pairs

do not travel in exactly opposite directions), and 4) scatter (the photons may be deflected,

causing a misleading indication of their point of origin; see Figure 2).

¹ Corresponding to numbered references in the bibliography

An additional source of image degradation is the contribution of accidental

coincidences, also known as randoms. Randoms are false events caused when two

photons emitted from separate events happen to be detected at the same instant of time.

The imaging system cannot discriminate these accidental coincidences from true ones.

1.3. PET and SPECT Studies based on Image Sequences

To study physiological processes in the body, PET and SPECT studies are frequently

based on a sequence of images depicting changes in the radiotracer distribution with time.

We will refer to one image in a sequence as a frame.

There are two ways to acquire an image sequence in PET and SPECT: gated studies

and dynamic studies. In a dynamic study, image frames are acquired sequentially as in a

video or movie, usually over a period of 10 minutes to 2 hours. A dynamic study usually

seeks to determine the way in which the radiotracer interacts with the body as a function

of time, and thereby learn something about the tissue. The organs imaged in a dynamic

study are assumed to be motionless; the temporal variations of the image are due only to

the dynamics of radiotracer concentration. From a dynamic study, one usually measures

the time variation of radiotracer concentration in a region of interest (ROI). A graph of

radiotracer concentration (activity) as a function of time is known as a time-activity

curve.
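A time-activity curve can be computed directly from an image sequence and a region of interest. The sketch below assumes the sequence is stored as an array of shape (frames, rows, columns) and the ROI as a boolean mask; the array contents are hypothetical.

```python
import numpy as np

def time_activity_curve(frames, roi_mask):
    """Mean activity inside roi_mask for each frame of a (T, H, W) sequence."""
    return frames[:, roi_mask].mean(axis=1)

# Hypothetical three-frame sequence with uniform activity 1, 3, and 2.
frames = np.stack([np.full((4, 4), v) for v in (1.0, 3.0, 2.0)])

# Hypothetical ROI: the central 2x2 block of the image.
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True

tac = time_activity_curve(frames, roi)
```

Each entry of `tac` is one point of the curve; averaging over the ROI trades spatial detail for lower noise in the temporal measurement.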

A gated study is used to image an organ that moves periodically. Gated imaging,

which is similar to stroboscopy, produces a looped image sequence, with each cycle of


the loop representing one cycle of the periodic organ motion. For example, one cycle of a

gated cardiac sequence depicts one cardiac cycle (one heartbeat).

In imaging the heart, there are actually two periodic motions involved: the motion of

the heart itself, and the motion of the entire torso caused by respiration. Usually only

cardiac gating is used, but respiratory gating has recently received increasing interest.

Using both cardiac and respiratory gating would further reduce motion blur, but it also reduces the

number of counts per frame, and thus leads to unacceptably noisy images. Using the

proposed image-reconstruction techniques, we hope to make respiratory gating feasible

by reducing the effect of noise.

A gated image sequence is obtained by synchronizing the data acquisition process to

the cardiac rhythm, beginning at the peak of the R-wave, as measured by

electrocardiography (see Figure 3). Each frame of a gated sequence is actually the sum of

many time intervals. For example, frame 1 of the sequence is obtained by summing

images obtained during the short time interval following the beginning of many different

cardiac cycles, frame 2 is obtained by summing all of the second time intervals, etc.

Typically, a gated cardiac image sequence consists of eight frames showing one period of

the cardiac cycle, synthesized from hundreds of actual heartbeats. By using the proposed

techniques, we hope to reduce the noise effect sufficiently to make 16-frame acquisitions

of acceptable quality.
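The way a gated frame accumulates counts from many heartbeats can be sketched as follows. The example assumes list-mode event times and a list of R-wave times, and it divides each R-R interval into equal fractions; this phase-based binning is one common convention and is used here only as an assumption, and all numerical values are hypothetical.

```python
import numpy as np

def gate_events(event_times, r_wave_times, n_frames):
    """Count events per gate frame, using the phase within each R-R interval."""
    event_times = np.asarray(event_times, dtype=float)
    r_wave_times = np.asarray(r_wave_times, dtype=float)
    # Index of the R-wave that starts the cardiac cycle containing each event.
    beat = np.searchsorted(r_wave_times, event_times, side="right") - 1
    # Keep only events that lie inside a complete R-R interval.
    valid = (beat >= 0) & (beat < len(r_wave_times) - 1)
    beat = beat[valid]
    start = r_wave_times[beat]
    length = r_wave_times[beat + 1] - start
    phase = (event_times[valid] - start) / length  # in [0, 1)
    frame = np.minimum((phase * n_frames).astype(int), n_frames - 1)
    return np.bincount(frame, minlength=n_frames)

# Hypothetical data: three complete heartbeats of one second each.
r_waves = [0.0, 1.0, 2.0, 3.0]
events = [0.05, 0.55, 1.05, 1.60, 2.10, 2.95, 3.5]

counts = gate_events(events, r_waves, n_frames=8)
```

Events from different heartbeats that share the same phase land in the same frame, and the event after the last R-wave is dropped because its cycle is incomplete. Doubling `n_frames` halves the expected counts per frame, which is the noise penalty discussed above.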

The advantage of gating is that it reduces the motion blur that would be present if the

heart were imaged with a long integration time. This allows the clinician to observe

subtleties of wall motion, and small perfusion defects that would not be visible otherwise.


Figure 3. Gating by Electrocardiography

The cost of both dynamic and gated image acquisitions is that the number of counts in

each frame is less than what would be obtained by integrating counts over the entire

imaging time [2]. Hence, the noise in each frame is higher than in a static image. The

additional noise calls for special image reconstruction and processing methods, which are

the subject of this thesis.

1.4. Content-Adaptive Mesh Modeling

In Chapter 2, we will consider content-adaptive mesh modeling (CAMM) for image

representation. In recent years, mesh modeling of images has found several applications

in image processing, including image compression, motion tracking and compensation,

and medical image analysis (see, for example, [3-7]). Mesh modeling of an image

involves partitioning the domain of the image into a collection of non-overlapping

(generally polygonal) patches, called mesh elements, then describing the intensity over

each element through interpolation. Mesh modeling provides an efficient and compact

representation of an image and, more importantly, is an effective tool for tracking rigid

and non-rigid motion in image sequences. A critical issue in mesh modeling is how to

determine the best mesh structure for a given image function. Several approaches for

mesh generation have been proposed in the literature. One approach is to begin with an


initial model of the image (such as a coarse regular mesh), then to refine the model in a

hierarchical manner in order to reduce the approximation error [8-10]. Other approaches

include physics-based modeling [11], and global mesh optimization [6]. In our group’s

initial work [12] a computationally efficient approach for content-adaptive mesh

generation was investigated. The proposed approach was motivated by non-uniform

sampling [13], in which the mesh model is treated as a representation of an image using

non-uniform image samples. The basic idea was to generate a mesh such that dense

image samples (correspondingly, fine mesh elements) are placed in regions of the image

containing high-frequency features while sparse samples (i.e., coarse mesh elements) are

placed in regions containing predominantly low-frequency features. The proposed mesh

generation algorithm is fast, non-iterative, easy to implement, and proven to be relatively

accurate.

Numerical results demonstrate that the new approach can yield a compact, accurate

representation of images, when compared with several other methods, at a very low

computational cost.

1.5. Novel Tomography Image Reconstruction

In Chapters 3 and 4 we propose the use of a CAMM for tomographic image

reconstruction. In Chapter 3 a method is proposed in which the reconstructed image is

modeled by an efficient mesh representation, as described in Chapter 2, and then

reconstructed by estimation of the model parameters (nodal values) from the measured

data. In Chapter 4 we extend CAMM reconstruction to dual-modality reconstruction

where we incorporate an anatomical prior (e.g. obtained from CT or MRI).

In general, the use of a CAMM reduces the number of unknown parameters to be

estimated; it can therefore greatly alleviate the ill-conditioned nature of the reconstruction

problem, leading to improved quality in the reconstructed images. In addition, it can lead

to development of efficient numerical reconstruction algorithms. The proposed methods

are tested using simulated gated cardiac-perfusion SPECT images. Our results indicate

that, among the methods tested, the proposed approach achieves the best performance in

terms of image quality and computation time, and can also reduce the memory

requirement.

In the literature a great many methods have been developed for image reconstruction.

The following is a brief review of this literature. A rough outline of developments in

tomographic image reconstruction is shown in Figure 4. This diagram portrays the field

as it relates to the present work. Other topics in the reconstruction field, such as scatter

and attenuation correction, 3D rebinning methods, diffraction tomography, etc., are not

depicted in the diagram. Also, only one representative reference is shown for each topic,

but it should be noted that a large literature exists for many of the topics shown in the

diagram.

Figure 4. Classification of tomographic image reconstruction algorithms.

In clinical practice, images are usually reconstructed from the data by filtered

backprojection (FBP), a long-standing method based on an idealized model of the

imaging process [14]. It is well known that, by itself, FBP produces images that are far

from optimal, particularly for use in the performance of quantitative tasks [15]. However,

FBP continues to be used in practice, largely because of its computational advantages.

A great many alternative reconstruction methods have been proposed to improve the

quality and quantitative accuracy of PET and SPECT images. These methods achieve

their aim by incorporating more-appropriate models of the imaging process, and by

introducing a priori information about the object. In PET, imaging models may include

space-variant detector-aperture blur, scatter, random coincidences, positron range,

angulation error, nonuniform sampling, and Poisson noise. Algorithms other than FBP

usually assume knowledge of the spatial response functions of the imaging process;

however, we have recently investigated approaches to the realistic problem in which

these factors are not known exactly [16-18].

Image-reconstruction methods include Fourier, statistical (including least-squares),

and algebraic (including constraint-based) approaches. In addition, methods for restoring

the projection data (sinogram) prior to FBP reconstruction have been explored; these are

reviewed in the following section.

Fourier methods include FBP and direct Fourier inversion [19, 20], both results of the

well-known Central Slice Theorem of tomography [19]. In theory these are equivalent

analytic solutions to the ideal problem; in practice, direct inversion requires fewer

computations, but involves an interpolation step in the Fourier domain.

Statistical image-reconstruction algorithms include maximum-likelihood, maximum a

posteriori (MAP, or Bayesian), maximum-entropy, and least-squares methods.

Maximum-likelihood (ML) methods [21-23] seek the image that maximizes the

likelihood function, i.e., the conditional probability density function (PDF) of the data

given the image. In nuclear medicine the data are usually assumed Poisson-distributed,

thus the expectation-maximization (EM) algorithm proposed by Dempster, et al., [24] has

been widely studied as a way to compute ML estimates since it provides tractable

iterative solutions for exponential PDFs.

Unconstrained ML image estimates are exceedingly noisy since they contain no prior

information about the image’s true characteristics [25]. To address this problem

considerable attention has been paid to maximum a posteriori (MAP), or Bayesian,

methods which incorporate a priori image models in their computations [26-37]. A class

of methods in which a priori knowledge is obtained from anatomical modalities is

explored in [38] and [39]. Penalized weighted least-squares methods have been proposed

which have a similar goal but do not rest on the Poisson assumption [40]. Snyder and

Miller [25] proposed the method of sieves as a way to control noise in ML estimation.

Various criteria for premature stopping of the EM algorithm have also been proposed to

avoid noise amplification [41]. Maximum-entropy methods use an entropy criterion to

reduce image artifacts [42-44].

Algebraic methods include a variety of iterative techniques that, in general, omit

explicit statistical models of the data. PET and SPECT are usually modeled as linear

systems, thus the solution of a system of linear equations lies at the heart of the

reconstruction problem. Kaczmarz [45] proposed an iterative method for solving linear

systems. Applied to image reconstruction, this method falls in the category of algebraic

reconstruction techniques (ART) [46-49] and is a special case of the method of

projections onto convex sets (POCS) [50-53], an approach that frames the reconstruction

problem as one of constraint satisfaction.

In recent years, accelerated iterative techniques have been developed, in which the

iterative steps act only on subsets of the pixels [54-56].

Alternative model-based 2D and 3D reconstruction approaches have been suggested

for certain applications. For example, cylindrical models were proposed in [57] and

surface models were used in [58, 59].

Methods of four-dimensional (4D) or spatio-temporal processing have received

increasing interest lately. Several 4D methods have been proposed by Wernick, et al., for

reconstruction of motion-free images, such as those obtained in dynamic PET studies

[60]. Lalush and Tsui [61] applied 4D image reconstruction to cardiac SPECT images,

but did not incorporate motion estimation explicitly in their techniques. In the broader

image-processing field, motion-compensated processing is a well-known approach to

reduce the noise in an image sequence [62]. In the nuclear medicine field, Klein, et al.,

[63] developed a motion-compensated summing method using motion estimation, based

on the optical-flow method [64, 65], for obtaining a single image from a gated PET study.

In addition, pre-processing of the acquired data prior to reconstruction has also been

proposed by various authors [18, 66-69]. Further, a review of post-processing techniques

for general image-sequence processing can be found in [62, 70], and post-processing

temporal regression smoothing is described in [71].

In Chapter 5 we present a new 4D reconstruction and post-processing approach for

reducing noise in gated SPECT perfusion images while preserving accurate cardiac

motion. The method is based on motion-compensated temporal smoothing using a

deformable content-adaptive mesh to model cardiac motion. The fast CAMM method

described in Chapter 2 is used for initial mesh generation. This mesh is then

deformed to track cardiac motion, and smoothing is performed along motion trajectories

through the space-time coordinate system. Our preliminary results show that the proposed

method is very promising.

Note that, in Figure 4, we have indicated a proposed algorithm for CAMM

reconstruction and a proposed 4D CAMM reconstruction and motion-compensated

filtering approach.

1.6. Similar Component Analysis

In addition, in Chapter 5 a new method for decomposing the image sequence into a

set of distinct temporal basis functions is presented. Each distinct basis function

corresponds to a unique direction, independent of amplitude, in an M-dimensional space

(for a sequence consisting of M frames).

Identification of regions with distinct time behaviors is expected to have the

following applications.

1) Regions in the imaged organ that share similar interaction with the radiotracer can

be identified by determining that they share similar time behaviors. From this, an

automated algorithm for radiotracer kinetic-model parameter estimation can be derived.

2) The method can be used to compute regions, having similar time behavior, to

which similar temporal smoothing should be applied during image reconstruction.

3) The method can be used to identify important temporal functions in functional

neuroimaging images, which are used to characterize the function of the brain during

various tasks.

The problem of identifying important temporal basis functions in an image sequence

is essentially a problem of clustering multivariate statistical signals. However, in medical

imaging it is usually desirable for the clustering to be independent of signal amplitude.

Thus, clustering methods based on Euclidean or Mahalanobis distance are not always

suitable.

Most traditional clustering algorithms, e.g., k-means [72] or Gaussian mixture models

[24, 73], depend on the signal amplitude. Principal component analysis (PCA) and

independent component analysis (ICA) [74] have been used to extract distinct directions;

however, both methods have potential disadvantages in that the desired basis functions are

not necessarily orthogonal (as in PCA) and the components are not necessarily

independent (as in ICA).

Recently, clustered component analysis (CCA) was developed in [75] in an attempt to

avoid the amplitude dependency of the conventional algorithms. In Chapter 5 we propose

a different method we call similar component analysis (SCA) that incorporates a data

model in which class labeling, and accordingly the time-activity curve (TAC) estimation,

does not depend on the signal amplitude but rather on distinct directions in the M-dimensional space.

Promising preliminary results are shown.

1.7. CAMM Extensions

In Chapter 6 we present extensions of the use of CAMM. Specifically, we propose a

de-noising procedure that exploits the dimensionality reduction introduced by CAMM.

Further, in the same chapter, an extension of deformable mesh modeling is proposed to

obtain a robust digital watermarking procedure. The robustness of the proposed method

is reflected in the algorithm’s ability to compensate for a random bending attack.

CHAPTER II

2. CONTENT ADAPTIVE MESH MODELING

In a mesh model representation, the function (e.g. image) support domain is

subdivided into a number of mesh elements, the vertices of which are called nodes. The

function is then obtained over each element by interpolation from its nodal values [76].

Further, in CAMM the mesh elements are placed adaptively to correspond to the local

content of the image. The number of resulting mesh nodes is typically much less than the

number of pixels. Using this compact representation in tomographic image

reconstruction helps to combat the high level of noise. In this chapter we derive the CAMM

for the representation of scalar-valued functions with 2D and 3D support, as well as

vector-valued functions with a 2D support region, which are used in later chapters as part of the

tomographic image reconstruction process.

First we provide a theoretical basis for the concept behind this approach: a result on

the error bound of a mesh model, based on the theory of function interpolation. From this

result, a more accurate way of placing mesh elements in the image domain

according to the image content is proposed. The result is that a much more accurate mesh

representation can be obtained for the image at a very low computational cost.

2.1. Scalar 2D Function Mesh Modeling

2.1.1. Theoretical Basis.

Mesh representation

Let $f(\mathbf{p})$ denote the image function defined over a domain $D$, with $\mathbf{p} = (x, y)$. In a mesh model, the domain $D$ is partitioned into a number, say $M$, of non-overlapping mesh elements denoted by $D_m$, $m = 1, 2, \ldots, M$. Then the image function is approximated as

$$\hat{f}(\mathbf{p}) = \sum_{n=1}^{N} f(\mathbf{p}_n)\,\varphi_n(\mathbf{p}), \tag{2.1}$$

where $\mathbf{p}_n$ is the $n$th mesh node defined by $\mathbf{p}_n = (x_n, y_n)$, $\varphi_n(\mathbf{p})$ is the interpolation basis function associated with $\mathbf{p}_n$, and $N$ is the total number of mesh nodes used. Note that the support of each basis function $\varphi_n(\mathbf{p})$ is limited to those elements $D_m$ attached to node $n$.

In practice, the elements Dm are often chosen to be triangles or quadrangles because

of the geometrical simplicity of these shapes. In this study triangular elements are used.

The mesh representation in Eq. (2.1) assumes a form of signal representation based on non-uniform sampling. Because each basis function $\varphi_n(\mathbf{p})$ in Eq. (2.1) has support only over the elements $D_m$ attached to its node, the total contribution of a particular nodal value $f(\mathbf{p}_n)$ to the image is strictly limited to those elements associated with $\mathbf{p}_n$. For notational simplicity, it is generally assumed in this section that the image function $f(\mathbf{p})$ is defined over a two-dimensional (2D) domain $D$. Note, however, that the rest of the development can also be extended directly to a higher-dimensional case such as 3D. This will be explored in Sect. 2.2.

Error Analysis

The following question arises immediately regarding the mesh representation in

Eq. (2.1): how accurate is it when used to approximate an image? One would expect its

accuracy to depend to a large degree on the geometry of the mesh elements $D_m$ and their

associated basis functions $\varphi_n(\mathbf{p})$. While it is generally true that a higher-order basis can

provide a better approximation than a lower-order one, its model complexity also

increases considerably. For example, in the case of triangular elements, a piece-wise

linear interpolation would require the use of only three nodes (i.e., the vertices) over each

element, while a piece-wise quadratic interpolation would require as many as six nodes

(one extra node along each side besides the three vertices) [77]. As a result, linear basis

functions are often preferred in practice for their simplicity. Accordingly, in this work

we focus on the use of linear basis functions. However, the approach taken in this work

can be readily extended to other types of basis functions as well.

While error bounds in different mathematical forms have been derived for the

representation in Eq. (2.1) in finite element analysis [77, 78] where the goal was to

provide algorithmic convergence analysis, we derive below a result for the purpose of

adaptive mesh generation. In the interest of brevity, we simply state the main result

below, while a detailed derivation can be found in [79] and Appendix A. The derivation

of the error bound for a more general case can be found in [80].

Theorem 1. Let $T$ denote a triangle on the two-dimensional plane $R^2$, and let $f(\mathbf{p})$ denote a real-valued image function defined on $T$ whose 2nd partial derivatives are continuous on $T$. Assume that $\hat{f}(\mathbf{p})$ is the linear interpolation of $f(\mathbf{p})$ at the vertices of $T$. Then for each point $\mathbf{p} \in T$

$$\left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \le \frac{M_2 h^2}{4}, \tag{2.2}$$

where $h$ is the length of the longest side of the mesh element $T$ and $M_2$ is given by

$$M_2 \equiv \max_{\mathbf{p} \in T} \; \max_{\theta \in [0, 2\pi]} \left| f''_\theta(\mathbf{p}) \right|, \tag{2.3}$$

where $f''_\theta(\mathbf{p})$ denotes the 2nd-order directional derivative of $f(\mathbf{p})$ at point $\mathbf{p}$ along the unit vector $\mathbf{u}_\theta = (\cos\theta, \sin\theta)^T$.
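As a quick numerical sanity check of the bound in Eq. (2.2), the sketch below samples the interpolation error over one triangle. The test function, the triangle, and all names are hypothetical choices made for illustration only; the test function is quadratic, so that its Hessian is constant and $M_2$ is known exactly.

```python
import math

def linear_interpolant(tri, values, p):
    """Linear interpolation over a triangle using barycentric weights."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = p
    det = (x1 - x3) * (y2 - y3) - (y1 - y3) * (x2 - x3)
    s = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    t = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return s * values[0] + t * values[1] + (1.0 - s - t) * values[2]

def f(p):
    """Hypothetical test function with constant Hessian [[2, 0.5], [0.5, 2]]."""
    x, y = p
    return x * x + 0.5 * x * y + y * y

# Eigenvalues of the Hessian are 2.5 and 1.5, so M2 = 2.5 everywhere.
M2 = 2.5

tri = [(0.0, 0.0), (1.0, 0.2), (0.3, 0.9)]          # arbitrary test triangle
edges = ((0, 1), (1, 2), (2, 0))
h = max(math.hypot(tri[a][0] - tri[b][0], tri[a][1] - tri[b][1])
        for a, b in edges)                           # longest side
values = [f(v) for v in tri]

# Sample the triangle on a barycentric grid and record the largest error.
K = 50
max_err = 0.0
for i in range(K + 1):
    for j in range(K + 1 - i):
        s, t = i / K, j / K
        r = 1.0 - s - t
        p = (s * tri[0][0] + t * tri[1][0] + r * tri[2][0],
             s * tri[0][1] + t * tri[1][1] + r * tri[2][1])
        err = abs(f(p) - linear_interpolant(tri, values, p))
        max_err = max(max_err, err)

bound = M2 * h * h / 4.0
print(max_err, bound)
```

For this particular triangle the sampled error stays below the bound, which is consistent with the theorem but, being a single example, is of course not a proof.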

2.1.2. Mesh Modeling.

The result in Eq. (2.2) provides a fundamental basis for the development of our mesh

generation algorithm to be described. First, it states that the approximation error bound is

proportional to the maximum magnitude assumed by the 2nd directional derivatives of the image

function over the element T . Second, it states that this error bound is also proportional to

the square of the length of the longest side of T . Note that the latter is simply

proportional to the area of T provided that it is not excessively elongated (i.e., T not

having an angle too small). Based on the above observation, we argue that a good mesh

generation scheme should try to place small (in area) elements in regions of an image

where its 2nd directional derivative is large, and conversely, larger elements should be

used in regions where the 2nd directional derivative is relatively small in order to achieve

a balanced error level throughout the image. Equivalently, the resulting mesh from such a

mesh generation scheme will have the property that the local spatial density of the mesh

nodes (i.e., the vertices of the triangles) is proportional to the magnitude of the 2nd

directional derivative of the image.

2.1.3. Proposed Mesh Generation Algorithm.

We propose a mesh generation scheme that will achieve this goal. It consists of three

major steps, as illustrated in Figure 5. First, a feature map, $\sigma(\mathbf{p})$, is extracted from the

image. This feature map is used to describe the spatial distribution of the largest magnitude

of its second directional derivatives. In the second step, we employ a classical

error-diffusion algorithm (the well-known Floyd-Steinberg algorithm [81]) to place mesh

nodes with density proportional to the extracted feature map $\sigma(\mathbf{p})$ in the image domain.

In digital halftoning the objective is to use the spatial density of ink dots to represent the

image intensity. The Floyd-Steinberg algorithm is a numerically efficient, single-pass

algorithm that can produce good results and is widely used in digital halftoning [82]. Its

basic idea is to distribute ink dots adaptively in the image domain so that the density is

proportional to the local image intensity.

Figure 5. Proposed mesh-generation procedure.

In the third and final step, a 2D Delaunay triangulation algorithm [83] is used to

connect the obtained mesh nodes. Delaunay triangulation is known to yield a well-

structured mesh at a reasonable computational cost. Most importantly, the use of

Delaunay triangulation in our case avoids excessively elongated elements,

thereby further reducing the error bound in Eq. (2.2). The resulting mesh structure

consists of triangular mesh elements that are automatically adapted to the content of the

image. The details of these steps are described in the rest of the work. Before we proceed,

however, we want to discuss one issue associated with the assumption made in Theorem

1, that is, the image function is assumed to have 2nd partial derivatives which are also

continuous on T . One may question whether this condition is too restrictive for

describing images in practice. After all, features such as edges and textures often

exist in real-life images, and these features are often associated with the image

not being continuous, let alone differentiable. Generally speaking,

however, an analog image function $f(\mathbf{p})$ is first converted to a digital form through

sampling before it can be processed by a computer. According to Shannon’s sampling

theorem, the analog image function $f(\mathbf{p})$ has to be band-limited (with bandwidth defined

by the physical imaging system) to avoid aliasing [84, 85]. Based on Fourier

analysis, a band-limited function is known to be analytic on its entire domain,

meaning that all of its derivatives exist and are continuous [86, 87].

Therefore, the seemingly restrictive assumption is not an issue at all for digital

images, as the differentiability is always met by their analog counterpart.

2.1.4. Content-Adaptive Mesh Generation.

Image Feature-Map Extraction

For convenience, let’s denote by $G(\mathbf{p})$ the largest magnitude of the 2nd-order directional derivative of $f(\mathbf{p})$ at point $\mathbf{p}$, that is,

$$G(\mathbf{p}) = \max_{\theta \in [0, 2\pi]} \left| f''_\theta(\mathbf{p}) \right|. \tag{2.4}$$

The feature map function $\sigma(\mathbf{p})$ is determined as

$$\sigma(\mathbf{p}) = \left[ \frac{G(\mathbf{p})}{A} \right]^{\gamma}, \tag{2.5}$$

where $A$ is the largest value of $G(\mathbf{p})$ over the image domain $D$, and $\gamma$ is a constant, typically with $0 < \gamma \le 2$. The role of $A$ in Eq. (2.5) is simply to normalize the 2nd derivative magnitude $G(\mathbf{p})$ to the range between 0 and 1, while $\gamma \in (0, 1)$ is used to enhance weak edge features in the image and, conversely, $\gamma \in (1, \infty)$ is used to suppress weak features, e.g., noise.

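For computation of $G(\mathbf{p})$, one can derive the following:

Corollary 1. Let $\mathbf{H}_{\mathbf{p}}$ denote the Hessian matrix of $f(\mathbf{p})$ at $\mathbf{p}$ and let $\lambda_{1,2}(\mathbf{p})$ denote the two eigenvalues of $\mathbf{H}_{\mathbf{p}}$. Then we have

$$G(\mathbf{p}) = \max\left\{ |\lambda_1(\mathbf{p})|, |\lambda_2(\mathbf{p})| \right\}. \tag{2.6}$$

Proof. Recall from calculus that the directional derivative of $f(\mathbf{p})$ at $\mathbf{p}$ along a unit directional vector $\mathbf{u}_\theta = (\cos\theta, \sin\theta)^T$, denoted by $f'_\theta(\mathbf{p})$, is related to its gradient $\nabla f(\mathbf{p})$ by

$$f'_\theta(\mathbf{p}) = \mathbf{u}_\theta^T \nabla f(\mathbf{p}). \tag{2.7}$$

Thus, the 2nd derivative of $f(\mathbf{p})$ along $\mathbf{u}_\theta$ at $\mathbf{p}$ can be written as

$$f''_\theta(\mathbf{p}) = \mathbf{u}_\theta^T \cdot \nabla\!\left( \mathbf{u}_\theta^T \cdot \nabla f(\mathbf{p}) \right) = \mathbf{u}_\theta^T \mathbf{H}_{\mathbf{p}} \mathbf{u}_\theta, \tag{2.8}$$

where the Hessian matrix $\mathbf{H}_{\mathbf{p}}$ is defined as

$$\mathbf{H}_{\mathbf{p}} = \begin{bmatrix} f''_{xx}(\mathbf{p}) & f''_{xy}(\mathbf{p}) \\ f''_{xy}(\mathbf{p}) & f''_{yy}(\mathbf{p}) \end{bmatrix}, \tag{2.9}$$

where $f''_{xx}(\mathbf{p})$, $f''_{yy}(\mathbf{p})$ and $f''_{xy}(\mathbf{p})$ denote $\dfrac{\partial^2 f(x, y)}{\partial x^2}$, $\dfrac{\partial^2 f(x, y)}{\partial y^2}$ and $\dfrac{\partial^2 f(x, y)}{\partial x\, \partial y}$, respectively.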

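Clearly $\mathbf{H}_{\mathbf{p}}$ is a real-valued, symmetric matrix. From linear algebra, the eigenvalues and eigenvectors of $\mathbf{H}_{\mathbf{p}}$ are also real-valued. Furthermore, for any vector $\mathbf{u} \in R^2$,

$$\left| \mathbf{u}^T \mathbf{H}_{\mathbf{p}} \mathbf{u} \right| \le \max\left\{ |\lambda_1(\mathbf{p})|, |\lambda_2(\mathbf{p})| \right\} \|\mathbf{u}\|^2, \tag{2.10}$$

where $\|\mathbf{u}\|$ is the Euclidean norm of $\mathbf{u}$, with equality attained when $\mathbf{u}$ is an eigenvector associated with the eigenvalue of larger magnitude. Thus, from Eqs. (2.4), (2.8) and (2.10), and since $\|\mathbf{u}_\theta\| = 1$, it follows that

$$G(\mathbf{p}) = \max_{\theta \in [0, 2\pi]} \left| f''_\theta(\mathbf{p}) \right| = \max\left\{ |\lambda_1(\mathbf{p})|, |\lambda_2(\mathbf{p})| \right\}. \tag{2.11}$$

Corollary 1 thus follows immediately.

The eigenvalues of $\mathbf{H}_{\mathbf{p}}$ are computed as follows:

$$\lambda_{1,2}(\mathbf{p}) = \frac{1}{2}\left( f''_{xx}(\mathbf{p}) + f''_{yy}(\mathbf{p}) \right) \pm \sqrt{ \frac{1}{4}\left( f''_{xx}(\mathbf{p}) - f''_{yy}(\mathbf{p}) \right)^2 + \left( f''_{xy}(\mathbf{p}) \right)^2 }. \tag{2.12}$$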
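The feature-map computation of Eqs. (2.5), (2.6) and (2.12) can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the central finite-difference stencils, the unit pixel spacing, the border replication, and the function name `feature_map` are all assumptions made here, and no noise pre-filtering is included.

```python
import math

def feature_map(img, gamma=1.0):
    """Return sigma(i, j) = (G(i, j) / A) ** gamma for a 2D list-of-lists image.

    G is the larger eigenvalue magnitude of a finite-difference Hessian
    (assumed stencil); A is the maximum of G over the image.
    """
    rows, cols = len(img), len(img[0])

    def px(i, j):
        # Replicate border pixels (an assumption of this sketch).
        i = min(max(i, 0), rows - 1)
        j = min(max(j, 0), cols - 1)
        return img[i][j]

    G = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            fxx = px(i, j + 1) - 2.0 * px(i, j) + px(i, j - 1)
            fyy = px(i + 1, j) - 2.0 * px(i, j) + px(i - 1, j)
            fxy = (px(i + 1, j + 1) - px(i + 1, j - 1)
                   - px(i - 1, j + 1) + px(i - 1, j - 1)) / 4.0
            mean = 0.5 * (fxx + fyy)
            root = math.sqrt(0.25 * (fxx - fyy) ** 2 + fxy ** 2)
            # Eq. (2.12) gives the eigenvalues mean +/- root; Eq. (2.6) takes
            # the larger magnitude.
            G[i][j] = max(abs(mean + root), abs(mean - root))

    A = max(max(row) for row in G)
    if A == 0.0:
        return G  # flat image: the feature map is identically zero
    return [[(g / A) ** gamma for g in row] for row in G]

# Hypothetical test image: a vertical step edge between columns 3 and 4.
step = [[0.0 if j < 4 else 100.0 for j in range(8)] for _ in range(6)]
sigma = feature_map(step, gamma=1.0)
```

In this toy example the feature map is nonzero only in the two columns adjacent to the edge, which is where mesh nodes would subsequently be concentrated.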

Adaptive Mesh-Node Placement

The Floyd-Steinberg algorithm [81] is a classical error-diffusion algorithm widely

used in digital halftoning, where the spatial density of ink dots is used to represent the

image intensity. Its basic idea is to distribute adaptively the ink dots in the image domain

according to the image intensity so that the spatial density of the ink dots varies in

proportion to the latter (see Figure 6). This algorithm therefore serves as a natural

tool for the placement of mesh nodes in our application. The Floyd-Steinberg algorithm is

a simple and efficient one-pass algorithm.

Figure 6. Detail of the image processed by error diffusion.

For this purpose we assume that $\sigma(\mathbf{p})$ is computed in the form of discrete pixels, denoted by $\sigma(i, j)$ at pixel $(i, j)$. The specific steps of the algorithm are as follows:

1. Beginning with the first pixel in the image, proceed in a raster scanning order.

2. At each pixel $(i, j)$, compare the feature map $\sigma(i, j)$ against a prescribed threshold $q$. For convenience, define

$$b(i, j) = \begin{cases} 1, & \text{if } \sigma(i, j) \ge q/2, \\ 0, & \text{otherwise}. \end{cases} \tag{2.13}$$

Then a mesh node is placed at $(i, j)$ when $b(i, j) = 1$, and no node is placed otherwise. In addition, compute the quantization error at $(i, j)$ as

$$e(i, j) = \sigma(i, j) - b(i, j)\, q. \tag{2.14}$$

3. Diffuse the quantization error at $(i, j)$ to its four immediate causal neighbors in proportions. Specifically, the values of the feature map at the neighboring pixels are adjusted in the following manner:

$$\begin{aligned} \sigma(i, j+1) &= \sigma(i, j+1) + w_1 e(i, j), & \sigma(i+1, j-1) &= \sigma(i+1, j-1) + w_2 e(i, j), \\ \sigma(i+1, j) &= \sigma(i+1, j) + w_3 e(i, j), & \sigma(i+1, j+1) &= \sigma(i+1, j+1) + w_4 e(i, j). \end{aligned} \tag{2.15}$$

In this study we set the weights $w_i$, $i = 1, 2, 3, 4$, to 7/16, 3/16, 5/16, and 1/16, respectively.

4. Repeat Steps 2 and 3 until the end of the image is reached.

The prescribed threshold $q$ in Eq. (2.13) is used to control the number of mesh nodes produced by the mesh-generation procedure. We have shown previously [79] that $q$ is related to the number of resulting mesh nodes $N$ in the following way:

$$q \approx \frac{1}{N} \sum_{i} \sum_{j} \sigma(i, j), \tag{2.16}$$

where the summation is over all the pixels in the image domain.
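The four steps above, together with the choice of $q$ from Eq. (2.16), can be sketched as follows. This is a minimal illustration under stated assumptions: plain raster order is used, error that would be diffused outside the image is simply discarded, and the function name `place_nodes` is hypothetical.

```python
def place_nodes(sigma, q):
    """Place mesh nodes by Floyd-Steinberg error diffusion of the feature map.

    sigma: 2D list-of-lists feature map; q: threshold of Eq. (2.13).
    Returns a list of (i, j) pixel positions where nodes are placed.
    """
    rows, cols = len(sigma), len(sigma[0])
    s = [row[:] for row in sigma]            # working copy; input is untouched
    w1, w2, w3, w4 = 7 / 16, 3 / 16, 5 / 16, 1 / 16
    nodes = []
    for i in range(rows):                    # step 1: raster scan
        for j in range(cols):
            b = 1 if s[i][j] >= q / 2 else 0  # step 2: Eq. (2.13)
            if b:
                nodes.append((i, j))
            e = s[i][j] - b * q               # Eq. (2.14)
            # Step 3: Eq. (2.15); neighbors outside the image are skipped
            # (an assumption of this sketch).
            if j + 1 < cols:
                s[i][j + 1] += w1 * e
            if i + 1 < rows:
                if j - 1 >= 0:
                    s[i + 1][j - 1] += w2 * e
                s[i + 1][j] += w3 * e
                if j + 1 < cols:
                    s[i + 1][j + 1] += w4 * e
    return nodes

# Hypothetical example: uniform feature map, target of about 256 nodes.
sigma = [[0.25] * 32 for _ in range(32)]
target_nodes = 256
q = sum(sum(row) for row in sigma) / target_nodes   # Eq. (2.16)
nodes = place_nodes(sigma, q)
print(len(nodes))
```

Because some error is lost at the image borders, the resulting node count only approximates the target, which is in keeping with the approximate relation in Eq. (2.16).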

Delaunay triangulation

Delaunay triangulation connects a given set of mesh nodes in such a way that the

circle circumscribing any triangular element contains no nodal points other than those

belonging to that triangle (see Figure 7), except in the case where four or more nodal points

are co-circular. Delaunay triangulation can yield a well-structured mesh at a reasonable

computational cost. Most importantly, the use of Delaunay triangulation in our case can

avoid producing excessively elongated elements, thereby further reducing the error bound

in Eq.(2.2).

Figure 7. Example of Delaunay triangulation. Note that the circle circumscribing any triangular element contains no other nodes.
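The empty-circumcircle property that defines a Delaunay triangulation can be checked directly. The sketch below is only a verification aid, not a triangulation algorithm, and it is not the routine of [83]; the function names and the four-node example are hypothetical.

```python
def orient(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counterclockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    rows = []
    for v in (a, b, c):
        dx, dy = v[0] - p[0], v[1] - p[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    # The sign of the determinant depends on the vertex ordering.
    return det * orient(a, b, c) > 0

def is_delaunay(points, triangles):
    """Check that no node lies strictly inside any triangle's circumcircle."""
    for tri in triangles:
        a, b, c = (points[k] for k in tri)
        for k, p in enumerate(points):
            if k not in tri and in_circumcircle(a, b, c, p):
                return False
    return True

# Four nodes forming a wide diamond; the two ways of splitting it into
# triangles differ only in which diagonal is used.
points = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0), (2.0, -1.0)]
short_diagonal = [(0, 3, 2), (3, 1, 2)]   # split along the short diagonal
long_diagonal = [(0, 1, 2), (0, 3, 1)]    # split along the long diagonal
print(is_delaunay(points, short_diagonal), is_delaunay(points, long_diagonal))
```

The split along the long diagonal produces two thin triangles and fails the test, which illustrates why the Delaunay criterion tends to avoid elongated elements.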

Least Squares Fit vs. Interpolation

Now let’s denote a vector formed by the nodal values of the mesh model by

$$\mathbf{n} \equiv \left[ f(\mathbf{p}_1), f(\mathbf{p}_2), \ldots, f(\mathbf{p}_N) \right]^T. \tag{2.17}$$

If $\hat{\mathbf{f}}$ denotes the lexicographically ordered pixel representation of $\hat{f}(\mathbf{p})$, which is the approximation of the image function $f(\mathbf{p})$ over $D$, then from Eqs. (2.1) and (2.17) one can obtain

$$\hat{\mathbf{f}} = \mathbf{\Phi} \mathbf{n}, \tag{2.18}$$

where $\mathbf{\Phi}$ is a matrix determined from the interpolation functions $\varphi_n(\mathbf{p})$ in Eq. (2.1) by

$$\mathbf{\Phi} = \left[ \boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \ldots, \boldsymbol{\varphi}_N \right], \tag{2.19}$$

where $\boldsymbol{\varphi}_n$ is the column vector of the lexicographically ordered pixel representation of $\varphi_n(\mathbf{p})$. The matrix $\mathbf{\Phi}$ is simply the interpolation operator from a mesh representation to the pixel representation. A procedure for calculating $\varphi_n(\mathbf{p})$ will be given later in this chapter.

The least-squares solution is obtained from the following optimization problem:

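$$\min_{\mathbf{n}} J_{LS}(\mathbf{n}) = \min_{\mathbf{n}} \left\| \mathbf{f} - \mathbf{\Phi} \mathbf{n} \right\|^2, \tag{2.20}$$

where $\|\cdot\|$ is the Euclidean norm and $\mathbf{f}$ denotes the pixel representation of $f(\mathbf{p})$. In such a case, the objective function is quadratic in terms of the unknown $\mathbf{n}$, and a unique solution exists (provided that the matrix $\mathbf{\Phi}$ is of full rank), given by

$$\mathbf{n}_{LS} = \left( \mathbf{\Phi}^T \mathbf{\Phi} \right)^{-1} \mathbf{\Phi}^T \mathbf{f}. \tag{2.21}$$

In this study, the conjugate gradient algorithm [88] was used to obtain $\mathbf{n}_{LS}$.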
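The difference between direct interpolation and the LS fit of Eq. (2.21) can be illustrated with a one-dimensional analog, in which the triangular elements are replaced by intervals and the basis functions by piecewise-linear "hat" functions. The sketch below is only an illustration under these assumptions; in particular, applying the conjugate gradient iteration to the normal equations $\mathbf{\Phi}^T \mathbf{\Phi} \mathbf{n} = \mathbf{\Phi}^T \mathbf{f}$ is a choice made here, and all names and the test signal are hypothetical.

```python
import math

def hat_matrix(nodes, num_pixels):
    """Phi for a 1D mesh: column k holds the hat function of node k at each pixel."""
    Phi = [[0.0] * len(nodes) for _ in range(num_pixels)]
    for x in range(num_pixels):
        for k in range(len(nodes) - 1):
            a, b = nodes[k], nodes[k + 1]
            if a <= x <= b:
                t = (x - a) / (b - a)
                Phi[x][k] = 1.0 - t
                Phi[x][k + 1] = t
                break
    return Phi

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def rmatvec(A, v):
    """Compute A^T v without forming the transpose."""
    out = [0.0] * len(A[0])
    for row, vi in zip(A, v):
        for k, a in enumerate(row):
            out[k] += a * vi
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ls_fit(Phi, f, max_iter=100, tol=1e-18):
    """Conjugate gradient on the normal equations Phi^T Phi n = Phi^T f."""
    n = [0.0] * len(Phi[0])
    r = rmatvec(Phi, f)          # residual at the starting point n = 0
    d = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol:
            break
        Ad = rmatvec(Phi, matvec(Phi, d))
        alpha = rs / dot(d, Ad)
        n = [ni + alpha * di for ni, di in zip(n, d)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ad)]
        rs_new = dot(r, r)
        d = [ri + (rs_new / rs) * di for ri, di in zip(r, d)]
        rs = rs_new
    return n

def sse(Phi, n, f):
    return sum((a - b) ** 2 for a, b in zip(f, matvec(Phi, n)))

# Hypothetical 1D "image" and non-uniform node positions.
num_pixels = 41
nodes = [0, 5, 10, 20, 30, 40]
f = [math.sin(x / 6.0) for x in range(num_pixels)]
Phi = hat_matrix(nodes, num_pixels)

n_interp = [f[x] for x in nodes]   # nodal values sampled from the signal
n_ls = ls_fit(Phi, f)              # nodal values from the LS fit
sse_interp = sse(Phi, n_interp, f)
sse_ls = sse(Phi, n_ls, f)
print(sse_interp, sse_ls)
```

Since the sampled nodal values are one admissible choice of $\mathbf{n}$, the LS solution can never have a larger squared error than direct interpolation on the same mesh.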

Interpolation over irregular mesh elements

In general, the interpolation over irregular elements is quite difficult. To simplify the

interpolation, an irregular element $D_m$ can be mapped/deformed to a regular master

element, $\tilde{D}$, which has a simpler shape and well-defined interpolation function. This

mapping is illustrated in Figure 8. More details can be found in [76]. Here we summarize

the necessary steps of implementing the proposed algorithm for image reconstruction.

Figure 8. Transformation of the interpolation over the irregular element $D_m$ to interpolation over the regular element $\tilde{D}$.

Let’s define $\mathbf{u} = [s, t]$, $\mathbf{u} \in \tilde{D}$, which is a coordinate in the master element. The inverse mapping from an irregular element to the master element is defined as:

$$w_m^{-1}(\mathbf{p}) = \frac{1}{J_m} \begin{bmatrix} \left( x_{n(m,2)}\, y_{n(m,3)} - x_{n(m,3)}\, y_{n(m,2)} \right) + \left( y_{n(m,2)} - y_{n(m,3)} \right) x + \left( x_{n(m,3)} - x_{n(m,2)} \right) y \\ \left( x_{n(m,3)}\, y_{n(m,1)} - x_{n(m,1)}\, y_{n(m,3)} \right) + \left( y_{n(m,3)} - y_{n(m,1)} \right) x + \left( x_{n(m,1)} - x_{n(m,3)} \right) y \end{bmatrix}, \tag{2.22}$$

where $\mathbf{p} = [x, y]$, $\mathbf{p} \in D_m$, are coordinates in the irregular mesh element $D_m$; $n(m, k)$, $k = 1, 2, 3$, is the global index of node $n$ expressed as the $k$th node of the $m$th mesh element $D_m$; and $J_m$ is the Jacobian of the mapping, which expresses the change in area under the mapping.

The Jacobian $J_m$ can be evaluated using:

$$J_m = \det \begin{bmatrix} x_{n(m,1)} - x_{n(m,3)} & y_{n(m,1)} - y_{n(m,3)} \\ x_{n(m,2)} - x_{n(m,3)} & y_{n(m,2)} - y_{n(m,3)} \end{bmatrix}. \tag{2.23}$$

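Now we define the $n$th node interpolation function, $\varphi_n(\mathbf{p})$, as:

$$\varphi_n(\mathbf{p}) = \sum_{m \in M} \phi_{k(m,n)}\!\left( w_m^{-1}(\mathbf{p}) \right), \tag{2.24}$$

where $k(m, n)$ is the local index of the $n$th node in the $m$th mesh element $D_m$, $M$ here denotes the set of elements attached to the $n$th node (each term being taken as zero for $\mathbf{p}$ outside $D_m$), and $\phi_k(\mathbf{u})$ is the corresponding one of the interpolation functions over the master element given by

$$\phi_1(\mathbf{u}) = s, \quad \phi_2(\mathbf{u}) = t, \quad \phi_3(\mathbf{u}) = 1 - s - t. \tag{2.25}$$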
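For a single triangular element, the mapping to the master element and the resulting linear interpolation can be sketched as follows. The code uses the standard barycentric form of the mapping in Eqs. (2.22) and (2.23) with the master-element functions of Eq. (2.25); the function names and the example triangle are hypothetical.

```python
def to_master(tri, p):
    """Map p = (x, y) inside triangle tri to master-element coordinates (s, t).

    tri lists the three vertices in local order k = 1, 2, 3.
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = p
    # Jacobian of the mapping, Eq. (2.23); equals twice the signed area.
    J = (x1 - x3) * (y2 - y3) - (y1 - y3) * (x2 - x3)
    # Inverse mapping, Eq. (2.22).
    s = ((x2 * y3 - x3 * y2) + (y2 - y3) * x + (x3 - x2) * y) / J
    t = ((x3 * y1 - x1 * y3) + (y3 - y1) * x + (x1 - x3) * y) / J
    return s, t

def interpolate(tri, nodal_values, p):
    """Linear interpolation over one element using Eq. (2.25)."""
    s, t = to_master(tri, p)
    phi = (s, t, 1.0 - s - t)    # phi_1, phi_2, phi_3 on the master element
    return sum(w * v for w, v in zip(phi, nodal_values))

# Hypothetical element and a linear test function g(x, y) = 2x + 3y + 1,
# which linear interpolation should reproduce exactly.
tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
g = lambda x, y: 2.0 * x + 3.0 * y + 1.0
nodal_values = [g(*v) for v in tri]
value = interpolate(tri, nodal_values, (1.0, 1.0))
print(value, g(1.0, 1.0))
```

Each vertex maps to the point where its own master-element function equals one and the other two vanish, so the nodal values are reproduced exactly at the nodes.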

2.1.5. Content Adaptive Mesh Model Experiments.

In this section we present some numerical results to demonstrate the performance of

the proposed mesh generation approach. First, the effects of several parametric factors

associated with the algorithm implementation are investigated. Second, for comparison

purposes, results obtained by the well-known quadtree mesh generation method as well

as a mesh optimization algorithm described in [89] are presented. Finally, the

performance of the algorithm is investigated using noisy images.

For evaluation of the accuracy of a mesh representation, the following two quantitative measures are used: 1) the peak value of the error $\left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right|$ computed over each mesh element; and 2) the overall mean square error, computed in the form of peak-signal-to-noise ratio (PSNR). Specifically, for images of size $M \times N$,

$$PSNR = 10 \log_{10} \left( \frac{M \times N \cdot 255^2}{\left\| \mathbf{f} - \hat{\mathbf{f}} \right\|^2} \right) \text{dB}, \tag{2.26}$$

where $\hat{\mathbf{f}}$ and $\mathbf{f}$ denote the vector form of $\hat{f}(\mathbf{p})$ and $f(\mathbf{p})$, respectively.
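The PSNR of Eq. (2.26) is straightforward to compute. The following sketch assumes images stored as 2D lists with a peak value of 255; the function name `psnr` and the tiny test images are hypothetical.

```python
import math

def psnr(f, f_hat, peak=255.0):
    """PSNR in dB between an image f and its approximation f_hat, Eq. (2.26)."""
    squared_error = sum((a - b) ** 2
                        for row_f, row_h in zip(f, f_hat)
                        for a, b in zip(row_f, row_h))
    num_pixels = len(f) * len(f[0])
    if squared_error == 0.0:
        return math.inf          # identical images
    return 10.0 * math.log10(num_pixels * peak ** 2 / squared_error)

# Hypothetical 2 x 2 examples.
original = [[10.0, 20.0], [30.0, 40.0]]
off_by_one = [[11.0, 21.0], [31.0, 41.0]]
print(psnr(original, off_by_one))
```

With an error of exactly one gray level at every pixel, the expression reduces to $20 \log_{10} 255$, which gives a convenient reference point when reading the PSNR values reported below.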

Basic Mesh Generation Algorithm

For better visualization of the results, a section cropped from the original $256 \times 256$

image “Lena” is used as a test image, which is shown in Figure 10(a). As can be seen,

this section contains most of the key features of the original, full-size image, including

textures, sharp edges, smooth transitional regions, as well as smooth background.

Results

Shown in Figure 10(b) is the feature map obtained from the “Lena” image. To

compute the feature map $\sigma(\mathbf{p})$, the value $\gamma = 1$ and the exact $G(\mathbf{p})$ were used in

Eq.(2.5). Shown in Figure 10(c) is the mesh structure obtained using the proposed

algorithm, for which the serpentine scan order was used in the error-diffusion step. The

mesh has only 3353 mesh nodes (about 20% of the number of pixels in “Lena”). As can

be seen, mesh elements were automatically placed densely in regions containing high-

frequency features (such as edges), while coarse elements were placed in relatively flat

areas of the image. This mesh structure was then used to reconstruct the image by using

Eq.(2.1); the image was computed over each triangular element using linear interpolation.

The resulting image is shown in Figure 10(d), of which PSNR=31.13dB. In addition, the

image was also reconstructed from the same mesh using LS fit according to Eq.(2.21) and

the result is shown in Figure 10(e), of which PSNR=33.07dB. As expected, the LS fitting

yields a smaller MSE.

Effect of parameters

The proposed algorithm was also tested using the same “Lena” image by varying a

number of parameters and features of the algorithm, including 1) use of the

simplified feature map $\sigma(\mathbf{p})$; 2) the parameter $\gamma$ in $\sigma(\mathbf{p})$; 3) regular raster order for

placement of mesh nodes; and 4) the number of mesh nodes N. For easy comparison,

these parameters are varied one at a time, with the others being kept the same as

previously used in Figure 10. The numerical results are summarized in Table 1, in which

the PSNR and the mean and standard deviation of the peak error $e_{peak}$ over all the elements

were computed for each reconstructed image. For each parameter setting, reconstructed

images were obtained using both direct interpolation and an LS fit.

In addition to the numerical results in Table 1, Figure 11(a) and (b) show the mesh structure and the reconstructed image (LS fit), respectively, obtained when $\gamma$ was changed from 1 to 1/2. Compared with Figure 10(c), the mesh nodes in Figure 11(a) are distributed more densely in regions with weak image features (such as weak edges and relatively flat areas). Similarly, Figure 11(c) and (d) show the mesh structure and the reconstructed image (LS fit) obtained when the number of mesh nodes N was reduced from 3349 to 2081 (nearly a 38% reduction).

Other test images

The proposed algorithm was also tested using other images. Shown in Figure 12(a) is

an image, consisting of 84 × 144 pixels, obtained in one of our studies of gated single-photon emission computed tomography (SPECT) cardiac perfusion images. SPECT cardiac images

are typically very noisy, and exhibit low resolution. A 4D image-reconstruction algorithm

was applied to this image to reduce its noise level. In contrast to the “Lena” image,

features in this image are very low-frequency and no clear definitions of edges exist. We

chose this image to demonstrate that the proposed algorithm works equally well in such a

case. Figure 12(b) shows the resulting feature map when it was computed directly from

the original image in Figure 12(a). As can be seen, the feature map is quite noisy in this case, due to the noise present in the original image. To help alleviate the noise effect, the

image in Figure 12(a) was first processed by a lowpass filter (bandwidth of 0.2π). The

feature map was then computed from this processed image, and is shown in Figure 12(c).

The mesh structure and reconstructed images from the proposed algorithm are shown in

Figure 12(d) and (e), respectively. The number of mesh nodes used is only 554 (less than

5% of the number of pixels in the original image). The reconstructed image has

PSNR = 38.38 dB, so the mesh representation is particularly accurate in this case; in fact, the difference between the reconstructed image and the original is barely noticeable visually. When an LS fit is used, the reconstructed image (shown in Figure 12(f)) has PSNR = 40.57 dB.

2.1.6. Quadtree Mesh and Mesh optimization.

Quadtree Mesh

The quadtree method is a well-known existing method for adaptive mesh generation.

It starts with a coarse rectangular mesh (e.g., the whole image as one element). This mesh

is then successively refined in a hierarchical manner so that at each step the mesh element

with the largest interpolation error is further split into four sub-elements (see Figure 9).

This process continues until the desired level of accuracy (or the number of nodes) is

achieved. In our implementation, the interpolation over the rectangular mesh is computed

by first converting it into a triangular mesh, then interpolating the image over the

resulting triangular mesh elements. For the “Lena” image, the quadtree method yields a

mesh shown in Figure 13(a), where the same number of mesh nodes N=3353 was used.

The reconstructed image using this mesh is shown in Figure 13(b) with PSNR=29.71dB.

Notice that artifacts are pronounced along the edges in this image. In addition, Figure 14 shows a histogram of the element peak interpolation error $e_{\text{peak}}$ computed from the mesh representations produced by the proposed method (Figure 10(c)) and by the quadtree method (Figure 13(a)). As can be seen, the mesh representation from the proposed method has a significantly larger number of elements with a very low error level than the quadtree mesh. This helps explain why the mesh structure produced by the proposed method achieves a more accurate local approximation of the original image.
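The greedy refinement used by the quadtree method can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names are hypothetical, the block error is measured with bilinear interpolation from the four corner pixels (the text instead converts each rectangle into triangles), and hanging nodes between blocks of different size are not resolved.

```python
import heapq

def block_error(img, r0, c0, r1, c1):
    """Peak error of bilinear interpolation from the four corners of a block."""
    h, w = r1 - r0, c1 - c0
    err = 0.0
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            u, v = (r - r0) / h, (c - c0) / w
            approx = ((1 - u) * (1 - v) * img[r0][c0] + (1 - u) * v * img[r0][c1]
                      + u * (1 - v) * img[r1][c0] + u * v * img[r1][c1])
            err = max(err, abs(img[r][c] - approx))
    return err

def quadtree_mesh(img, max_nodes):
    """Repeatedly split the block with the largest error into four sub-blocks."""
    rows, cols = len(img), len(img[0])
    root = (0, 0, rows - 1, cols - 1)
    heap = [(-block_error(img, *root), root)]      # max-heap via negated error
    nodes = {(0, 0), (0, cols - 1), (rows - 1, 0), (rows - 1, cols - 1)}
    leaves = []
    while heap and len(nodes) < max_nodes:
        neg_err, (r0, c0, r1, c1) = heapq.heappop(heap)
        if r1 - r0 < 2 or c1 - c0 < 2 or neg_err == 0:
            leaves.append((r0, c0, r1, c1))        # too small or already exact
            continue
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        for child in ((r0, c0, rm, cm), (r0, cm, rm, c1),
                      (rm, c0, r1, cm), (rm, cm, r1, c1)):
            heapq.heappush(heap, (-block_error(img, *child), child))
            a0, b0, a1, b1 = child
            nodes.update({(a0, b0), (a0, b1), (a1, b0), (a1, b1)})
    leaves.extend(block for _, block in heap)
    return nodes, leaves
```

Because a split can add up to five nodes at once, the loop may slightly overshoot `max_nodes`; on an image with a single vertical edge, the returned nodes cluster along that edge.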

Figure 9. Quadtree mesh generation procedure.

Mesh optimization

Next, a mesh optimization approach described in [6] is considered; its details are also given in Appendix B. The basic idea of such an approach is to adjust the positions of the mesh nodes so that the approximation error between the mesh representation $\hat{f}(\mathbf{p})$, defined in Eq. (2.1), and the original image $f(\mathbf{p})$ is minimized.

However, due to the nonlinear dependency of the interpolating basis functions in

$\hat{f}(\mathbf{p})$ on the nodal positions, no closed-form solution is readily available and an iterative

optimization approach such as a gradient-search type algorithm has to be employed (see

Appendix B for details). Such an approach often yields local minima of the error function

[7]. In our implementation of the method in [6], the interpolation error along with the

mesh regularity (or deformation) terms were used in the objective function.

In our experiment, the optimization algorithm was first used to further optimize the

quadtree mesh shown in Figure 13(a). The resulting mesh and reconstructed image are

shown in Figure 13(c) and (d), respectively. The reconstructed image has

PSNR=30.50dB, an improvement of 0.79 dB compared to the image in Figure 13(b).

Similarly, shown in Figure 13(e) and (f) are the resulting mesh and reconstructed

image, respectively, from optimization of the mesh in Figure 10(c) which was obtained

from the proposed method. The reconstructed image has PSNR=32.06dB, an

improvement of 0.93 dB compared to the image in Figure 10(d). More detailed numerical

results are summarized in Table 2. Also included in Table 2 are the results obtained using

the procedure described in our previous work [12] for the same “Lena” image, in which

the same number of mesh nodes N=3353 was used.

Interestingly, the mesh representation from our newly proposed method, shown in

Figure 10(c), still offers the best accuracy even when compared to the optimized results

from the other two methods, even though it does not directly attempt to minimize the

overall approximation error. Moreover, it is interesting to note that the results above also

demonstrate that the optimization algorithm is very susceptible to local minima. Indeed,

since all the meshes used in Figure 13 have the same number of mesh nodes, the

algorithm would have converged to the same solution had the true global minimum been

reached. The proposed mesh generation algorithm can help improve the performance of a

mesh optimization approach in this respect, because the resulting mesh nodes from our

algorithm are already well distributed in the image according to its local content.

2.1.7. Discussion.

In this portion of the work we presented a new, fast and effective algorithm for

generation of an accurate, compact mesh representation of an image that is well adapted

to its content. In the next chapter we apply the proposed technique in medical image

reconstruction for cardiac images and image sequences, where fast, compact image

representation can greatly help improve the quality of reconstructed images.

Table 1. List of results for various parameters

                              Direct interpolation                     LS fit
Method and/or parameters      PSNR [dB]  e_peak mean  e_peak std.      PSNR [dB]  e_peak mean  e_peak std.
Figure 10 (default)           31.13      0.0238       0.0281           33.07      0.0214       0.0208
γ = 1/2                       29.68      0.0329       0.0352           31.15      0.0289       0.0269
Raster scan order             30.50      0.0253       0.0281           32.78      0.0221       0.0211
# nodes N = 2081              27.37      0.0484       0.0432           29.97      0.0423       0.0315
Simplified σ(p)               30.96      0.0239       0.0284           33.01      0.0215       0.0207

Table 2. List of results for various parameters and algorithms

                              Direct interpolation                     LS fit
Method and/or parameters      PSNR [dB]  e_peak mean  e_peak std.      PSNR [dB]  e_peak mean  e_peak std.
Figure 10 (default)           31.13      0.0238       0.0281           33.07      0.0214       0.0208
Figure 10 optimal             32.06      0.0225       0.0285           34.68      0.0166       0.0171
Quadtree                      29.71      0.0329       0.0361           30.89      0.0296       0.0279
Quadtree optimal              30.50      0.0316       0.0286           32.90      0.0108       0.0177
Previous method               27.55      0.0354       0.0441           29.46      0.0308       0.0344
Prev. meth. opti.             30.12      0.0316       0.0290           32.38      0.0172       0.0224

Figure 10. (a) A 128 × 128 section of the original image “Lena”; (b) feature map extracted from the image in (a); (c) mesh structure obtained using the proposed algorithm, 3353 mesh nodes used; (d) image represented using the mesh in (c), PSNR = 31.13 dB; (e) reconstructed image using LS fitting, PSNR = 33.07 dB.

Figure 11. (a) Mesh structure obtained for γ = 1/2, with the same number of mesh nodes, N = 3353; (b) image represented using LS fitting from the mesh in (a), PSNR = 31.15 dB; (c) mesh structure obtained using the same feature map as in Figure 10(b), but with the number of mesh nodes reduced to N = 2081; (d) image represented using LS fitting from the mesh in (c), PSNR = 29.97 dB.

Figure 12. (a) 84 × 144 original cardiac image; (b) feature map directly extracted from the image in (a); (c) feature map extracted using filtered gradients from the image in (a); (d) mesh structure obtained using the proposed algorithm, 554 mesh nodes used; (e) image interpolated using the mesh in (d), PSNR = 38.38 dB; (f) image represented using LS fitting from the mesh in (d), PSNR = 40.57 dB.

Figure 13. (a) Mesh obtained using the quadtree method; (b) image represented using the mesh in (a), PSNR = 29.71 dB; (c) optimized mesh resulting from the mesh in (a); (d) image represented using the mesh in (c), PSNR = 30.50 dB; (e) optimized mesh resulting from the mesh in Figure 10(c); (f) image represented using the mesh in (e), PSNR = 32.06 dB.

Figure 14. Histogram plot of the peak error, $e_{\text{peak}}$, computed from the mesh representation in Figure 10(d) and the quadtree mesh representation in Figure 13(b). The mesh representation from the proposed method has a larger number of elements with a lower error level than the quadtree mesh.

2.2. Scalar Volumetric (3D) Function Mesh Modeling

In this section we extend the content-adaptive mesh approach presented in Sect. 2.1 to the three-dimensional (3D) representation of volumetric images. An error bound is

derived for a 3D mesh representation of a volumetric image based on the theory of

function interpolation. From this result, a computationally efficient algorithm is proposed

for adaptive placement of mesh nodes (hence mesh elements) in the 3D image domain

according to the image content. Experimental results demonstrate that a highly compact

and accurate representation of volumetric images can be achieved at low computational

cost by the proposed algorithm.

2.2.1. Introduction.

For a volumetric image, a straightforward approach is to treat it as a collection of 2D

slices, then represent each slice independently by a 2D mesh model. Although it is

simple, such an approach would fail to exploit the redundancy among the different slices.

In this section we propose a fully 3D approach for content-adaptive mesh generation.

This approach treats a volumetric image as a function defined over the 3D image domain.

Based on the error bound for a 3D mesh representation, we design an algorithm that aims

to adaptively distribute mesh nodes (hence mesh elements) in the 3D image domain in

such a way that the error level achieved by the mesh representation is kept small over

individual elements. Specifically, the algorithm consists of the following three steps: 1)

generate a feature map that highlights the spatial distribution of the largest magnitude of

the 2nd directional derivatives of the image; 2) apply a 3D error-diffusion algorithm, based on the well-known Floyd-Steinberg algorithm [81], to distribute the mesh nodes in the 3D image domain; and 3) use 3D Delaunay triangulation [83] to compute the mesh structure. This approach is fast, non-iterative, and easy to implement, and it proved to be very accurate in our experiments.

The proposed approach is an extension of our previous work in Sect. 2.1, wherein a

mesh representation was studied for 2D images.

There has been research on compression of volumetric images for the purpose of

efficient storage and transmission. While a compact, accurate mesh model may be

applicable for compression, our work is aimed principally at model-based image

processing tasks. Our ultimate goal is to apply this model to fully 3D and 4D

tomographic reconstruction of volumetric images. Our previous work [90], presented later

in Sect. 3.1, demonstrated that such an approach (with a 2D mesh) can outperform

several well-known reconstruction algorithms.


Mesh Representation

Let $f(\mathbf{p})$ denote a volumetric image function defined over a domain $D \subset \mathbb{R}^3$, where $\mathbf{p} = (x, y, z)$. In a mesh model, the domain $D$ is partitioned into a number, say $M$, of non-overlapping mesh elements, which are denoted by $D_m$, $m = 1, 2, \ldots, M$. Then $f(\mathbf{p})$ is approximated over each element $D_m$ as

$$\hat{f}(\mathbf{p}) = \sum_{n=1}^{N} f(\mathbf{p}_n)\, \varphi_{m,n}(\mathbf{p}), \quad \mathbf{p} \in D_m, \tag{2.27}$$

where $\varphi_{m,n}(\mathbf{p})$ is the interpolation basis function associated with the $n$th node $\mathbf{p}_n$ of $D_m$, and $N$ is the number of mesh nodes of the element.

The mesh representation in Eq. (2.27) assumes a form of signal representation based on non-uniform sampling, in which the mesh nodes serve as image samples. In this study, tetrahedrons are used for $D_m$, so $N = 4$. Also, linear interpolation functions are used for $\varphi_{m,n}(\mathbf{p})$.

As in Sect. 2.1, Eq. (2.27) can be rewritten in discrete form as

$$\hat{\mathbf{f}} = \mathbf{\Phi}\mathbf{n}, \tag{2.28}$$

where $\mathbf{\Phi}$ is an interpolation matrix and $\mathbf{n}$ is the vector of nodal values of the mesh model.
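For linear interpolation over a tetrahedron, the four basis functions of Eq. (2.27) are the barycentric coordinates of the point with respect to the element vertices. The sketch below, which assumes NumPy and uses hypothetical function names, evaluates them by solving a small linear system; the four weights it returns would form the nonzero entries of one row of the interpolation matrix.

```python
import numpy as np

def barycentric_weights(vertices, p):
    """Linear basis functions phi_n(p), n = 1..4, of one tetrahedron at point p."""
    v = np.asarray(vertices, dtype=float)           # shape (4, 3), one vertex per row
    # Solve sum_n w_n * v_n = p subject to sum_n w_n = 1.
    A = np.vstack([v.T, np.ones(4)])                # shape (4, 4)
    b = np.append(np.asarray(p, dtype=float), 1.0)
    return np.linalg.solve(A, b)

def interpolate(vertices, nodal_values, p):
    """f_hat(p) = sum_n f(p_n) * phi_n(p) over a single element."""
    weights = barycentric_weights(vertices, p)
    return float(weights @ np.asarray(nodal_values, dtype=float))
```

A point lies inside the element exactly when all four weights are non-negative, which also gives a simple test for locating the element that contains a given voxel.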

Error Analysis

In this section we establish an error bound for a mesh representation $\hat{f}(\mathbf{p})$ of the form in (2.27) to motivate the proposed mesh-generation algorithm. We simply state the result below; the proof is given in Appendix C, and a more general case is treated in [80].

Theorem 2. Let $T$ denote a tetrahedron in the 3D space $\mathbb{R}^3$, and let $f(\mathbf{p})$ denote a real-valued function defined on $T$ whose 2nd partial derivatives are continuous on $T$. Assume that $\hat{f}(\mathbf{p})$ is the linear interpolation of $f(\mathbf{p})$ at the vertices of $T$. Then for each point $\mathbf{p} \in T$ the approximation error is bounded as follows:

$$\left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right|^{1.5} \le \left( \frac{3}{8} M_2 \right)^{1.5} h^3, \tag{2.29}$$

where $h$ is the length of the longest side of $T$, and $M_2$ is the least upper bound on the magnitude of the 2nd order directional derivative of $f(\mathbf{p})$ over $T$.


The result in (2.29) provides the basis for the proposed mesh-generation algorithm. The theorem states that the approximation-error bound is proportional to two factors: (1) the maximum magnitude assumed by the 2nd directional derivatives of the image function over each mesh element (raised to the power of 1.5); and (2) the length of the longest side of $T$ (raised to the power of 3). Note that the latter is proportional to the volume of $T$ provided that it is not excessively elongated. Based on this observation, we argue that a good mesh generation scheme should place small (in volume) elements in regions of the image domain where the 2nd directional derivative is large and, conversely, larger elements in regions where the 2nd directional derivative is relatively small, in order to achieve a balanced error level throughout the image domain.
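The balancing argument can be made explicit with a short, informal calculation. It assumes the bound of Theorem 2 in the form $|f - \hat{f}|^{1.5} \le (\tfrac{3}{8} M_2)^{1.5} h^3$, well-shaped elements whose volume $V_m$ scales as $h_m^3$, and a common target error level $\varepsilon$ (a symbol introduced here for illustration only):

```latex
\left(\tfrac{3}{8}\, M_{2,m}\right)^{1.5} h_m^{3} = \varepsilon^{1.5}
\;\Longrightarrow\;
V_m \propto h_m^{3} \propto \frac{\varepsilon^{1.5}}{M_{2,m}^{1.5}},
\qquad
\text{node density} \;\propto\; \frac{1}{V_m} \;\propto\; M_{2,m}^{1.5}.
```

Under these assumptions, the local node density should grow as the 1.5 power of the largest second directional derivative, which is why that quantity serves as the target density for node placement.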

As before, we propose the following three-step mesh-generation algorithm that aims specifically to achieve this goal at low computational cost. First, a feature map $\sigma(\mathbf{p})$ is extracted from the image $f(\mathbf{p})$ based on the largest magnitude of its second directional derivative. Second, a modified Floyd-Steinberg error-diffusion algorithm, a method designed for digital halftoning [81], is employed to distribute mesh nodes non-uniformly in the 3D image domain, with density proportional to $\sigma(\mathbf{p})$. Third and finally, a 3D Delaunay triangulation algorithm [83] is used to connect the mesh nodes. The resulting mesh consists of tetrahedral elements that are automatically adapted to the content of the image. The details of these steps are described below.

Feature Map Extraction

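Let $G(\mathbf{p})$ denote the largest magnitude of the second directional derivative of $f(\mathbf{p})$ at point $\mathbf{p}$. The feature map function is defined as $\sigma(\mathbf{p}) = \left[ G(\mathbf{p}) \right]^{1.5}$. The following result can be used to compute $G(\mathbf{p})$.

Corollary 2. Let $\mathbf{H}_{\mathbf{p}}$ denote the Hessian matrix of $f(\mathbf{p})$ at $\mathbf{p}$, and let $\lambda_i(\mathbf{p})$, $i = 1, 2, 3$, denote the eigenvalues of $\mathbf{H}_{\mathbf{p}}$. Then, we have

$$G(\mathbf{p}) = \max\left\{ |\lambda_1(\mathbf{p})|, |\lambda_2(\mathbf{p})|, |\lambda_3(\mathbf{p})| \right\}. \tag{2.30}$$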
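As an illustration of Corollary 2, the feature map of a sampled volume could be computed along the following lines. This is a sketch under stated assumptions: the Hessian is approximated with NumPy finite differences (`np.gradient`), the voxel spacing is taken as 1, and the function name is hypothetical; any lowpass prefiltering of noisy data is left out.

```python
import numpy as np

def feature_map_3d(volume):
    """sigma(p) = G(p)**1.5, with G(p) the largest |eigenvalue| of the Hessian."""
    f = np.asarray(volume, dtype=float)
    grads = np.gradient(f)                       # first derivatives along each axis
    hessian = np.empty(f.shape + (3, 3))
    for i in range(3):
        second = np.gradient(grads[i])           # derivatives of the i-th first derivative
        for j in range(3):
            hessian[..., i, j] = second[j]
    # Finite differences need not commute exactly, so enforce symmetry.
    hessian = 0.5 * (hessian + np.swapaxes(hessian, -1, -2))
    eigvals = np.linalg.eigvalsh(hessian)        # shape f.shape + (3,)
    G = np.abs(eigvals).max(axis=-1)
    return G ** 1.5
```

Values near the volume boundary are less reliable because `np.gradient` falls back to one-sided differences there.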

Content-Adaptive Placement of Mesh Nodes

The classical Floyd-Steinberg error-diffusion algorithm was intended for halftoning

of 2D images, where the objective is to use the spatial density of ink dots to represent the

image intensity. We generalize it to the 3D case in our implementation for placing mesh

nodes in accordance with the density specified by the feature map $\sigma(\mathbf{p})$. The causal neighborhood used for error diffusion is shown in Figure 15. The diffusion weight for a neighboring voxel is chosen to be inversely proportional to its distance from the current voxel and directly proportional to the 2nd directional derivative of the function to be represented in the direction of diffusion. A discussion is given in Appendix D.

In addition to being fast, this algorithm allows us to easily control the number of

mesh nodes by adjusting the threshold value in the halftoning step.
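A simplified version of this node-placement step is sketched below. It is not the algorithm of Appendix D: the weights here depend only on inverse distance (the directional second-derivative factor is omitted), a plain raster scan is used, and the node count is steered by rescaling the feature map to a hypothetical target `n_target` rather than by tuning the threshold. The feature map is assumed to be non-negative and not identically zero.

```python
import numpy as np

def place_nodes_3d(sigma, n_target):
    """Simplified 3D error diffusion: node density follows the feature map sigma."""
    s = np.asarray(sigma, dtype=float)
    q = s * (n_target / s.sum())                 # expected number of nodes per voxel
    # Causal neighbors: the 13 voxels of the 3x3x3 block visited after the current one.
    offsets = [(dz, dy, dx) for dz in (0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dz, dy, dx) > (0, 0, 0)]
    weights = np.array([1.0 / np.sqrt(dz * dz + dy * dy + dx * dx)
                        for dz, dy, dx in offsets])
    weights /= weights.sum()
    nodes = []
    Z, Y, X = q.shape
    for z in range(Z):
        for y in range(Y):
            for x in range(X):
                out = 1.0 if q[z, y, x] >= 0.5 else 0.0   # threshold, as in halftoning
                if out:
                    nodes.append((z, y, x))
                err = q[z, y, x] - out
                for (dz, dy, dx), w in zip(offsets, weights):
                    zz, yy, xx = z + dz, y + dy, x + dx
                    if 0 <= zz < Z and 0 <= yy < Y and 0 <= xx < X:
                        q[zz, yy, xx] += err * w          # push the error forward
    return nodes
```

Error that would be diffused outside the volume is discarded, so the number of nodes returned is close to, but generally not exactly, `n_target`.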

Figure 15. The error at the center voxel is diffused into its causal neighboring voxels (labeled 1 through 13).

Delaunay Triangulation

Among its many useful properties, Delaunay triangulation yields a well-structured

mesh at reasonable computational cost. The use of Delaunay triangulation avoids

producing excessively elongated elements, thereby reducing the error bound in (2.29).

Mesh Nodal Value

Once the mesh structure is obtained, the image can be represented by interpolation

over each element from its nodal values. The nodal value $f(\mathbf{p}_n)$ in (2.27) can simply be taken to be the image value at $\mathbf{p}_n$. Alternatively, it can be determined using a least-squares (LS) fit that minimizes the mean-squared-error of the interpolated image $\hat{f}(\mathbf{p})$. See Sect. 2.1.4 for more details, where we found the latter method to be more accurate.

2.2.4. Experimental Results.

In this section we present results obtained by the proposed 3D mesh-generation

method. Two volumetric images were used. The first was obtained from one volumetric

frame of the four-dimensional (4D) gated mathematical cardiac-torso (gMCAT) D1.01

phantom [91], and the second was from the Zubal brain phantom [92]. The cardiac

phantom consists of 64 × 64 × 64 voxels; the brain phantom consists of 64 × 64 × 20 voxels. Figure 16 shows slices of the gMCAT phantom in the vicinity of the heart.

The proposed algorithm was first applied to the cardiac volumetric image. For

comparison, we also implemented an octtree mesh-generation method, an extension of the well-known quadtree method [6] to the 3D case. It starts with an initial coarse mesh (e.g., the whole volume). At each step it refines those elements with large approximation errors by subdividing each of them into eight smaller elements (hence the name “octtree”).

Figure 17 shows the 3D mesh models obtained by the proposed method and by the octtree method. For clarity, only the distribution of the mesh nodes on the surface of organs is shown. Note the good nodal concentration around organ boundaries.

Different slices of the interpolated volumetric data, obtained using the proposed and octtree methods, are presented in Figure 18. Comparing Figure 18(a) and (b) with Figure 16, one can observe that the octtree representation departs visibly from the original.

Further results, obtained with an LS fit, are shown in Figure 19. Only 10,688 mesh nodes were used (about 4% of the total number of voxels in the original volume). Figure 19(a) shows slices from the mesh representation obtained by the proposed method. To quantify the accuracy of this mesh representation, we computed its peak signal-to-noise ratio (PSNR), which is 42.8 dB. The PSNR is defined as

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{M \times N \times L \cdot f_{\max}^2}{\left\| \mathbf{f} - \hat{\mathbf{f}} \right\|^2} \right) \ \mathrm{dB}, \tag{2.31}$$

where $\mathbf{f}$ and $\hat{\mathbf{f}}$ denote the original image and its mesh representation, respectively, $f_{\max}$ is the image peak value, and $M \times N \times L$ is the image dimension.
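The PSNR defined above translates directly into a few lines of code. The sketch below assumes NumPy, reads the definition as the number of voxels times the squared peak value divided by the squared error norm, and uses a hypothetical function name.

```python
import numpy as np

def psnr(original, approx):
    """PSNR in dB: 10*log10(num_voxels * f_max**2 / ||f - f_hat||**2)."""
    f = np.asarray(original, dtype=float)
    f_hat = np.asarray(approx, dtype=float)
    squared_error = np.sum((f - f_hat) ** 2)
    return 10.0 * np.log10(f.size * f.max() ** 2 / squared_error)
```

The function works for arrays of any dimension; it is undefined when the two inputs are identical, since the squared error is then zero.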

Shown in Figure 19(b) are the same slices from the mesh representation produced by

the octtree method, using the same number of mesh nodes as the proposed method. The

PSNR of the mesh representation is 39.4 dB. The octtree representation in Figure 19(b)

exhibits noticeable distortion around the outer heart-wall area in the images.

Similarly, the proposed method was tested using the Zubal brain phantom [92]. Again, the number of mesh nodes used was about 4% of the number of voxels in the original volume. The proposed method yields a mesh representation with PSNR = 30.8 dB, while the mesh representation from the octtree method has PSNR = 27.6 dB. To conserve space, the images are not shown in this thesis.

These results indicate that the proposed method can produce accurate 3D mesh

representations at very low computational cost.

Additional results, which are better displayed electronically, are shown at:

http://www.ipl.iit.edu/brankov/Rotate.htm. These results show that the mesh structure

produced by the proposed method is well adapted to the content of the volumetric

images.

2.2.5. Discussion.

Here we have proposed a fast mesh generation approach that can produce a compact

and accurate mesh representation of a volumetric image with only a very small number of

mesh nodes. In Sect. 3.5 we will apply the proposed mesh representation to fully 3D tomographic reconstruction of cardiac images.

Figure 16. The four-dimensional (4D) gated mathematical cardiac-torso (gMCAT) D1.01 phantom in the vicinity of the heart (262,144 voxels).

Figure 17. 3D mesh model where, for clarity, only the distribution of the mesh nodes on the surface of organs is shown: (a) proposed method; (b) octtree method; (c) zoomed image in the vicinity of the heart for the proposed method; and (d) for the octtree method. Note the good nodal concentration in the heart wall region for the proposed method.

Figure 18. Results of interpolation using (a) the mesh model obtained by the proposed mesh generation procedure (proposed mesh representation) and (b) the mesh model obtained by the octtree procedure (octtree mesh representation). The same number of mesh nodes, 10,688, was used in both cases.

Figure 19. Results obtained by least-squares (LS) fit: (a) slices of the mesh representation produced by the proposed algorithm, PSNR = 42.8 dB; (b) slices of the mesh representation produced by the octtree method, PSNR = 39.4 dB.

2.3. Vector Valued 2D Function Mesh Modeling

Now let us explore a method for content-adaptive mesh representation of vector-

valued (e.g., color) images. The goal is to obtain a single mesh structure that accurately

represents all the individual components of the image. The proposed method is justified

by an error bound derived for such a representation. As before, it employs an error-

diffusion type algorithm to place the mesh nodes non-uniformly in the image domain

according to the image content. Experimental results demonstrate that: (1) a compact and

accurate representation for color images can be achieved at low computational cost by the

proposed algorithm; and (2) joint treatment of the different image components by the

proposed algorithm can result in a more accurate mesh representation than a mesh based

on a single image component (such as intensity) alone.


A critical issue in mesh modeling is how to determine the mesh structure in a mesh

model for a given image. Several approaches for mesh generation have been proposed in

image processing, almost all explicitly for scalar-valued images.

The purpose of this work is to study a method that can generate a compact and

accurate mesh representation of vector-valued images. A vector-valued image is defined

here as a signal that consists of two or more components defined over a common 2D

domain. Examples of vector-valued images include color images, multi-spectral images,

and multi-modality medical images (e.g., CT/MRI). Our goal is to obtain a mesh

structure for a vector-valued image so that all the individual components of the image

can be accurately represented by the same mesh.

Toward this goal, an error bound, defined jointly for all the different components of a

vector-valued image, is derived for a vector-valued mesh representation. Based on this

result, we design an algorithm that aims to adaptively distribute mesh nodes (hence mesh elements) in the image domain in such a way that the error level achieved by the mesh

representation is kept small over individual elements.

The proposed approach is an extension of our work in Sect. 2.1, wherein mesh

representation was studied for scalar intensity images.

We point out that there is existing work in the literature on compression of color images for

the purpose of efficient storage and transmission. While a compact, accurate mesh model

of a vector-valued image may be applicable for compression, our study is intended

principally for model-based image processing tasks. Our ultimate goal is to apply this

model to tomographic reconstruction of dual-modality medical images as in Chapter 4.


Mesh Representation

Let $\mathbf{F}(\mathbf{p}) = \left( f_1(\mathbf{p}), f_2(\mathbf{p}), \ldots, f_K(\mathbf{p}) \right)^T$ denote a vector-valued image function defined over a domain $D \subset \mathbb{R}^2$, where $f_k(\mathbf{p})$ denotes its $k$th component, $k = 1, 2, \ldots, K$, and $\mathbf{p} = (x, y)$. Let $\hat{\mathbf{F}}(\mathbf{p})$ denote the representation of $\mathbf{F}(\mathbf{p})$ using a common mesh, where the domain $D$ is partitioned into a number, say $M$, of non-overlapping mesh elements, which are denoted by $D_m$, $m = 1, 2, \ldots, M$. Then over each element $D_m$ each component $\hat{f}_k(\mathbf{p})$ of $\hat{\mathbf{F}}(\mathbf{p})$ is given by

$$\hat{f}_k(\mathbf{p}) = \sum_{n=1}^{N} f_k(\mathbf{p}_n)\, \varphi_{m,n}(\mathbf{p}), \quad \mathbf{p} \in D_m, \tag{2.32}$$

where $\varphi_{m,n}(\mathbf{p})$ is the interpolation basis function associated with the $n$th node $\mathbf{p}_n$ of $D_m$, and $N$ is the number of mesh nodes of the element. Note that in (2.32) the same set of basis functions is used for all the different image components.

The mesh representation in Eq. (2.32) assumes a form of signal representation based on non-uniform sampling, where the mesh nodes serve as image samples. In this study, triangular elements are used for $D_m$, so $N = 3$. Also, linear interpolation functions are used for $\varphi_{m,n}(\mathbf{p})$.

Error Analysis

The key to the mesh representation in (2.32) is its accuracy. For a given number of

mesh nodes, our goal is to obtain a common mesh structure, defined by the mesh

elements Dm , so that it provides a compact, accurate representation of all the image

components.

Let us introduce an error metric between $\mathbf{F}(\mathbf{p})$ and its mesh representation $\hat{\mathbf{F}}(\mathbf{p})$. For generality, define the error at each $\mathbf{p} \in D_m$ using the $p$-norm in $\mathbb{R}^K$ as

$$e(\mathbf{p}) = \left\| \mathbf{F}(\mathbf{p}) - \hat{\mathbf{F}}(\mathbf{p}) \right\|_p = \left( \sum_{k=1}^{K} \left| f_k(\mathbf{p}) - \hat{f}_k(\mathbf{p}) \right|^p \right)^{1/p}. \tag{2.33}$$

Then the following result can be derived:

Theorem 3. Let $T$ denote a triangle in the 2D plane $\mathbb{R}^2$, and let $M_{2,k}$ denote the least upper bound on the magnitude of the 2nd order directional derivative of $f_k(\mathbf{p})$ over $T$, $k = 1, 2, \ldots, K$. Assume that $\hat{f}_k(\mathbf{p})$ is the linear interpolation of $f_k(\mathbf{p})$ at the vertices of $T$. Then for each point $\mathbf{p} \in T$

$$e(\mathbf{p}) \le \frac{1}{4} \left( \sum_{k=1}^{K} M_{2,k}^{\,p} \right)^{1/p} h^2, \tag{2.34}$$

where $h$ is the length of the longest side of $T$.

Theorem 3 follows directly from Eq. (2.33) and Theorem 1 (Eq. (2.2)).


The result in (2.34) provides a fundamental basis for the development of our mesh

generation algorithm. It states that the approximation error bound is proportional to two

factors: 1) the maximum magnitude assumed by the 2nd directional derivatives of the

image components; and 2) the square of the length of the longest side of T .

As in Sect. 2.1, we propose the following three-step mesh generation algorithm that aims specifically to achieve this goal at low computational cost. First, a feature map $\sigma(\mathbf{p})$ is extracted from the vector-valued image based on (2.34). Second, an error-diffusion algorithm [81] is employed to distribute mesh nodes non-uniformly in the image domain, with density proportional to $\sigma(\mathbf{p})$. Third and finally, a Delaunay triangulation algorithm [83] is used to connect the mesh nodes.

Feature Map Extraction

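Let $G_k(\mathbf{p})$, $k = 1, 2, \ldots, K$, denote the largest magnitude of the second directional derivative of $f_k(\mathbf{p})$ at point $\mathbf{p}$. The feature map function is defined as

$$\sigma(\mathbf{p}) = \left( \sum_{k=1}^{K} \left[ G_k(\mathbf{p}) \right]^p \right)^{1/p}. \tag{2.35}$$

To compute $G_k(\mathbf{p})$, one can derive the following:

Corollary 3. Let $\mathbf{H}_{k,\mathbf{p}}$ denote the Hessian matrix of $f_k(\mathbf{p})$ at $\mathbf{p}$, and let $\lambda_{k,i}(\mathbf{p})$, $i = 1, 2$, denote the eigenvalues of $\mathbf{H}_{k,\mathbf{p}}$. Then, we have

$$G_k(\mathbf{p}) = \max\left\{ |\lambda_{k,1}(\mathbf{p})|, |\lambda_{k,2}(\mathbf{p})| \right\}. \tag{2.36}$$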
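For a 2D image the eigenvalues of each 2 × 2 Hessian have a closed form, so the joint feature map can be computed without an eigen-solver. The sketch below assumes NumPy finite differences with unit pixel spacing and an image stored as an array of shape (rows, cols, K); the function names and the `p` keyword are illustrative choices, not part of the thesis.

```python
import numpy as np

def largest_curvature_2d(channel):
    """G_k: largest |eigenvalue| of the 2x2 Hessian of one image component."""
    f = np.asarray(channel, dtype=float)
    fy, fx = np.gradient(f)                  # derivatives along rows and columns
    fyy, fyx = np.gradient(fy)
    fxy, fxx = np.gradient(fx)
    b = 0.5 * (fxy + fyx)                    # symmetrized mixed derivative
    mean = 0.5 * (fxx + fyy)
    radius = np.sqrt((0.5 * (fxx - fyy)) ** 2 + b ** 2)
    # Eigenvalues are mean +/- radius, so the largest magnitude is |mean| + radius.
    return np.abs(mean) + radius

def vector_feature_map(image, p=2.0):
    """Joint feature map: the p-norm of (G_1, ..., G_K) at every pixel."""
    img = np.asarray(image, dtype=float)
    G = np.stack([largest_curvature_2d(img[..., k])
                  for k in range(img.shape[-1])], axis=-1)
    if np.isinf(p):
        return G.max(axis=-1)                # L-infinity norm
    return (G ** p).sum(axis=-1) ** (1.0 / p)
```

Passing `p=float("inf")` selects the component with the strongest local curvature at each pixel, while `p=1` simply adds the per-component maps.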

Content-Adaptive Placement of Mesh Nodes

The classical Floyd-Steinberg error-diffusion algorithm was originally intended for

digital halftoning, where the objective is to use the spatial density of ink dots to represent

the image intensity. We apply it here for placing mesh nodes in accordance with the

density specified by the feature map $\sigma(\mathbf{p})$. Besides being fast, this algorithm lets us easily control the number of mesh nodes by adjusting the threshold value of the halftoning step.

Delaunay Triangulation

Among its many interesting properties, Delaunay triangulation is known to yield a

well-structured mesh at a reasonable computational cost. The use of Delaunay

triangulation avoids producing excessively elongated elements, thereby reducing the error

bound in (2.34).

Mesh Nodal Value

Once the mesh structure is obtained, the image can then be represented by

interpolation over each element from its nodal values. The nodal value $f_k(\mathbf{p}_n)$ in (2.32) can simply be taken as the image value at $\mathbf{p}_n$. Alternatively, it can be determined using a least-squares fit such that the mean-squared-error of the interpolated image $\hat{f}_k(\mathbf{p})$, $k = 1, 2, \ldots, K$, is minimized. The latter is found to be more accurate and is adopted in this study.


In this section we present some results obtained by the proposed mesh generation

method using color images. Figure 20(a) shows an original image of size 128 × 128. Figure 20(b) shows the mesh obtained by the proposed algorithm when applied to the RGB components of this image, in which only 2,340 mesh nodes were used (about 14.2% of the number of pixels in the original image); $p = 2$ was used in the error norm. Note that dense mesh elements have been automatically placed in regions containing high-frequency features (such as edges and textures), while coarse elements have been placed in relatively flat regions. The image represented by this mesh is shown in Figure 20(c).

To quantify the accuracy of this mesh representation, we computed its peak signal-to-noise ratio (PSNR), which is 30.2 dB. The PSNR is defined as

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{M \times N \cdot f_{\max}^2}{\sum_{k=1}^{3} \left\| \mathbf{f}_k - \hat{\mathbf{f}}_k \right\|^2} \right) \ \mathrm{dB}, \tag{2.37}$$

where $\mathbf{f}_k$ and $\hat{\mathbf{f}}_k$ denote the original $k$th color component and its mesh representation, respectively, $f_{\max}$ is the image peak value, and $M \times N$ is the image dimension.

For comparison, Figure 21(a) and Figure 21(b) show the mesh and the interpolated image, respectively, obtained by the well-known quadtree method [6] using the same number of mesh nodes. The image has PSNR = 28.9 dB.

Furthermore, Figure 21(c) shows the mesh structure obtained by the proposed algorithm from the intensity of the image alone, and Figure 21(d) presents the image interpolated using the mesh in Figure 21(c). Compared with the mesh structure in Figure 20(b), the color transition boundaries between the hair and cheek areas are clearly not as well defined in this mesh structure. As a result, the image represented by this mesh suffers from color bleeding at these color boundaries. Due to space limitations, results obtained with other test images are not shown in this work.

The PSNR of the different mesh modeling approaches versus the number of model parameters (nodes) is shown in Figure 22(a). The least-squares fit methods outperform the pure interpolation methods. In addition, the proposed method, which uses the $L_\infty$ norm, outperforms the other methods considered. The improvement over intensity-based mesh modeling is in the range 0.2-0.5 dB, and the improvement over the quadtree mesh generation procedure is in the range 0.8-1.7 dB.

The influence of the parameter $p$ on the proposed mesh modeling is shown in Figure 22(b). The values of $p$ considered (Eq. (2.35)) were $p \in \{1, 2, \infty\}$, representing the $L_1$, $L_2$, and $L_\infty$ norms, denoted in the figure as L1_norm, L2_norm, and inf_norm, respectively. The results reveal only a weak influence of $p$ on the PSNR.

Additional results are provided for better visualization at the website:

http://www.ipl.iit.edu/brankov/Color.htm. These results show that the mesh structure

produced by the proposed method is well adapted to the content of the color images.

2.3.5. Discussion.

We have proposed a fast mesh generation approach that can produce a compact and

accurate mesh representation of vector-valued images. We will extend the proposed mesh

representation in tomographic reconstruction of dual-modality cardiac images in Sect. 4.

(a) Original

(b) (c)

Figure 20. (a) Original image (128x128); (b) Mesh structure obtained using the proposed algorithm based on the RGB components of the image in (a), 2,340 mesh nodes used; (c) Image represented by the mesh in (b), PSNR=30.2 dB.

(a) (b)

(c) (d)

Figure 21. (a) Mesh structure obtained by the quadtree method, same number of mesh nodes as in Figure 20(b) used; (b) Image represented by the mesh in (a), PSNR=28.9 dB; (c) Mesh structure obtained based on the intensity of the image in Figure 20(a) alone; (d) Image represented by the mesh in (c), PSNR=29.9 dB. The image represented by this mesh suffers color bleeding at color transition boundaries in the image.

[Figure 22 plots: (a) "PSNR vs Ratio", PSNR [dB] versus number of mesh nodes (2000 to 9000), with curves for quadtree, intensity, proposed, quadtree LS fit, intensity LS fit, and proposed LS fit; (b) "PSNR vs Ratio", PSNR [dB] versus number of mesh nodes (2000 to 9000), with curves for L1_norm, L2_norm, inf_norm, L1_norm LS, L2_norm LS, and inf_norm LS.]

Figure 22. Comparison of the considered mesh modeling algorithms by way of the PSNR quality measure plotted vs. the number of mesh model parameters.

CHAPTER III

3. MESH MODELING FOR TOMOGRAPHIC IMAGE RECONSTRUCTION

Now we will explore the use of mesh modeling on tomographic image reconstruction.

3.1. CAMM based Reconstruction

As we showed, mesh modeling is an efficient and compact method for image

representation and is an effective tool for both rigid and non-rigid motion tracking in

image sequences. As a result, mesh modeling has recently found many important

applications in image processing, including image compression [3, 4, 93], motion

tracking and compensation [6, 7, 94-96], image processing through geometric

manipulation [97], and medical image analysis [98].

In this section we investigate tomographic image reconstruction based on a content-

adaptive mesh model (CAMM) developed in our previous work in Sect. 2, [99, 100]. The

CAMM is an image representation based on nonuniform sampling, in which the samples

(mesh nodes) are placed automatically so that their density varies spatially in relation to

the degree of local image detail. When using a CAMM image representation,

tomographic reconstruction can be performed by estimating the values of the mesh nodes

from the observed data.

The use of a CAMM for image reconstruction may have several potential benefits.

First, a CAMM is a compact image representation, i.e., an image can often be represented

using far fewer mesh nodes than pixels. Thus, a CAMM improves efficiency of

algorithms, and can help alleviate the inconsistent nature of the reconstruction problem.

Second, a CAMM provides a natural spatially-adaptive smoothness mechanism. Finally,

and perhaps most importantly, a CAMM serves as a natural framework for reconstruction

of moving image sequences, wherein mesh elements are allowed to deform over time.

We have demonstrated this capability in prior work, in which we used a CAMM for post-reconstruction spatio-temporal smoothing of image sequences [101]; this will be presented in more detail in Sect. 5.2. The purpose of this section is to establish a basic

framework for the proposed mesh-modeling approach for image reconstruction. Our goal

is to investigate the feasibility and benefits of this approach.

In the literature a great many methods have been developed for improving the quality

of reconstructed images in tomography (see [102] for a review). Most of these methods

are pixel-based, i.e., the image is represented and computed directly in a pixel basis (e.g.,

[26-28, 31-37, 40, 103]). Some methods based on object modeling have also been

described in the literature (e.g., [57, 58, 104-106]). These methods typically assume a

priori a geometric model of the object being imaged. For example, generalized cylinder

models were used in [57] for image reconstruction from incomplete projections;

parametric surface models were investigated in [58, 104]. These methods have a similar

philosophy to our proposed approach in that a model is used to combat the

underdetermined nature of the reconstruction problem. However, to our knowledge,

content-adaptive mesh modeling of images has not been used before as a basis for

tomographic image reconstruction.

The rest of Chapter 3 is organized as follows. In Sect. 3.2 a mesh-model

framework for image reconstruction is introduced, and the mesh-domain imaging model

is derived. Reconstruction algorithms in the mesh domain, based on the expectation-

maximization (EM) algorithm, the ordered-subsets EM (OSEM) method, a weighted

least-squares (WLS) approach, and a maximum a posteriori (MAP) method, are

presented in Sect. 3.3. Experimental results and discussion for 2D reconstruction are

presented in Sect. 3.4. Finally, experiments for fully 3D reconstruction are discussed in

Sect. 3.5.

3.2. Mesh Modeling Framework for Image Reconstruction.

Let us start with a detailed description of the model.

3.2.1. Mesh Representation of Images

Let $f(\mathbf{p})$ denote an image function defined over a domain $D$, which in our problem can be two-dimensional (2D) or three-dimensional (3D), i.e., $\mathbf{p} \in \mathbb{R}^2$ or $\mathbf{p} \in \mathbb{R}^3$. Using the same notation as in Sect. 2.1, the mesh representation can be extended over the whole image domain $D$ as follows:

$$f(\mathbf{p}) = \sum_{n=1}^{N} f(\mathbf{p}_n)\,\phi_n(\mathbf{p}) + e(\mathbf{p}), \tag{3.1}$$

where $\phi_n(\mathbf{p})$ is an interpolation basis function corresponding to node $n$. Consequently, the support of $\phi_n(\mathbf{p})$ is strictly limited to elements attached to node $n$.

In our previous work, Sect. 2.1 or [99] we proposed a fast algorithm that can generate

a very accurate CAMM representation of an image. Herein we use this method as part of

an approach to tomographic image reconstruction.

Before concluding our introduction of the mesh model, let us introduce some notation to facilitate subsequent development of the proposed method. Let $\mathbf{n}$ denote a vector formed from the nodal values of the mesh model, i.e.,

$$\mathbf{n} \equiv [f(\mathbf{p}_1), f(\mathbf{p}_2), \ldots, f(\mathbf{p}_N)]^T, \tag{3.2}$$

where the superscript $T$ denotes transposition. Similarly, let $\boldsymbol{\phi}(\mathbf{p})$ denote a vector formed from the interpolation basis functions, i.e.,

$$\boldsymbol{\phi}(\mathbf{p}) = [\phi_1(\mathbf{p}), \phi_2(\mathbf{p}), \ldots, \phi_N(\mathbf{p})]^T. \tag{3.3}$$

Then (3.1) can be rewritten as

$$f(\mathbf{p}) = \boldsymbol{\phi}^T(\mathbf{p})\,\mathbf{n} + e(\mathbf{p}). \tag{3.4}$$

Now let us relate the mesh-model representation to a conventional pixel

representation. Letting $\mathbf{f}$ denote a vector formed by lexicographic ordering of the pixel values representing the image $f(\mathbf{p})$, we rewrite (3.4) as follows:

$$\mathbf{f} = \boldsymbol{\Phi}\mathbf{n} + \mathbf{e}, \tag{3.5}$$

where $\boldsymbol{\Phi}$ is a matrix in which each row consists of the vector $\boldsymbol{\phi}^T(\mathbf{p})$ evaluated at a particular pixel location in the image, and $\mathbf{e}$ is a similarly obtained vector representation of the error $e(\mathbf{p})$. Equation (3.5) represents the interpolation operation from a mesh representation $\mathbf{n}$ to a pixel representation $\mathbf{f}$.
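To make the interpolation operation in (3.5) concrete, the following is a minimal sketch for a one-dimensional mesh with piecewise-linear (hat) basis functions. The 1D setting, the function name `interpolation_matrix_1d`, and the node positions are illustrative assumptions; the meshes in this work are 2D or 3D.

```python
import numpy as np

def interpolation_matrix_1d(node_pos, pixel_pos):
    """Build Phi for piecewise-linear (hat) basis functions on a 1D mesh.

    Row m holds phi_n(x_m) for all nodes n, so each row has at most two
    non-zero entries (the two nodes of the element containing pixel m).
    """
    node_pos = np.asarray(node_pos, dtype=float)
    pixel_pos = np.asarray(pixel_pos, dtype=float)
    phi = np.zeros((pixel_pos.size, node_pos.size))
    # Index of the left node of the element containing each pixel.
    left = np.clip(np.searchsorted(node_pos, pixel_pos, side="right") - 1,
                   0, node_pos.size - 2)
    width = node_pos[left + 1] - node_pos[left]
    t = (pixel_pos - node_pos[left]) / width   # local coordinate in [0, 1]
    rows = np.arange(pixel_pos.size)
    phi[rows, left] = 1.0 - t
    phi[rows, left + 1] = t
    return phi

nodes = np.array([0.0, 1.0, 2.0, 4.0, 8.0])    # nonuniform node positions
pixels = np.arange(9, dtype=float)             # uniform pixel grid
Phi = interpolation_matrix_1d(nodes, pixels)
n_vals = nodes ** 2                            # nodal values of f(x) = x^2
f_pixels = Phi @ n_vals                        # Eq. (3.5) with e neglected
```

Because each basis function is supported only on the elements attached to its node, the matrix is sparse; a practical implementation would likely store it in a sparse format.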

3.2.2. Mesh Tomography Model

Now we frame the tomography problem in terms of the mesh model introduced

above. In tomographic imaging, the mean of the observed projection data can be modeled

by

$$E[g_i] = \int_D h_i(\mathbf{p})\, f(\mathbf{p})\, d\mathbf{p}, \quad i = 1, 2, \ldots, S, \tag{3.6}$$

where $h_i(\mathbf{p})$ denotes the response of measurement $i$ to an impulse at location $\mathbf{p}$, and $E[\cdot]$ is the expectation operator.

Our goal is to use a mesh model as a basis for estimation of ( )f p from a noisy

realization of the projection data. Thus we require a mesh-domain imaging model to

describe the data. Such a model is obtained by substituting (3.1) into (3.6) as follows:

$$E[g_i] = \sum_{n=1}^{N} f(\mathbf{p}_n) \int_D h_i(\mathbf{p})\, \phi_n(\mathbf{p})\, d\mathbf{p} + \int_D h_i(\mathbf{p})\, e(\mathbf{p})\, d\mathbf{p}. \tag{3.7}$$

Defining

$$a_{i,n} = \int_D h_i(\mathbf{p})\, \phi_n(\mathbf{p})\, d\mathbf{p} \tag{3.8}$$

and

$$\hat{e}_i = \int_D h_i(\mathbf{p})\, e(\mathbf{p})\, d\mathbf{p}, \tag{3.9}$$

we rewrite (3.7) as

$$E[g_i] = \sum_{n=1}^{N} a_{i,n}\, f(\mathbf{p}_n) + \hat{e}_i. \tag{3.10}$$

Now we construct vectors $\mathbf{g} \equiv [g_1, g_2, \ldots, g_S]^T$ and $\hat{\mathbf{e}} \equiv [\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_S]^T$ containing all the measured data and interpolation errors, respectively, and matrix $\mathbf{A} \equiv [a_{i,n}]_{S \times N}$ consisting of all the coefficients in (3.8).

Then, the mesh-domain imaging model becomes simply

$$E[\mathbf{g}] = \mathbf{A}\mathbf{n} + \hat{\mathbf{e}}. \tag{3.11}$$

As we will demonstrate later, a CAMM can provide a very accurate image representation; therefore the interpolation error $\hat{\mathbf{e}}$ in (3.11) is negligible compared to the imaging noise. Thus, neglecting $\hat{\mathbf{e}}$, we obtain a familiar linear imaging model in the mesh domain:

$$E[\mathbf{g}] \approx \mathbf{A}\mathbf{n}. \tag{3.12}$$

The mesh-domain system matrix $\mathbf{A}$ relates the observed data $\mathbf{g}$ to the mesh nodal values $\mathbf{n}$. In SPECT, detector sensitivity cannot be negative; thus, all the elements $a_{i,n}$ of $\mathbf{A}$ are non-negative. Furthermore, it is evident from (3.8) that each element $a_{i,n}$ is determined by two factors: 1) the response functions $h_i(\mathbf{p})$ of the imaging system, and 2) the mesh structure that defines the interpolation functions $\phi_n(\mathbf{p})$. Implementation issues related to $\mathbf{A}$ are discussed in Appendix E.

Based on (3.12) the reconstruction problem becomes that of estimating $\mathbf{n}$ from the observed data $\mathbf{g}$ through the system matrix $\mathbf{A}$. The image $\mathbf{f}$ can then be obtained by using (3.5) (neglecting $\mathbf{e}$), as explained in the next section.

3.3. Image Reconstruction Algorithms using Mesh Modeling.

Note that the mesh-domain imaging model (3.12) has precisely the same form as the

conventional pixel-domain imaging model. The difference between the two

representations lies in the form of the basis functions, as illustrated in Figure 24.

Therefore, existing algorithms for image reconstruction can be used directly to solve

(3.12). In this study we consider maximum-likelihood (ML) [81], weighted least-squares

(WLS) methods [107], and maximum a posteriori (MAP) methods [91].

3.3.1. Maximum-Likelihood Solution.

ML estimation is based on solution of the following problem:

$$\hat{\mathbf{n}} = \arg\max_{\mathbf{n}} \left\{ \log p(\mathbf{g}; \mathbf{n}) \right\}, \tag{3.13}$$

where $p(\mathbf{g}; \mathbf{n})$ is the likelihood function of $\mathbf{g}$ parameterized by $\mathbf{n}$. In this work, we assume a Poisson likelihood, which characterizes emission tomography. Because the mesh-domain imaging model is identical in form to the usual pixel-domain model, the familiar form of the expectation-maximization algorithm [24] for the Poisson-noise case [61], as well as the ordered-subsets EM (OSEM) algorithm [28], can be directly applied.

Mesh Domain EM Algorithm

The EM algorithm for this problem has the following iterative update [54] for the estimates of the nodal values:

$$f^{(j+1)}(\mathbf{p}_n) = \frac{f^{(j)}(\mathbf{p}_n)}{\sum_{i=1}^{S} a_{i,n}} \sum_{i=1}^{S} \frac{a_{i,n}\, g_i}{\sum_{k=1}^{N} a_{i,k}\, f^{(j)}(\mathbf{p}_k)}, \quad n = 1, 2, \ldots, N, \tag{3.14}$$

where $j$ is the iteration index. We refer to this algorithm as MESH-EM.

Because the MESH-EM algorithm in (3.14) has the same form as the familiar pixel-based EM algorithm, it shares the same properties. In particular, the updated mesh nodal values $f^{(j+1)}(\mathbf{p}_n)$ in (3.14) always remain non-negative, provided that their initial estimates are non-negative. This can be readily seen from (3.14) because all the coefficients $a_{i,n}$ are non-negative. Another important property is that the total counts are conserved by the iterates generated from the MESH-EM algorithm, just as they are in the pixel-based EM algorithm. This is true because of the following identity:

$$\sum_{i=1}^{S} \sum_{n=1}^{N} a_{i,n}\, f^{(j+1)}(\mathbf{p}_n) = \sum_{i=1}^{S} g_i, \quad j = 0, 1, 2, \ldots, \tag{3.15}$$

which can be derived from (3.14).
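The update (3.14) and the count-conservation identity (3.15) can be illustrated with a small sketch. The dense random system matrix, the toy problem sizes, and the function name `mesh_em` are assumptions made only for illustration; this is not the implementation used in the experiments.

```python
import numpy as np

def mesh_em(A, g, n_iter=10, n0=None):
    """Iterate the multiplicative EM update of Eq. (3.14) on the nodal values."""
    A = np.asarray(A, dtype=float)
    g = np.asarray(g, dtype=float)
    n = np.ones(A.shape[1]) if n0 is None else np.asarray(n0, dtype=float).copy()
    sens = A.sum(axis=0)                     # sum_i a_{i,n}
    for _ in range(n_iter):
        proj = A @ n                         # sum_k a_{i,k} f(p_k)
        # Ratio g_i / projection, set to zero where the projection vanishes.
        ratio = np.divide(g, proj, out=np.zeros_like(g), where=proj > 0)
        n = n / sens * (A.T @ ratio)
    return n

rng = np.random.default_rng(0)
S, N = 20, 6                                 # toy numbers of bins and nodes
A = rng.random((S, N)) + 0.1                 # strictly positive stand-in for A
truth = rng.random(N) + 0.5
g = rng.poisson(50.0 * (A @ truth)).astype(float)
n_hat = mesh_em(A, g, n_iter=25)

total_projected = (A @ n_hat).sum()          # left-hand side of Eq. (3.15)
total_counts = g.sum()                       # right-hand side of Eq. (3.15)
```

In this toy setting the two totals agree to floating-point precision after every iteration, and the iterates stay non-negative because the update only multiplies non-negative quantities.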

Ordered-Subset EM Algorithm

In an OSEM algorithm [55], the projection data $\mathbf{g}$ are divided into a number of subsets, each containing multiple views. The update expression (3.14) is then computed iteratively over one subset at a time. OSEM has become widely used because it leads to a faster computation than EM [56].

Assuming that the projection data $\mathbf{g}$ are divided into subsets $S_c$, $c = 1, 2, \ldots, C$, then the OSEM update for the nodal values is computed as follows over each subset:

$$f^{(j+1)}(\mathbf{p}_n) = \frac{f^{(j)}(\mathbf{p}_n)}{\sum_{i \in S_c} a_{i,n}} \sum_{i \in S_c} \frac{a_{i,n}\, g_i}{\sum_{k=1}^{N} a_{i,k}\, f^{(j)}(\mathbf{p}_k)}, \quad n = 1, 2, \ldots, N, \tag{3.16}$$

where $j$ is the iteration index.

As mentioned above, the update in (3.16) is computed iteratively over one subset $S_c$ at a time. One round of update over a subset is called a sub-iteration, and one pass through all of the subsets is called an iteration. We refer to this algorithm as MESH-OSEM.
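The subset-wise update (3.16) can be sketched as follows. The interleaved subsets, the noise-free toy data, and the function name `mesh_osem` are illustrative assumptions rather than the settings used in this study.

```python
import numpy as np

def mesh_osem(A, g, subsets, n_iter=4, n0=None):
    """Apply the update of Eq. (3.16) over one subset of bins at a time."""
    A = np.asarray(A, dtype=float)
    g = np.asarray(g, dtype=float)
    n = np.ones(A.shape[1]) if n0 is None else np.asarray(n0, dtype=float).copy()
    for _ in range(n_iter):                  # one iteration = one pass over all subsets
        for idx in subsets:                  # one sub-iteration per subset S_c
            A_c, g_c = A[idx], g[idx]
            proj = A_c @ n
            ratio = np.divide(g_c, proj, out=np.zeros_like(g_c), where=proj > 0)
            n = n / A_c.sum(axis=0) * (A_c.T @ ratio)
    return n

rng = np.random.default_rng(1)
S, N, C = 24, 5, 4                           # toy bins, nodes, and subsets
A = rng.random((S, N)) + 0.1
truth = rng.random(N) + 0.5
g = A @ truth                                # noise-free data, for illustration only
subsets = [np.arange(c, S, C) for c in range(C)]
n_hat = mesh_osem(A, g, subsets, n_iter=50)
```

With consistent (noise-free) data the true nodal values are a fixed point of every sub-iteration, which is a convenient sanity check for an implementation.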

3.3.2. Maximum A Posteriori (MAP) Solution.

Let $p(\mathbf{n})$ denote a prior on the unknown nodal values $\mathbf{n}$. Then the MAP estimate is obtained as:

$$\hat{\mathbf{n}} = \arg\max_{\mathbf{n}} \left\{ \log p(\mathbf{g}; \mathbf{n}) + \log p(\mathbf{n}) \right\}. \tag{3.17}$$

In this study we assume a Gibbs prior [61], i.e.,

$$p(\mathbf{n}) \sim \exp\left[ -\beta\, U(\mathbf{n}) \right], \tag{3.18}$$

where $\beta$ is a scalar weighting parameter, and the potential function $U(\mathbf{n})$ is quadratic:

$$U(\mathbf{n}) = \sum_{n=1}^{N} \sum_{j \in \Re_n} \left[ f(\mathbf{p}_j) - f(\mathbf{p}_n) \right]^2. \tag{3.19}$$

In (3.19), $\Re_n$ denotes the index set of nodes connected to node $n$.

The MAP estimate can be computed by using the following one-step-late expectation-maximization algorithm [108-110]:

$$f^{(j+1)}(\mathbf{p}_n) = \frac{f^{(j)}(\mathbf{p}_n)}{\sum_{i=1}^{S} a_{i,n} + \beta\, \dfrac{\partial}{\partial f(\mathbf{p}_n)} U\!\left(\mathbf{n}^{(j)}\right)} \sum_{i=1}^{S} \frac{a_{i,n}\, g_i}{\sum_{k=1}^{N} a_{i,k}\, f^{(j)}(\mathbf{p}_k)}. \tag{3.20}$$

We refer to this reconstruction algorithm as MESH MAP.
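The one-step-late update can be sketched for the quadratic potential (3.19) as follows. The sketch assumes a symmetric 0/1 adjacency matrix `adj` encoding the neighbor sets, and it scales the prior gradient by the weight $\beta$ of (3.18); all function names are hypothetical.

```python
import numpy as np

def quad_potential(n, adj):
    """U(n) of Eq. (3.19): squared differences summed over connected node pairs."""
    diff = n[None, :] - n[:, None]           # diff[n, j] = n_j - n_n
    return np.sum(adj * diff ** 2)

def quad_potential_grad(n, adj):
    """Gradient of U for a symmetric adjacency matrix.

    Each connected pair appears twice in the double sum, hence the factor 4.
    """
    deg = adj.sum(axis=1)
    return 4.0 * (deg * n - adj @ n)

def mesh_map_osl(A, g, adj, beta, n_iter=10):
    """One-step-late EM: the prior gradient is evaluated at the current estimate."""
    n = np.ones(A.shape[1])
    sens = A.sum(axis=0)
    for _ in range(n_iter):
        proj = A @ n
        ratio = np.divide(g, proj, out=np.zeros_like(g), where=proj > 0)
        denom = sens + beta * quad_potential_grad(n, adj)
        n = n / denom * (A.T @ ratio)
    return n

rng = np.random.default_rng(3)
adj = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)   # 4 nodes connected in a chain
A = rng.random((20, 4)) + 0.1
g = A @ (rng.random(4) + 1.0)
n_map = mesh_map_osl(A, g, adj, beta=1e-3, n_iter=20)
```

A known limitation of one-step-late schemes is that the denominator can become non-positive when the prior weight is large, so a practical implementation would need to guard against that case.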

3.3.3. Weighted Least-Squares Solution.

Without specifically defining the noise distribution function, the weighted least-squares (WLS) method [40] seeks a solution to (3.12) by minimizing the weighted squared error, i.e.,

$$\hat{\mathbf{n}} = \arg\min_{\mathbf{n}} \left\lVert \mathbf{W}(\mathbf{g} - \mathbf{A}\mathbf{n}) \right\rVert^2, \tag{3.21}$$

where $\lVert \cdot \rVert$ is the Euclidean norm, and $\mathbf{W}$ is a weighting matrix defined according to the noise level in the data $\mathbf{g}$.

An advantage of the WLS method is that the objective function is of a quadratic form in terms of the unknown $\mathbf{n}$. This quadratic objective function has a unique solution, provided that $\mathbf{A}$ is of full rank. In this study, we used the conjugate gradient algorithm [88] to perform the optimization. We refer to this reconstruction algorithm as MESH-WLS.

When $\mathbf{A}$ is of full rank, this unique solution can also be written in closed form as:

$$\mathbf{n}_{LS} = \left( \mathbf{A}^T \mathbf{W}^T \mathbf{W} \mathbf{A} \right)^{-1} \mathbf{A}^T \mathbf{W}^T \mathbf{W}\, \mathbf{g}.$$

3.4. Slice (2D) CAMM Reconstruction

Now we focus on a 2D implementation of the proposed method; we will revisit 3D reconstruction in Sect. 3.5.

As we discussed earlier, from the viewpoint of nonuniform sampling, the mesh nodes

should be placed most densely in areas of the image that contain significant details. In our

previous work, Sect. 2 and [99], we proposed an algorithm, based on a theoretical study

of the approximation error of the model, which specifically achieves this goal. The

algorithm yields very accurate image representations at extremely low computational

cost. In this study we employ our mesh-generation algorithm to construct a mesh model

for the image to be reconstructed.

In Sect. 2 we aimed to produce a good mesh structure for a known image. Of course,

here the image to be reconstructed is not known beforehand. Therefore, for the purpose of

mesh generation, we replace the image $f(\mathbf{p})$ with a reference image $\tilde{f}(\mathbf{p})$, the purpose of which is to provide an estimate of the distribution of the local image content, according to which the mesh nodes are then placed. A reference image can be obtained from a preliminary reconstruction of the image using a simple algorithm such as filtered backprojection (FBP). A further option, in multi-modality imaging [38, 39] such as PET/CT, is to use the higher-resolution modality as the reference. This approach is explored later in Sect. 4 and [111].

Once the reference image $\tilde{f}(\mathbf{p})$ is obtained, the mesh is generated by the following procedure: 1) generate a feature map $\sigma(\mathbf{p})$ that represents the spatial distribution of the largest magnitude of the second directional derivatives of $\tilde{f}(\mathbf{p})$; 2) apply an error-diffusion algorithm to the feature map to position the mesh nodes; and 3) use Delaunay triangulation [83] to connect the mesh nodes. The details of some of these steps are further described below.
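Step 2 of this procedure can be illustrated with a standard Floyd-Steinberg error-diffusion pass over the feature map. The choice of the Floyd-Steinberg kernel, the raster scan order, and the scaling of the map so that it sums to the desired node count are assumptions of this sketch; the exact variant used in this work is the one described in Sect. 2.

```python
import numpy as np

def place_nodes_error_diffusion(feature_map, n_nodes):
    """Floyd-Steinberg error diffusion; returns a boolean mask of node pixels."""
    density = np.asarray(feature_map, dtype=float).copy()
    density *= n_nodes / density.sum()       # expected number of nodes is n_nodes
    rows, cols = density.shape
    mask = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            old = density[r, c]
            new = 1.0 if old >= 0.5 else 0.0  # place a node or not
            mask[r, c] = new == 1.0
            err = old - new
            # Push the quantization error onto pixels not yet visited.
            if c + 1 < cols:
                density[r, c + 1] += err * 7 / 16
            if r + 1 < rows:
                if c > 0:
                    density[r + 1, c - 1] += err * 3 / 16
                density[r + 1, c] += err * 5 / 16
                if c + 1 < cols:
                    density[r + 1, c + 1] += err * 1 / 16
    return mask

# Toy feature map: the right half contains nine times more "detail".
sigma = np.ones((32, 32))
sigma[:, 16:] = 9.0
mask = place_nodes_error_diffusion(sigma, n_nodes=200)
```

The number of nodes actually placed is only approximately `n_nodes`, because some diffused error is lost at the image borders. The resulting node positions would then be passed to a Delaunay triangulation routine.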

3.4.1. Feature Map Extraction

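The feature map $\sigma(\mathbf{p})$ is computed from the reference image $\tilde{f}(\mathbf{p})$ as follows:

$$\sigma(\mathbf{p}) = \left[ \max\left\{ \left| \frac{\partial^2 \tilde{f}(\mathbf{p})}{\partial x^2} \right|, \left| \frac{\partial^2 \tilde{f}(\mathbf{p})}{\partial x\, \partial y} \right|, \left| \frac{\partial^2 \tilde{f}(\mathbf{p})}{\partial y^2} \right| \right\} \right]^{\gamma}, \tag{3.22}$$

where $\gamma > 0$ is a constant used to adjust the sensitivity of the mesh structure to edge features in the image. The feature map $\sigma(\mathbf{p})$ in (3.22) is based on an approximation to the largest magnitude of the second directional derivatives of $\tilde{f}(\mathbf{p})$ at $\mathbf{p}$. In practice the reference image $\tilde{f}(\mathbf{p})$ is available in a discrete pixel representation. Accordingly, the feature map in (3.22) is computed using finite-difference approximations.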
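A finite-difference evaluation of (3.22) might look like the following sketch. Unit pixel spacing, central differences, and edge replication at the image border are assumptions of the sketch, and the function name `feature_map` is illustrative.

```python
import numpy as np

def feature_map(ref, gamma=0.8):
    """sigma(p) = (largest |second derivative|)^gamma via central differences."""
    f = np.pad(np.asarray(ref, dtype=float), 1, mode="edge")
    center = f[1:-1, 1:-1]
    fxx = f[1:-1, 2:] - 2.0 * center + f[1:-1, :-2]      # x runs along columns
    fyy = f[2:, 1:-1] - 2.0 * center + f[:-2, 1:-1]      # y runs along rows
    fxy = (f[2:, 2:] - f[2:, :-2] - f[:-2, 2:] + f[:-2, :-2]) / 4.0
    largest = np.maximum.reduce([np.abs(fxx), np.abs(fxy), np.abs(fyy)])
    return largest ** gamma

# Toy reference image f(x, y) = x^2, whose only non-zero second derivative is 2.
ref = np.tile(np.arange(8.0) ** 2, (8, 1))
sigma = feature_map(ref, gamma=0.8)
```

Away from the left and right borders, where edge replication distorts the stencil, the map equals $2^{0.8}$ everywhere for this test image.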

3.4.2. Determining the Number of Mesh Nodes

The accuracy of a mesh representation depends on the number of mesh nodes used,

leading to the question: What is the optimal number of mesh nodes for image

reconstruction? In this study we apply the minimum description length (MDL) principle

[107, 112], a well-known approach to model selection, to determine the number of

nodes. The MDL approach is to select, among many alternatives, the model that encodes

the reference image with the minimum number of bits. In our problem, the model is the

mesh representation in (3.5), which is specified by the mesh structure and the

approximation error.

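Under the assumption that the approximation error $\mathbf{e}$ in (3.5) is independent and identically distributed, zero-mean, and Gaussian, with unknown variance $\sigma_N^2$, we can write the MDL objective function as

$$\mathrm{MDL}(N) = -\log p(\mathbf{e}; \hat{\mathbf{n}}, \hat{\sigma}_N^2) + \frac{N+1}{2} \log M, \tag{3.23}$$

where $N$ is the number of mesh nodes, $M$ is the total number of pixels in the reference image, $\hat{\mathbf{n}}$ and $\hat{\sigma}_N^2$ are ML estimates of $\mathbf{n}$ and $\sigma_N^2$, respectively, and $p(\mathbf{e}; \hat{\mathbf{n}}, \hat{\sigma}_N^2)$ is the likelihood function of $\mathbf{e}$. By assumption, the likelihood function of $\mathbf{e}$ is given by

$$p(\mathbf{e}; \mathbf{n}, \sigma_N^2) = \frac{1}{(2\pi\sigma_N^2)^{M/2}} \exp\left( -\frac{\lVert \tilde{\mathbf{f}} - \boldsymbol{\Phi}\mathbf{n} \rVert^2}{2\sigma_N^2} \right), \tag{3.24}$$

where $\tilde{\mathbf{f}}$ denotes the reference image $\tilde{f}(\mathbf{p})$ in vector form. The dependency of both $\boldsymbol{\Phi}$ and $\mathbf{n}$ on $N$ is suppressed in (3.24) for notational simplicity.

The ML estimates of $\mathbf{n}$ and $\sigma_N^2$ are

$$\hat{\mathbf{n}} = (\boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \tilde{\mathbf{f}}, \quad \text{and} \quad \hat{\sigma}_N^2 = \frac{1}{M} \lVert \tilde{\mathbf{f}} - \boldsymbol{\Phi}\hat{\mathbf{n}} \rVert^2; \tag{3.25}$$

thus, the MDL objective function in (3.23) can be written as

$$\mathrm{MDL}(N) = \frac{M}{2} \log\left( 2\pi\hat{\sigma}_N^2 \right) + \frac{M}{2} + \frac{N+1}{2} \log M. \tag{3.26}$$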

According to the MDL principle, the number of mesh nodes N is determined by

minimization of ( )MDL N . As we will show, for our application, the number of mesh

nodes specified by the MDL criterion is also the best for image quality, as judged by a

task-based numerical observer; therefore, MDL appears to be a good strategy for model

selection.
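The model-selection rule can be sketched on a one-dimensional toy signal. For brevity the sketch uses uniformly spaced linear hat functions rather than a content-adaptive mesh, and the signal, noise level, and candidate node counts are all hypothetical.

```python
import numpy as np

def hat_matrix(x, n_nodes):
    """Interpolation matrix Phi for n_nodes uniformly spaced linear hat functions."""
    nodes = np.linspace(x[0], x[-1], n_nodes)
    return np.stack([np.interp(x, nodes, row) for row in np.eye(n_nodes)], axis=1)

def mdl(Phi, ref):
    """MDL(N) of Eq. (3.26) for an M x N matrix Phi and a reference signal of length M."""
    M, N = Phi.shape
    n_hat, *_ = np.linalg.lstsq(Phi, ref, rcond=None)    # ML nodal values, Eq. (3.25)
    var_hat = np.sum((ref - Phi @ n_hat) ** 2) / M       # ML variance, Eq. (3.25)
    return 0.5 * M * np.log(2 * np.pi * var_hat) + 0.5 * M + 0.5 * (N + 1) * np.log(M)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 256)
ref = np.sin(6 * np.pi * x) + 0.05 * rng.standard_normal(x.size)
candidates = [4, 8, 16, 32, 64, 128]
scores = {N: mdl(hat_matrix(x, N), ref) for N in candidates}
best_N = min(scores, key=scores.get)                     # node count minimizing MDL
```

The first two terms of (3.26) reward a small residual, while the last term penalizes additional nodes, so the minimum reflects a trade-off between fidelity and model size.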

3.4.3. Computation of Mesh-Domain System Matrix

Once the mesh is obtained, the mesh-domain system matrix can be computed according to (3.8). Recall that the interpolation function $\phi_n(\mathbf{p})$ has support only over those elements attached to node $n$. Let $\tilde{D}_n$ denote the support of $\phi_n(\mathbf{p})$. Then the integration in (3.8) reduces to

$$a_{i,n} = \int_{\tilde{D}_n} h_i(\mathbf{p})\, \phi_n(\mathbf{p})\, d\mathbf{p}. \tag{3.27}$$

Therefore, the quantity $a_{i,n}$ can be computed efficiently by taking advantage of the fact that $\phi_n(\mathbf{p})$ has only limited support. Indeed, the computation in (3.27) can be further expressed as integration over those individual elements attached to node $n$. As discussed in Appendix E, the integration over these individual elements can be simplified through the use of a master element. When the analytical form of the response function $h_i(\mathbf{p})$ is known, the integration in (3.27) can then be pre-calculated in a closed analytical form. This, of course, can greatly reduce the overhead associated with computing the system matrix $\mathbf{A}$. In this study, we simply measured the system matrix by probing the input with an impulse function. In future work, we will refine the method by developing an analytic model.

Note that the system matrix can be obtained as $\mathbf{A} = \mathbf{H}\boldsymbol{\Phi}$, where $\mathbf{H}$ is a classical pixel-based imaging matrix and $\boldsymbol{\Phi}$ is the interpolation matrix defined in (3.5).
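The factorization of the mesh-domain system matrix into a pixel-domain system matrix and the interpolation matrix can be checked numerically with a short sketch. Both matrices below are random stand-ins with toy dimensions, chosen only to show the shapes and the equivalence of the two projection routes.

```python
import numpy as np

rng = np.random.default_rng(0)
S, M, N = 12, 9, 5                      # toy numbers of bins, pixels, and mesh nodes
H = rng.random((S, M))                  # stand-in for a pixel-domain system matrix
# Stand-in interpolation matrix: non-negative rows that sum to one.
Phi = rng.random((M, N))
Phi /= Phi.sum(axis=1, keepdims=True)

A = H @ Phi                             # mesh-domain system matrix, shape (S, N)

n_vals = rng.random(N)
# Projecting the interpolated image equals applying A to the nodal values.
g_via_pixels = H @ (Phi @ n_vals)
g_via_mesh = A @ n_vals
```

Since both factors are non-negative, the product is non-negative as well, consistent with the non-negativity of the elements of the system matrix noted in Sect. 3.2.2.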


Simulation Data

The proposed CAMM-based reconstruction algorithms were tested using the 4D gated mathematical cardiac-torso (gMCAT) D1.01 phantom [91], which is a time sequence of 16 3D images. The field of view (FOV) was 28.8 cm. Poisson noise, at a level of 4 million total counts per 3D time-frame image, was introduced into the projections to simulate a clinical $^{99m}$Tc gated cardiac-perfusion SPECT study. Our experiments were based on a single slice (No. 35) of the phantom, which has approximately 2400 counts per frame. For each frame, the projections consisted of 64 bins at 64 views over 360°, yielding a total of $S = 4096$ bins. Thus, there was an average of approximately 0.6 counts per projection bin (2400/4096). The system had a blur of approximately 9 mm full width at half-maximum (FWHM) at the center of the FOV. No attenuation correction was used, and each image frame was reconstructed separately.

Mesh Generation

The mesh structure was estimated from the projection data using the procedure described previously. The reference image $\tilde{f}(\mathbf{p})$ was a smooth filtered-backprojection (FBP) reconstruction of a sum of the 16 frames of data. In Figure 25(a) we show an example of $\tilde{f}(\mathbf{p})$, consisting of $64 \times 64$ pixels, obtained from one particular noise realization. This image was obtained using a filter with a bandwidth of 0.15 cycles/pixel. In Figure 25(b) we show the resulting mesh structure, constructed from 819 mesh nodes. In the mesh-generation procedure we used $\gamma = 0.8$ in (3.22). As can be seen, the algorithm automatically places mesh nodes most densely in the important heart regions, and most sparsely in the background.

In Figure 25(b) the mesh model uses about one-fifth as many nodes as projection bins; thus it is a very compact representation. With 819 nodes, the approximation accuracy of the mesh model, as measured by the peak signal-to-noise ratio (PSNR), was found to be 42.8 dB, computed as in Eq. (2.26).

Next, we investigate the optimal number of mesh nodes for a mesh model based on the MDL principle. For this purpose, the original image with different levels of blur, measured by FWHM, was used as the reference image for mesh generation; the blur simulates the effect of the system blur. In this part of the experiments we used noise-free images to determine the best-case scenario for mesh representation. In Figure 26 we show a plot of the resulting MDL function versus the number of mesh nodes. According to this plot, the MDL principle indicates that, for the case of FWHM = 10 mm (which is close to the average system blur in this study), the best mesh model is obtained when the number of mesh nodes is between 700 and 1000 (i.e., roughly one-sixth to one-quarter of the number of projection bins).

Other Methods for Comparison

For comparison purposes, we also considered the following well-known

reconstruction procedures in this study: 1) filtered backprojection (FBP); 2) pixel-based

ML-EM reconstruction [24] (Pixel EM); and 3) a pixel-based MAP method [61] with a

spatial Gibbs prior [28] (Pixel MAP). We also considered the following two accelerated

versions of the pixel-based ML-EM algorithms: the ordered-subset reconstruction method

[54] (Pixel OSEM), and rescaled block-iterative EM [55] (Pixel RBI). Finally, we

considered a rescaled block-iterative MAP reconstruction algorithm [56] (Pixel RBI-

MAP). These accelerated algorithms were mainly used for comparing the execution

times, as they produce similar images to their non-accelerated counterparts.

The parameters used for the spatial Gibbs prior are $\alpha = 1$, $\beta = 0.1$, $\delta = 3$, and $\gamma = 0.35$, as in [61], for both Pixel MAP and Pixel RBI-MAP. Note that here we are using the same notation for these parameters as in [61]. This should not be confused with any other use of these symbols elsewhere in this work.

For the accelerated OS- and RBI-type methods, we used a total of 16 non-overlapping subsets, each of which consists of projections along four mutually orthogonal directions. Specifically, the first subset consisted of the following projections: {0°, 90°, 180°, 270°}; the second consisted of {45°, 135°, 225°, 315°}; the third consisted of {22.5°, 112.5°, 202.5°, 292.5°}, and so forth.
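One way to generate a subset ordering that reproduces the first three subsets listed above is to visit the 16 starting views in bit-reversed order. The ordering beyond the third subset is not stated in the text, so the bit-reversal rule in this sketch is an assumption, as is the helper name `bit_reverse`.

```python
import numpy as np

def bit_reverse(value, n_bits):
    """Reverse the n_bits-bit binary representation of value."""
    out = 0
    for _ in range(n_bits):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

n_views, n_subsets = 64, 16
step = 360.0 / n_views                       # 5.625 degrees between adjacent views
views_per_subset = n_views // n_subsets      # 4 views, 90 degrees apart

subsets = []
for c in range(n_subsets):
    start = bit_reverse(c, 4)                # 0, 8, 4, 12, ... (assumed ordering)
    subsets.append(start + n_subsets * np.arange(views_per_subset))

angles = [step * idx for idx in subsets]     # view angles of each subset in degrees
```

Orderings of this kind keep consecutive subsets far apart in angle, which is a common heuristic in ordered-subsets methods.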

3.4.5. Performance Evaluation.

Evaluation Methods

We assessed image quality by measuring the detectability of cardiac perfusion defects

in the reconstructed images. For this purpose, we consider an image to be “good” if it

allows perfusion defects to be detected accurately. We used a numerical observer in place

of human observers to measure detectability. Specifically, we applied a channelized

Hotelling observer (CHO) [108, 109] to detect the presence of a simulated perfusion

defect. The CHO is a generalized likelihood-ratio detector, with input modeling the

human visual system. It produces binary decisions, i.e., “lesion is present” or “lesion is

absent” at the location of interest. In our implementation 16 input channels were used,

corresponding to four constant-Q frequency-bands with four orientations within each

band. The frequency selectivity of these input channels is illustrated in Figure 27.

The CHO was applied to reconstructed images of a modified version of the gMCAT

phantom having a simulated perfusion defect in the myocardium. The simulated defect

was generated in the first time frame as described in [110]. In Figure 28 we show an

image of slice #35 of this frame, in which the defect regions are indicated by arrows. The

CHO was applied to detect the region located in the interior wall (indicated by the right

arrow) in this slice. The 16 input channels of the CHO were centered at this low-intensity

spot.

The performance of each reconstruction method was summarized using the area

under the receiver operating characteristic (ROC) curve. In the ROC study, 200 noise

realizations of the reconstructed images were used: 100 with the defect present and 100

with the defect absent.
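Given the observer's test statistics for the defect-present and defect-absent images, the area under the ROC curve can be estimated nonparametrically with the Mann-Whitney statistic, as sketched below. This is a swapped-in estimator shown for illustration; the text does not state how the area was computed in this study, and the simulated scores are hypothetical.

```python
import numpy as np

def auc_mann_whitney(scores_present, scores_absent):
    """Nonparametric area under the ROC curve (ties count one half)."""
    present = np.asarray(scores_present, dtype=float)[:, None]
    absent = np.asarray(scores_absent, dtype=float)[None, :]
    wins = (present > absent).sum() + 0.5 * (present == absent).sum()
    return wins / (present.size * absent.size)

# Hypothetical observer outputs: 100 defect-present and 100 defect-absent images.
rng = np.random.default_rng(0)
t_present = rng.normal(loc=2.0, scale=1.0, size=100)
t_absent = rng.normal(loc=0.0, scale=1.0, size=100)
area = auc_mann_whitney(t_present, t_absent)
```

The estimate equals the fraction of (present, absent) image pairs that the observer ranks correctly, so it lies between 0 and 1, with 0.5 corresponding to chance performance.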

Evaluation Results

We summarize the numerical results obtained by MESH ML in Figure 29(a), where

the area under the ROC curve, denoted by $A_z$, is plotted for different numbers of mesh

nodes and iterations. The case of 4096 mesh nodes corresponds to Pixel ML. As judged

by the CHO, the best detection performance is obtained when the number of mesh nodes

is between 600 and 1000 and the MESH ML algorithm is run for four iterations. This is

consistent with the optimum number of nodes determined earlier by the MDL principle

(Figure 26), suggesting that the MDL is indeed a good way to select the number of nodes.

In Figure 29(b) we show the results obtained by the reconstruction methods tested.

Whenever applicable, these results were obtained for the best parametric setting

(determined empirically) for each method. For MESH ML, the number of mesh nodes

used was 819; for MESH MAP, the number of mesh nodes used was 558, and $\beta = 0.05$

in (3.18); the parameters for Pixel RBI-MAP were as described earlier. These parameter

settings were also used in subsequent results. These results suggest that the best

performance was obtained by both MESH ML and MESH MAP when stopped at four

iterations.

In Figure 30 we show some reconstructed images of frame 1 (slice #35) by MESH

ML for different numbers of mesh nodes and different numbers of iterations (as in Figure

29(a)).

Finally, as a quantitative measure of the overall accuracy, we show in Figure 31 a

plot of the PSNR values of the reconstructed images versus the number of iterations for

different methods. For consistency, the FBP images were post-filtered by a lowpass filter

with a cutoff frequency of 0.35 cycles/pixel (which yielded the best PSNR results). The

best PSNR results were obtained by MESH ML around 15 iterations.

In Figure 32 we summarize the execution time for all the methods considered. The

abscissa in the plot represents the number of “effective iterations” by each algorithm, and

the ordinate represents the execution time (normalized by the time of one iteration of

Pixel ML). One “effective iteration” corresponds to one cycle in which every pixel in the

image is updated once. As can be seen from Figure 32, the overall overhead for computing both the mesh and the mesh-domain imaging matrix is equivalent to 2 units of execution time. When compared after 24 effective iterations, MESH OSEM is nearly twice as fast as the fastest pixel-based method (Pixel OSEM); also, MESH MAP is almost as fast as the accelerated Pixel RBI-MAP. Note that MESH MAP could also be accelerated using an RBI-type algorithm (but this was not implemented in this study).

Finally, we mention that the MESH WLS method was not considered in the numerical study owing to the low-count nature of the data; the MESH ML and MESH MAP methods, in contrast, are specifically adapted to the Poisson statistics of the data.

3.4.6. Discussion.

In this section we proposed a mesh modeling approach for tomographic image

reconstruction. In this approach we first model the image to be reconstructed by a

compact mesh representation. The problem of image reconstruction then becomes that of

estimating the parameters of this model. A key feature in this mesh model is that it uses

customized non-uniform sampling, in which samples are placed most densely in areas

that contain significant detail. The imaging model was then derived based on this mesh

representation, and the reconstruction algorithms were derived based on ML, MAP, and

WLS methods. The proposed reconstruction approach was evaluated for detection of

perfusion defects in cardiac gated SPECT images, where an ROC study was performed

using a channelized Hotelling observer. Our experimental results demonstrate that the

proposed approach outperforms several commonly used methods for image

reconstruction.

In this section only the 2D mesh model was used; we conjecture that the use of a fully 3D CAMM could offer even greater advantage for image reconstruction. This is explored in the next section.

(a) (b)

Figure 23. Mesh modeling of an image involves partitioning the image domain into a collection of non-overlapping (generally polygonal) patches, called mesh elements (left); the image function is then determined over each element through interpolation from the mesh nodes of the elements (right). The contribution of a node to the image is limited to the extent of those elements attached to that node. With a mesh model, one can strategically place the mesh nodes most densely in regions containing significant features, resulting in a more compact representation of the image than a pixel representation.

(a) (b)

Figure 24. Illustration of a pixel model (left) and a mesh-based model (right) for the case of SPECT imaging. In a mesh model the contribution of mesh node $j$ to the measurement data is spatially varying, while in a pixel model all the pixels play the same role. The support of basis function $\phi_j(\mathbf{p})$ is limited to those elements attached to the node $j$.

(a) (b)

Figure 25. (a) The sum of 16 image frames ($64 \times 64$ pixels) of one 2D slice obtained from the 4D gMCAT phantom in one noise realization; (b) the mesh structure obtained for the summed image after low-pass filtering with cutoff frequency 0.15 cycles/pixel. A total of 819 mesh nodes were used. This mesh representation has PSNR=42.2 dB.

[Plot omitted. Horizontal axis: number of nodes; vertical axis: MDL function; one curve each for FWHM values of 7 mm, 10 mm, and 15 mm.]

Figure 26. Plot of the MDL function vs. the number of mesh nodes for the reference image obtained from the summed image in Fig. 3 with different FWHM values.


Figure 27. Illustration of the 16 input channels, which consist of four constant-Q frequency bands with four orientations within each band, used by the CHO in the frequency domain. The four sub-images in each row are the frequency responses of the four bandpass filters within a frequency band.

Figure 28. Simulated perfusion defects (indicated by arrows) introduced in the gMCAT phantom (slice #35).


[Plots omitted. (a) $A_z$ vs. number of nodes (4096, 1365, 819, 585, 455), with one curve per iteration count (4, 6, 8, 10, 20, 30, 40); (b) $A_z$ vs. number of iterations for Pixel ML, Pixel RBI-MAP, MESH ML, and MESH MAP.]

Figure 29. (a) The area under the ROC curve, denoted by $A_z$, obtained by the MESH ML algorithm when different numbers of mesh nodes were used with different numbers of iterations (see legend). The case of 4096 mesh nodes corresponds to the Pixel ML algorithm. These results suggest that, as judged by the CHO, the best detection performance was obtained when the number of mesh nodes was between 600 and 900 and the MESH ML algorithm was run for four iterations. (b) $A_z$ obtained by different reconstruction methods, including Pixel ML, Pixel RBI-MAP, MESH ML, and MESH MAP.


Figure 30. Images reconstructed by the MESH-EM algorithms when different numbers of mesh nodes and different numbers of iterations are used. Results are also shown for the ML-EM algorithm.


[Plot omitted. Horizontal axis: number of iterations; vertical axis: PSNR [dB]; curves for FBP, Pixel ML, Pixel RBI-MAP, MESH ML, and MESH MAP.]

Figure 31. PSNR vs. the number of iterations for different reconstruction methods.

[Plot omitted. Horizontal axis: iterations (effective); vertical axis: time (1 EM iteration = 1); curves for Pixel OSL-MAP, Pixel ML, Pixel RBI-ML/RBI-MAP, Pixel OSEM, MESH-ML/MESH-MAP, and MESH-OSEM.]

Figure 32. Computation time for various reconstruction methods. The abscissa represents the number of “effective iterations” by each algorithm, and the ordinate represents the execution time (normalized by the time of one iteration of Pixel ML). The overall overhead for computing both the mesh and the mesh-domain imaging matrix is equivalent to 2 units of execution time. Mesh algorithms gain a computational advantage after a few iterations.


3.5. Volumetric (3D) CAMM Reconstruction

We now focus on developing a content-adaptive volumetric mesh model for fully three-dimensional (3D) tomographic image reconstruction. The proposed methods are tested using gated cardiac-perfusion images. Initial results demonstrate that the proposed approach achieves good performance when compared to several commonly used methods for image reconstruction, and produces results very rapidly.


In recent years there has been growing interest in fully-3D tomographic image

reconstruction. A major challenge in fully-3D reconstruction lies in its memory

requirement and demanding computation time. Like their 2D counterparts, most 3D

reconstruction methods have traditionally been developed based on voxel image

representations [113]. Bayesian priors (e.g., [28]) or regularization terms (e.g., [40]) are

often used to combat the effect of noise.

In our previous work in [100] and Sect. 3.4, a content-adaptive mesh modeling

approach was proposed for 2D image reconstruction. It was demonstrated that such an

approach can outperform several well-known reconstruction algorithms in terms of both

reconstructed image quality and computation time. In this study, we extend this approach

to fully-3D image reconstruction. In this new approach, the image is first modeled by a

volumetric mesh model, on the basis of which a customized basis representation is

obtained for the image. The parameters of this representation are then estimated from the

data.

To illustrate the idea, we show in Figure 33 a 3D mesh model where,

for clarity, only the distribution of the mesh nodes on the surface of organs is shown.


Figure 33. 3D mesh model where, for clarity, only the distribution of the mesh nodes on the surface of organs is shown.

3.5.2. Methods.

In this section we investigate the maximum-likelihood (ML) estimate of the nodal values in $\mathbf{n}$, the values of the mesh representation. The ML estimate is obtained as

$$\hat{\mathbf{n}}_{ML} = \arg\max_{\mathbf{n}} \left\{ \log p(\mathbf{g};\mathbf{n}) \right\}, \tag{3.28}$$

where $p(\mathbf{g};\mathbf{n})$ is the likelihood function of $\mathbf{g}$ parameterized by $\mathbf{n}$. In this work, we assume a Poisson likelihood, which characterizes emission tomography. The ML estimate was addressed in Sect. 3.3.1.
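For readers who want a concrete picture of the ML estimate under a Poisson likelihood, the following is a minimal sketch of the standard ML-EM iteration applied to a mesh-domain model $E[\mathbf{g}] = \mathbf{A}\mathbf{n}$. It is not the thesis implementation (which was written in MATLAB); the function name `mlem`, the pure-Python list representation, and the toy matrix are assumptions made for illustration only.

```python
def mlem(A, g, n_iter=50):
    """ML-EM iteration for a Poisson model E[g] = A n.

    A: list of T rows, each a list of N non-negative floats (system matrix).
    g: list of T measured counts.
    Returns a list of N estimated nodal values.
    """
    T, N = len(A), len(A[0])
    # Sensitivity of each node: column sums of A.
    sens = [sum(A[t][s] for t in range(T)) for s in range(N)]
    n = [1.0] * N  # strictly positive starting point
    for _ in range(n_iter):
        # Forward projection of the current estimate.
        proj = [sum(A[t][k] * n[k] for k in range(N)) for t in range(T)]
        # Ratio of measured to predicted counts (guarding against division by zero).
        ratio = [g[t] / proj[t] if proj[t] > 0 else 0.0 for t in range(T)]
        # Back-project the ratio and apply the multiplicative update.
        n = [n[s] / sens[s] * sum(A[t][s] * ratio[t] for t in range(T))
             for s in range(N)]
    return n


# Toy example (hypothetical): 3 measurements, 2 nodes, noise-free data.
A = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
true_n = [4.0, 2.0]
g = [sum(a * x for a, x in zip(row, true_n)) for row in A]
estimate = mlem(A, g, n_iter=200)
```

Because the update is multiplicative, a positive starting point keeps every nodal value non-negative throughout the iterations.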

3.5.3. Preliminary Results.

Simulation Data

To demonstrate the proposed CAMM-based reconstruction approach, we used the 4D

gated mathematical cardiac-torso (gMCAT) D1.01 phantom [91], which is a time

sequence of 16 3D images. The field of view was 36 cm; the pixel size was 5.625 mm.

Poisson noise, at a level of 4 million total counts per 3D time-frame image, was


introduced into the projections to simulate a clinical 99mTc study. No attenuation

correction was used.

Volumetric mesh generation

The key to the proposed approach lies in how to construct a CAMM that is compact

and accurate for representing the volumetric image to be reconstructed. For this purpose

we use the method for 3D mesh modeling described in Sect. 2.2.

Of course, for tomographic image reconstruction the mesh structure has to be

estimated from the observed data. The following procedure was demonstrated to work

well in our studies. First, the projection data are summed over the 16 gated frames. From

these summed projections an image is reconstructed using the filtered back projection

(FBP) algorithm. The resulting image, denoted by $f(\mathbf{p})$, provides a rough estimate of the heart summed over all 16 frames. The mesh structure is then created based on $f(\mathbf{p})$ using the steps described above. The resulting 3D mesh is shown in Figure 33, where, for clarity, only the distribution of the mesh nodes (instead of the tetrahedral elements) is shown. The number of mesh nodes used was 6,494 (compared to 131,072 voxels). As can be seen, the mesh obtained by the proposed method is well adapted to the content of the 3D volumetric image. Specifically, mesh nodes are placed densely in the important heart regions, and sparsely in the background. Animations of the mesh, which are not shown here, are provided at the following web site for better visualization: http://www.ipl.iit.edu/brankov/Rotate.htm.

The obtained mesh structure was then used as a basis on which each of the 16 3D

image frames in the sequence was reconstructed. In future work, we will optimize the

mesh to track motion from frame to frame.


Computation of Mesh-Domain System Matrix

As in the 2D case, we obtained the mesh-domain system matrix by measuring it directly, probing the input with an impulse function.
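The probing idea can be sketched as follows: set one nodal value to 1 and all others to 0, interpolate to the image grid, forward-project, and store the result as one column of the mesh-domain matrix. The sketch below is illustrative only; `probe_system_matrix`, `forward_project`, and `interp` are hypothetical names, and the 1D five-pixel "image" with a two-bin projector is a toy stand-in for the real imaging system.

```python
def probe_system_matrix(forward_project, interp, n_nodes):
    """Build the mesh-domain matrix A column by column.

    forward_project: maps a pixel image (list of floats) to projection data.
    interp: maps nodal values to a pixel image via the mesh interpolation.
    """
    columns = []
    for j in range(n_nodes):
        impulse = [0.0] * n_nodes
        impulse[j] = 1.0  # unit value at node j only
        columns.append(forward_project(interp(impulse)))
    # Transpose so that A[t][j] is the response of bin t to node j.
    return [list(row) for row in zip(*columns)]


# Toy stand-ins (hypothetical): 5 pixels, nodes at pixels 0, 2, and 4.
H = [[1.0, 1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0, 1.0]]


def forward_project(image):
    return [sum(h * x for h, x in zip(row, image)) for row in H]


def interp(n):
    # Linear interpolation between neighboring nodes.
    return [n[0], 0.5 * (n[0] + n[1]), n[1], 0.5 * (n[1] + n[2]), n[2]]


A = probe_system_matrix(forward_project, interp, n_nodes=3)

# By linearity, A applied to any nodal vector matches the direct computation.
nodal = [2.0, 4.0, 6.0]
via_A = [sum(a * x for a, x in zip(row, nodal)) for row in A]
direct = forward_project(interp(nodal))
```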

Reconstruction methods considered

In addition to the two proposed reconstruction algorithms, we also considered in this

preliminary study the following two well-known reconstruction procedures for

comparison purposes: (1) filtered back projection (FBP); and (2) pixel-based ML. To

help reduce the noise level, the results from these two methods were post-processed with

a 3D low-pass filter of order 17 with a cutoff frequency of 0.65 (normalized by π). For

consistency in the comparison, this same post-filtering was also applied to the proposed

mesh reconstruction method in the final results. Each of the iterative reconstruction

algorithms was run for 30 iterations.

3.5.4. Performance Evaluation.

For visual comparison, some representative 2D slices of frame #1, obtained by

different reconstruction methods, are shown in Figure 34. The images in Figure 34(a)

were from the original phantom, degraded by the intrinsic system blur. These images

represent the ideal case of noise-free projection data. The images reconstructed using

FBP are shown in Figure 34(b). The ML-EM results are shown in Figure 34(c), and the

MESH-EM results are given in Figure 34(d). The MESH-EM algorithm appears to

produce slightly better images, capturing the heart wall and achieving reasonable

smoothness in the background. The ML-EM algorithm produced similar results, though slightly noisier than those of the MESH-EM. As a preliminary assessment of the accuracy, the peak signal-to-noise ratio (PSNR), defined as in Eq. (2.31), was computed for the reconstructed 3D images. The PSNRs of the reconstructed images for frame #1 by the FBP, ML-EM, and MESH-EM are 17.69 dB, 21.46 dB, and 22.22 dB, respectively.
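As a reminder of how such a figure of merit is computed, here is a minimal sketch of a PSNR calculation. Eq. (2.31) is not reproduced in this section, so the sketch assumes the common definition, $10\log_{10}(\text{peak}^2/\text{MSE})$ with the peak taken from the reference image; the thesis definition may differ in detail. The function name `psnr` and the toy data are illustrative.

```python
import math


def psnr(reference, estimate):
    """PSNR in dB, assuming the form 10*log10(peak^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    peak = max(reference)
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: every voxel is off by 1 and the peak is 10.
value_db = psnr([0.0, 10.0, 10.0, 0.0], [1.0, 9.0, 11.0, 1.0])
```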

As for the execution time, the MESH-EM takes about 2.8 sec per iteration for one 3D frame, while the ML-EM takes about 19.7 sec (implemented in MATLAB on a 2 GHz Pentium-4 PC). Note that the MESH-EM requires the overhead of pre-computing the mesh-domain matrix $\mathbf{A}$ in Eq. (3.12). In our implementation the total time for computing $\mathbf{A}$ was around 48 sec per frame (equivalent to about 2.5 iterations of ML-EM). This can be further reduced in a more efficient implementation. Nevertheless, the effect of this overhead will diminish after only 3 iterations, as the MESH-EM is much faster per iteration than the ML-EM.
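The break-even point can be checked with a short calculation. The sketch below assumes that the quoted times (2.8 sec and 19.7 sec) are per-iteration costs and that the 48 sec overhead is paid once per frame; the variable names are illustrative.

```python
import math

t_mesh = 2.8     # sec per MESH-EM iteration (assumed per-iteration cost)
t_pixel = 19.7   # sec per ML-EM iteration (assumed per-iteration cost)
overhead = 48.0  # sec to pre-compute the mesh-domain matrix, once per frame

# Overhead expressed in units of one ML-EM iteration.
overhead_in_pixel_iters = overhead / t_pixel

# Smallest iteration count k with overhead + k * t_mesh <= k * t_pixel.
break_even = overhead / (t_pixel - t_mesh)
iterations_needed = math.ceil(break_even)
```

Under these assumptions the overhead is recovered by the third iteration, in line with the statement in the text.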

3.5.5. Discussion.

These results, though very preliminary, indicate that the use of a CAMM in fully 3D

image reconstruction can achieve good image quality at low computational cost. We will

use more comprehensive evaluation metrics (e.g., bias-variance plots) to better characterize

the performance of the proposed technique in the future.

Our ultimate goal is to explore the use of a deformable mesh model for reconstruction

of image sequences, where the mesh structure in a CAMM is allowed to deform over

time; this is explored in Sect. 5.2.


Figure 34. Representative slices of frame #1 reconstructed from different methods: (a) original phantom, degraded by the intrinsic system blur, (b) FBP with post-filtering, (c) ML-EM, and (d) proposed MESH-EM. For consistency, the same post-filtering was also applied to the images in (c) and (d).


CHAPTER IV

4. DUAL MODALITY TOMOGRAPHIC IMAGE RECONSTRUCTION

4.1. Dual Modality Mesh Modeling

In this chapter we investigate a mesh-modeling approach for multi-modality image

reconstruction. In the proposed approach a mesh model uses information obtained from

an anatomical magnetic resonance (MR) image to aid in reconstruction of positron

emission tomography (PET) images. The aim is to improve spatial resolution and

quantitative accuracy of the PET image by using anatomical boundary information from

the MR image. The mesh approach accomplishes this by using spatially adaptive

sampling and smoothing in the PET reconstruction. Our preliminary results demonstrate

that this mesh-based approach to multi-modality PET reconstruction can achieve good

results at low computational cost.

4.2. Introduction.

In this chapter we investigate a mesh-modeling approach for multi-modality image

reconstruction. In particular, we consider the use of a mesh model to utilize information

obtained from an anatomical MR image to improve the reconstructed image quality from

PET data. Specifically, the goal is to improve spatial resolution and quantitative accuracy

of the PET images, while respecting the differences that may exist between anatomical

and functional image boundaries. Rather than imposing boundary information from the

MR image onto the PET image, which may risk introducing false boundaries in the

reconstruction, the proposed method uses spatially adaptive sampling and smoothing in

an effort to allow, but not enforce, the development of edges.


In our previous work, Sect. 3.1 and [90], a content-adaptive mesh modeling approach

was proposed for two-dimensional (2D) image reconstruction. It was demonstrated that

such an approach can outperform several well-known reconstruction algorithms in terms

of both reconstructed image quality and computation time.

As already stated, in a content-adaptive mesh model (CAMM) the mesh elements are

placed in a fashion that is adapted to the local content of the image. A mesh model of a

2D brain image [114] is shown in Figure 35.

Figure 35. Mesh structure (8,887 nodes) obtained from segmented MR image.

In the proposed approach, a CAMM is first established based on an anatomical MR

image represented on a fine pixel grid. This mesh model serves as the basis for a

customized basis representation of the image. The parameters of this image

representation are then estimated from the PET data.

Pixel-based methods have been proposed before for incorporating MR anatomical

priors to improve PET image reconstruction [26, 28, 38, 39, 115]. In [38], for example, a

prior distribution was explicitly defined to incorporate the anatomical data in a Bayesian


framework. Our proposed approach aims to achieve the same objective, but does so by

means of a content-adaptive mesh structure. In addition to the potential image-quality

advantage of the mesh approach, it provides a compact image representation (having

fewer unknowns), which can alleviate the underdetermined nature of the reconstruction

problem and the data storage requirement, and can also lead to fast computation.

4.3. Dual-Modality Mesh Generation

Here we consider the use of an MR image for improved reconstruction of PET images.

We hypothesize that this may allow better reconstruction of boundaries in the PET image

by utilizing, but not enforcing, MR boundary information. This is accomplished in part

by increasing the spatial sampling rate of the PET image near anatomical image

boundaries identified from the MR image (see Figure 36).

Figure 36. Sampling strategy. The main philosophy is to use fine sampling near anatomical boundaries to allow (but not force) functional images to have boundaries there as well.

To obtain the mesh structure from the MR brain image we employ the following

steps. First, a 256x256 MR image is segmented into three pixel types: grey matter (GM),

white matter (WM), and cerebrospinal fluid (CSF), by a procedure described in [114].

Second, these segmented regions are assigned different gray levels: 0 for CSF, 1 for WM,

2 for GM, and 1 for the region outside the brain, for reasons having to do with the mesh-generation algorithm used. Here, in addition to the procedure described in Sect. 2.1, we used a modified error-diffusion kernel, discussed in Appendix D. Third, this mesh-generation

algorithm is applied to place the mesh nodes automatically in such a way that they are

arranged densely along the anatomical boundaries, but with few nodes elsewhere. Finally,

additional mesh nodes are placed in the interior of all the brain regions, using a lower-

resolution (128x128) sampling pattern than the MR grid (256x256). This is in

consideration of the fact that the PET image is expected to contain functional variations

within image regions that are uniform in the segmented MR image. The resulting mesh is

shown in Figure 35.
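As a rough illustration of the node-placement idea, the following sketch marks a node wherever the assigned gray level changes between adjacent pixels and adds a coarse regular grid of nodes inside the brain. It is not the error-diffusion procedure of Sect. 2.1: simple label-change detection is substituted here for brevity, and the function name `place_nodes`, the `brain_mask` input, and the toy image are hypothetical.

```python
def place_nodes(gray, brain_mask, stride=2):
    """Return a sorted list of (row, col) node positions.

    gray: 2D list of gray levels assigned to the segmented MR image.
    brain_mask: 2D list of booleans, True inside the brain.
    stride: spacing of the coarse interior grid (2 corresponds to a
        128x128 pattern on a 256x256 MR grid).
    """
    rows, cols = len(gray), len(gray[0])
    nodes = set()
    for r in range(rows):
        for c in range(cols):
            # Boundary pixel: gray level differs from the right or lower neighbor.
            right = c + 1 < cols and gray[r][c] != gray[r][c + 1]
            below = r + 1 < rows and gray[r][c] != gray[r + 1][c]
            if right or below:
                nodes.add((r, c))
            # Otherwise, keep only a coarse regular sampling inside the brain.
            elif brain_mask[r][c] and r % stride == 0 and c % stride == 0:
                nodes.add((r, c))
    return sorted(nodes)


# Toy 4x4 "segmentation": GM (2) in the left half, WM (1) in the right half.
gray = [[2, 2, 1, 1] for _ in range(4)]
brain_mask = [[True] * 4 for _ in range(4)]
nodes = place_nodes(gray, brain_mask)
```

In this toy case the nodes form a dense column along the GM/WM boundary plus a sparse grid elsewhere, which mirrors the sampling strategy of Figure 36.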

The described mesh generation method is equivalent to the vector-valued function

representation under the $L_\infty$ norm, where for low-resolution PET data every pixel should

be represented.

4.4. Statistical Image Reconstruction

As already established, the mesh-based imaging model is described as:

$$E[\mathbf{g}] \approx \mathbf{A}\mathbf{n}. \tag{4.1}$$

The reconstruction problem becomes one of estimating the nodal values, $\mathbf{n}$, from the observed data in $\mathbf{g}$. Here we used a maximum a posteriori (MAP) estimate of the nodal values in $\mathbf{n}$, which is obtained as

$$\hat{\mathbf{n}} = \arg\max_{\mathbf{n}} \left[ \log p(\mathbf{g};\mathbf{n}) + \log p(\mathbf{n}) \right], \tag{4.2}$$


where $p(\mathbf{g};\mathbf{n})$ is the likelihood function of $\mathbf{g}$ parameterized by $\mathbf{n}$, and $p(\mathbf{n})$ is a prior on the unknown nodal values. We assume a Poisson likelihood, which is characteristic of emission tomography. The prior $p(\mathbf{n})$ is described by the Gibbs distribution [28], i.e.,

$$p(\mathbf{n}) \sim \exp\left(-\beta U(\mathbf{n})\right), \tag{4.3}$$

where $\beta$ is a scalar weighting parameter, and $U(\mathbf{n})$ is the energy sum of individual nodal values:

$$U(\mathbf{n}) = \sum_{n=1}^{N} \sum_{j \in \Re_n} w_{n,j} \left(n_n - n_j\right)^2. \tag{4.4}$$

In (4.4), $\Re_n$ denotes the set of nodes connected to the $n$th node, and $w_{n,j}$ are weighting factors that can be chosen to be adaptive to the mesh structure.

The MAP estimate in (4.2) can be computed by using the following one-step-late expectation-maximization (OSL-EM) algorithm [24, 30]:

$$n_s^{new} = \frac{n_s^{old}}{\sum_t A_{ts} + \beta \left. \frac{\partial U(\mathbf{n})}{\partial n_s} \right|_{\mathbf{n} = \mathbf{n}^{old}}} \sum_t \frac{A_{ts}\, g_t}{\sum_k A_{tk}\, n_k^{old}}, \tag{4.5}$$

where $n_s^{old}$ is the value of node $s$ from the previous iteration, $g_t$ is the recorded count for observation $t$, and $A_{ts}$ is the $(t,s)$ entry of matrix $\mathbf{A}$.
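To make the OSL-EM update concrete, here is a minimal sketch under stated assumptions: the weights are stored in a symmetric matrix `W` with zeros for unconnected node pairs, so that the derivative of the quadratic energy (4.4) with respect to $n_s$ is $4\sum_j w_{s,j}(n_s - n_j)$ (each connected pair appears twice in the double sum). The function names and the toy data are hypothetical, and the guard on the denominator is an addition for safety rather than part of the published algorithm.

```python
def gibbs_gradient(n, W):
    """Gradient of U(n) = sum_n sum_j W[n][j] * (n_n - n_j)^2, W symmetric."""
    N = len(n)
    return [4.0 * sum(W[s][j] * (n[s] - n[j]) for j in range(N))
            for s in range(N)]


def osl_em(A, g, W, beta, n_iter=50):
    """One-step-late EM for a Poisson likelihood with a quadratic Gibbs prior."""
    T, N = len(A), len(A[0])
    sens = [sum(A[t][s] for t in range(T)) for s in range(N)]
    n = [1.0] * N
    for _ in range(n_iter):
        proj = [sum(A[t][k] * n[k] for k in range(N)) for t in range(T)]
        ratio = [g[t] / proj[t] if proj[t] > 0 else 0.0 for t in range(T)]
        # Prior gradient evaluated at the old estimate ("one step late").
        grad = gibbs_gradient(n, W)
        new_n = []
        for s in range(N):
            denom = sens[s] + beta * grad[s]
            if denom <= 0.0:
                raise ValueError("beta too large: OSL denominator is not positive")
            back = sum(A[t][s] * ratio[t] for t in range(T))
            new_n.append(n[s] / denom * back)
        n = new_n
    return n


# Toy example (hypothetical): two connected nodes, noise-free data.
A = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
g = [4.0, 3.0, 2.0]
W = [[0.0, 1.0], [1.0, 0.0]]
n_ml = osl_em(A, g, W, beta=0.0, n_iter=200)    # reduces to ML-EM
n_map = osl_em(A, g, W, beta=0.05, n_iter=200)  # prior pulls the nodes together
```

With `beta=0.0` the update is the plain ML-EM iteration; a positive `beta` reduces the difference between connected nodal values.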

4.5. Experimental Results

4.5.1. Evaluation Data.

In our experiment a single slice from an anatomical MRI brain scan was used. The

image consists of 256x256 pixels of dimension 1 mm (Figure 37 (left)). The MR image

was segmented into three pixel types: grey matter (GM), white matter (WM), and

cerebrospinal fluid (CSF) by a procedure described in [114].


The segmented MR image was then used to generate a phantom for PET simulation

using relative activity levels of 4:1:0 for GM, WM, and CSF, respectively. The resulting

phantom is shown in Figure 37 (right). A 128x128 sinogram was simulated, at a level of

1M counts, using an intrinsic resolution of 3 mm full width at half-maximum (FWHM).

In addition, a mesh structure (shown in Figure 35) was obtained using the procedure

described in Sect. 2.1 from the segmented MR image. This mesh structure was then used

to reconstruct the PET images.

4.5.2. Reconstruction Methods Considered.

The proposed method was studied with the following choices for the prior: 1) no prior assumed (MESH ML); 2) a quadratic Gibbs prior with $w_{n,j} = 1$ in (4.4) (MESH MAP); 3) a quadratic Gibbs prior with $w_{n,j}$ set to 1/5 when nodes $n_j$ and $n_n$ are closer than 2 mm, and 1 otherwise (MESH MAP-W).
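The distance-dependent weighting of the MESH MAP-W prior can be sketched as below. The function name `prior_weights`, the edge-list representation of the mesh connectivity, and the toy coordinates (in mm) are assumptions made for illustration; only the 1/5 and 2 mm values come from the text.

```python
import math


def prior_weights(positions, edges, near_mm=2.0, near_weight=0.2):
    """Symmetric weight matrix for the quadratic Gibbs prior.

    positions: list of (x, y) node coordinates in mm.
    edges: list of (a, b) index pairs of connected nodes.
    Connected nodes closer than near_mm get near_weight; other connected
    nodes get 1; unconnected pairs get 0.
    """
    N = len(positions)
    W = [[0.0] * N for _ in range(N)]
    for a, b in edges:
        dist = math.dist(positions[a], positions[b])
        w = near_weight if dist < near_mm else 1.0
        W[a][b] = W[b][a] = w
    return W


# Toy mesh: nodes 0 and 1 are 1 mm apart, nodes 1 and 2 are 4 mm apart.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
edges = [(0, 1), (1, 2)]
W = prior_weights(positions, edges)
```

Since the mesh nodes are packed densely along anatomical boundaries, this choice weakens the smoothing penalty there, which is consistent with the goal of allowing, but not enforcing, edges.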

For comparison, the following pixel-based methods were also considered: 1) a

maximum-likelihood EM algorithm (Pixel ML); 2) OSL-MAP reconstruction with a

quadratic Gibbs prior (Pixel MAP).

Post-filtering was applied to both the pixel-based ML and mesh-based ML methods.

All the iterative reconstruction algorithms were run for 50 iterations.

4.5.3. Evaluation Criteria.

To evaluate the performance of the proposed algorithms, bias-variance curves were

computed for both the WM and GM regions in a selected region of interest of the

phantom, indicated in white in Figure 37(right).


4.5.4. Simulation Results.

For visual comparison, we show in Figure 38 some images obtained from the

reconstruction methods tested. The MESH MAP-W method (i.e., spatially-varying prior)

appears to produce slightly better images than the other methods.

Figure 37. Left: original MRI image; Right: simulated PET phantom generated from the segmented MRI image. The highlighted region of interest (ROI) will be used in the evaluation procedure.

To quantify the results, we define the bias and variance of the ROI mean estimate as:

$$B_i = E[\bar{f}] - \mu_i, \qquad V_i = E\left[\left(\bar{f} - E[\bar{f}]\right)^2\right], \qquad i = 1, 2, \tag{4.6}$$

where $E[\cdot]$ denotes expectation over noisy realizations, $\bar{f}$ is the estimated average activity over the ROI, and $\mu_i$ is the true ROI mean activity. If $i = 1$, the ROI in question is gray matter; otherwise it is white matter.
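In practice the expectations in (4.6) are approximated by averages over a finite set of noise realizations. A minimal sketch of that computation follows; the function name `roi_bias_variance` and the toy numbers are illustrative, and the sample average is used in place of the true expectation.

```python
def roi_bias_variance(roi_means, true_mean):
    """Estimate bias and variance of the ROI mean from noise realizations.

    roi_means: list with one estimated ROI mean per noise realization.
    true_mean: true mean activity in the ROI.
    """
    R = len(roi_means)
    expected = sum(roi_means) / R                       # approximates E[f]
    bias = expected - true_mean                         # B_i in (4.6)
    variance = sum((m - expected) ** 2 for m in roi_means) / R  # V_i in (4.6)
    return bias, variance


# Toy example: two realizations of an ROI whose true mean is 4.5.
bias, variance = roi_bias_variance([3.0, 5.0], true_mean=4.5)
```

Repeating this for several settings of a smoothing parameter (for example $\beta$ or the number of iterations) traces out one bias-variance curve per method.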

Bias-variance curves are shown in Figure 39 for all the methods considered. These

results indicate that the MESH MAP-W method offers the most accurate quantitative

results over a relatively large operating range.


The execution time of the proposed algorithms was similar to our previously reported

results [90]. The mesh-based methods were approximately an order of magnitude

faster than the pixel-based ones.

4.6. Discussion

The preliminary results obtained in this study suggest that the proposed mesh

approach is a feasible method for using an anatomical (MR) image to assist in

reconstructing a functional (PET) image. In future work we will refine the mesh

generation procedure so that the mesh structure is generated jointly from the MR image

and a preliminary reconstruction of the PET image. By this procedure it may be possible

to further improve the image quality, including the resolution recovery.


[Images omitted. Panels: Original, Pixel MAP, MESH MAP, MESH MAP-W.]

Figure 38. Images obtained by three reconstruction procedures. From left to right: original phantom blurred to intrinsic scanner resolution; pixel MAP reconstruction (EM-OSL/quadratic); MESH MAP reconstruction (EM-OSL/quadratic); MESH MAP-W (EM-OSL/W quadratic) mesh-based reconstruction with varying smoothing parameters. In all algorithms the weighting parameter $\beta$ in (4.5) was set to 0.005. The MESH MAP-W image appears to exhibit a degree of resolution recovery not seen in the other images.


[Plots omitted. Two panels, "Gray matter Bias vs. Variance plot" and "White matter Bias vs. Variance plot"; horizontal axis: bias; vertical axis: variance; curves for MESH MAP, MESH ML, MESH MAP-W, Pixel MAP, and Pixel ML.]

Figure 39. Bias-variance curves, obtained from 56 noise realizations, for the methods considered. The MESH MAP-W algorithm produces the lowest curve for both grey- and white-matter regions, indicating that it produces the best quantitative performance in this preliminary study.


CHAPTER V

5. SPATIAL-TEMPORAL IMAGE SEQUENCE PROCESSING

Due to their relatively high noise level, gated cardiac SPECT perfusion studies can

potentially benefit from appropriate 4D image processing (3D space + time). The basic

idea of the approach is to incorporate temporal smoothing to exploit statistical

correlations among the desired signal components of different image frames in a

sequence. In this chapter we first give an overview of relevant work. Then, in Sects. 5.2 and 5.3, we explore the use of a deformable CAMM for 4D processing of gated cardiac SPECT perfusion images, specifically for motion-compensated post-filtering and motion-compensated reconstruction. Finally, in Sect. 5.4 we propose a new temporal clustering

method that is based on similarity measurement, which can be developed further for 4D

data processing.

5.1. Introduction

Time sequences of images, as compared with static images, are typically obtained at

the expense of signal-to-noise ratio; thus, noise reduction is particularly important for

image sequences. The problem of noise in image sequences can be addressed by applying

spatio-temporal image-processing techniques, which can have significant advantages in

performance over methods that process each image frame separately [62]. In the medical-

imaging field, spatio-temporal processing has become popularly known as four-

dimensional (4D) processing to reflect the use of three spatial dimensions plus time.

Therefore, we will use the terms “4D” and “spatio-temporal” interchangeably, although

the preliminary studies shown in this chapter are based on a single slice of an image

sequence, so that we have only two spatial dimensions plus time.


4D processing is an example of multichannel image recovery, reviewed in a recent

book chapter [116]. The basic principle of multichannel image processing is to exploit the

statistical correlations between the desired signal components of different image frames

in a sequence, or other collection of related images, for purposes of noise reduction.

Methods of 4D processing have received increasing interest lately in the medical-

imaging field. Our group has proposed several 4D methods designed for reconstruction of

motion-free images, such as those obtained in dynamic PET studies (e.g., [67], [60]).

These methods have also been shown to work well for gated cardiac SPECT ([117]). Lalush

and Tsui (e.g., [56]) have applied 4D image reconstruction to cardiac SPECT images, but

have not incorporated motion estimation explicitly in their techniques. Asma et al. (e.g.,

[118]) have proposed a method of 4D PET reconstruction from list-mode data.

In the broader image-processing field, motion-compensated processing is a well-

known approach to reduce noise in image sequences [62]. In the nuclear medicine field,

Klein et al. (e.g., [63]) developed a motion-compensated summing method using motion

estimation, based on the optical-flow method (e.g., Vega-Riveros and Jabbour [64]), for

obtaining a single image from a gated PET study.

In this work we propose a motion-compensated spatio-temporal filtering method for

noise reduction of image sequences. Our intended application is gated cardiac SPECT

perfusion imaging. In the proposed method, the displacement-vector field describing

motion from frame to frame in the image sequence is described using nonuniform

sampling within a deformable mesh structure.

Figure 41 shows an example of a mesh structure, which is overlaid on an intensity

image for illustration purposes.


In our application, the purpose of estimating the displacement-vector fields for an

image sequence is to motion-compensate the images prior to application of a separable

spatio-temporal noise-reduction filter. The combination of motion compensation and

separable spatio-temporal smoothing is equivalent to spatio-temporal smoothing along

curved trajectories that follow the motion of the object through space-time (Figure 42).

As we will show, this helps avoid the blurring of motion that results from smoothing along

the rectilinear time axis of an image sequence.

In the proposed method, we use a deformable mesh to track image motion. A

deformable mesh, which warps from frame to frame, is well suited to tracking non-rigid

elastic motion, such as that of the heart. Because it considers the motion in terms of

polygonal regions of the image, it potentially offers the additional advantage of being less

sensitive to noise than traditional pixel-based motion-estimation procedures.

In the work described here we are using a deformable mesh as a model of the

displacement-vector field, not as a model of the image intensity function.

Before describing the details of the proposed algorithm, we first review the concept of

deformable mesh modeling in the following section.

5.2. Deformable Mesh Modeling Approach

In this work we use a deformable mesh model as an alternative to conventional pixel-

based methods of motion estimation, such as optical-flow and pel-recursive methods

[62]. To begin the discussion of deformable mesh modeling, let us once more define the mesh representation in general terms. Note here that the function to be interpolated is expressed as a union of mesh elements rather than a sum of the interpolation basis functions associated with the nodes.


5.2.1. Mesh Model of a Function.

Let $f(\mathbf{p})$ be a function (vector-valued, in the case of our displacement-vector modeling) of spatial coordinates $\mathbf{p}$, defined on domain $D$. In a mesh model, the domain is partitioned into $M$ non-overlapping polygonal regions $D_m$, $m = 1, \ldots, M$, called mesh elements, the vertices of which are called nodes. Each node, indexed by $n$, has a location $\mathbf{p}_n$ and an associated value $f_n$ called the nodal value. A mesh model is a representation of the function $f(\mathbf{p})$ in terms of basis functions associated with the nodes. Within each mesh element (for $\mathbf{p} \in D_m$), the function is represented as:

$$f(\mathbf{p}) = \sum_{n=1}^{N_1} f_n\, \phi_{m,n}(\mathbf{p}), \tag{5.1}$$

where $\phi_{m,n}(\mathbf{p})$ is the basis function contributing from node $n$ to element $D_m$, and $N_1$ is the number of nodes that are the vertices of $D_m$. In our experiments, we chose the mesh elements to be triangular, therefore $N_1 = 3$, and we chose the basis functions to be linear.
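For a triangular element with linear basis functions, the three basis-function values at a point are its barycentric coordinates with respect to the triangle's vertices. The following minimal sketch evaluates (5.1) for one element; the function names and the toy triangle are illustrative. For a vector-valued function such as the displacement field, the same weights would be applied to each component.

```python
def barycentric(p, a, b, c):
    """Linear basis-function values at point p for triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, vertices, nodal_values):
    """Evaluate the mesh model inside one element as a weighted sum of nodal values."""
    weights = barycentric(p, *vertices)
    return sum(w * f for w, f in zip(weights, nodal_values))


# Toy element: right triangle with nodal values 1, 2, and 3.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vals = [1.0, 2.0, 3.0]
at_vertex = interpolate((0.0, 0.0), tri, vals)            # reproduces the nodal value
at_centroid = interpolate((1.0 / 3.0, 1.0 / 3.0), tri, vals)  # average of the three values
```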

5.2.2. Mesh Representation of the Displacement-Vector Field.

The aim of motion estimation is to determine the displacement-vector field (DVF) $\mathbf{d}^{(k,l)}(\mathbf{p})$, i.e., the relative displacement between any point $\mathbf{p}$ in the current frame $k$ and its corresponding position in some other target frame $l$. The DVF, if accurate, should allow the target frame to be well represented by a motion-compensated version of the current frame, i.e., $f_k\left(\mathbf{p} + \mathbf{d}^{(k,l)}(\mathbf{p})\right)$ should be a good match to $f_l(\mathbf{p})$.

In the proposed method, the motion estimation is accomplished by starting with a mesh structure for the current frame $k$ and deforming it to determine a mesh structure for target frame $l$. Let $P_k = \{\mathbf{p}_1^{(k)}, \ldots, \mathbf{p}_N^{(k)}\}$ be the set of locations of the $N$ nodes that constitute the mesh for frame $k$, and let $P_l = \{\mathbf{p}_1^{(l)}, \ldots, \mathbf{p}_N^{(l)}\}$ be the set of corresponding nodal positions for frame $l$. The nodal values of the mesh representation of the DVF are given by $\mathbf{d}_n^{(k,l)} = \mathbf{p}_n^{(l)} - \mathbf{p}_n^{(k)}$, $n = 1, \ldots, N$. This vector represents the displacement of node $n$ from frame $k$ to frame $l$. From these nodal values, the DVF can be obtained for any point $\mathbf{p}$ within mesh element $D_m$ as follows:

$$\mathbf{d}^{(k,l)}(\mathbf{p}; P_k, P_l) = \sum_{n=1}^{N_1} \mathbf{d}_n^{(k,l)}\, \phi_{m,n}^{(k)}(\mathbf{p}), \qquad \mathbf{p} \in D_m. \tag{5.2}$$

Note that on the left-hand side of (5.2) we have made explicit the dependence of this DVF representation on the nodal positions $P_k$ and $P_l$ in the current and target frames, respectively. Also note that, on the right-hand side, the basis functions now have an index $k$ to indicate that the basis functions depend on the mesh structure of the current frame. Examples of the nodal values $\mathbf{d}_n^{(k,l)}$ and of a regular sampling of the DVF $\mathbf{d}^{(k,l)}(\mathbf{p})$ are shown in the top and bottom rows of Figure 43, respectively.

5.2.3. Deformation of The Mesh.

In this section, we describe a method of estimating a DVF, by deforming the mesh

from the current frame to the target frame. Specifically, given the nodal positions $P_k$ for frame $k$, the goal is to determine their corresponding locations $P_l$ in frame $l$. We accomplish this by using a variation of an algorithm proposed by Wang and Lee [6] (see Figure 40).


Figure 40. Inter frame displacement estimation using deformable mesh model.

As we mentioned previously, the DVF should be such that a motion-compensated

version of the current frame is a good match to the target frame. Thus, deformation of the

mesh from the current frame to the target frame is achieved by minimizing a penalized

matching criterion of the following form to find the target-frame nodal positions $P_l$:

$$J(P_l) = \tfrac{1}{2}\, w\, E_1(P_l) + (1 - w)\, E_2(P_l), \tag{5.3}$$

in which $w$ is a parameter controlling the trade-off between matching error, measured by $E_1$, and mesh regularity, measured by deformation energy $E_2$. The matching error in (5.3) is given by

$$E_1(P_l) = \sum_{m=1}^{M} \int_{D_m} \left\{ f_k\left(\mathbf{p} + \mathbf{d}^{(k,l)}(\mathbf{p}; P_k, P_l)\right) - f_l(\mathbf{p}) \right\}^2 d\mathbf{p}, \tag{5.4}$$

in which the mesh representation $\mathbf{d}^{(k,l)}(\mathbf{p}; P_k, P_l)$ of the DVF is defined as in (5.2). The

matching error is small when the target-frame nodal positions provide a good match

between the target frame and the motion-compensated current frame. The deformation

energy in (5.3) is given by

$$E_2 = \sum_{n=1}^{N} \left( \sum_{j \in J_n} \left( \mathbf{p}_n^{(l)} - \mathbf{p}_j^{(l)} \right) \right)^2, \tag{5.5}$$

where $J_n$ is the set of immediate neighbors of node $n$. This deformation energy serves

as a penalty term to prevent the mesh from becoming overly deformed and thus

overfitting the DVF. To understand how this term quantifies mesh regularity, it is helpful

to think of the inner summation as analogous to the net force applied by a collection of

springs connecting node n to its neighbors. If the neighbors are uniformly spaced, then

the “net force” on node n will be small because the difference vectors (and thus the

“forces”) will tend to cancel.

In our experiments, we minimized (5.3) by a gradient-descent method with a fixed step

size; however, it may be preferable to use a line search to identify the optimal step size at

each iteration.
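
To illustrate the difference between the two step-size strategies, the sketch below compares fixed-step gradient descent with a simple backtracking (Armijo) line search on a generic objective. It is not the mesh criterion itself: the objective, the function names, and the parameters `step0`, `shrink` and `c` are assumptions made for illustration.

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, n_iter=100):
    """Gradient descent with a fixed step size."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        x = x - step * grad(x)
    return x

def backtracking_descent(f, grad, x0, step0=1.0, shrink=0.5, c=1e-4, n_iter=100):
    """Gradient descent where each step is chosen by a backtracking line search."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        g = grad(x)
        step = step0
        # Shrink the step until the Armijo sufficient-decrease condition holds
        while f(x - step * g) > f(x) - c * step * (g @ g) and step > 1e-12:
            step *= shrink
        x = x - step * g
    return x
```

A line search costs extra objective evaluations per iteration, but it removes the need to hand-tune a single step size for the whole sequence of deformations.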

5.2.4. Mesh Generation.

One final building block remains to be explained before proceeding to describe the

steps of the algorithm. In the previous section, we assumed that the current frame has an

associated mesh structure for the DVF, which will be deformed to find the mesh structure

for the target frame. To initialize the algorithm, we need a mesh from which to start. We

obtain this initial mesh using an image reconstructed from a sum of all the time frames of

projection data. The purpose of this is to obtain a low-noise image for obtaining a good

initial mesh. We assume that the structure of this image-intensity mesh is also a good

structure for representing the DVF. The basis of this assumption is that the features that

are important for representing the image intensity of an object (i.e., edges) are also

locations where object motion is most predominant.

For this task we employ the mesh generation procedure described in Sect. 2.1. Images

corresponding to the three generation steps are presented in Figure 44.

5.2.5. Modeling the Displacement-Vector Field.

In the previous sections we described a method of generating an initial mesh from an

intensity image and a procedure for determining the DVF by deforming the mesh of one

frame to match another. Now we describe an algorithm for applying these steps to obtain

a displacement-vector field for every image frame.

We begin by summing all the projection data (in our example, we have 16 frames of

data), then reconstructing the summed projection sequence to form an image. As

explained before, we next use this image to obtain an initial mesh representation by

applying the mesh-generation scheme described in Sect. 5.2.4. This initial mesh serves as

an approximate starting point for the algorithm.

Next, we obtain mesh structures and associated displacement-vector nodal values

$\mathbf{d}_n^{(k,l)}$ for all the other frames by sequentially deforming the mesh from one frame to

another by the method described in Sect. 5.2.3. Each deformation step produces a mesh

structure for the next (target) frame along with the displacement-vector nodal values

$\mathbf{d}_n^{(k,l)}$ for the current frame. Examples of these displacement-vector representations,

shown in Figure 43 (top row), are overlaid on the corresponding intensity images to help

visualize the spatial relationships.

Letting $k \to l$ denote the step of deforming the mesh for frame k to match image

frame l, the sequence of deformation steps is described as follows. First, we progress

forward in the sequence from frame 1 (end diastole) to frame 8 (end systole) as follows:

$1 \to 2$, $2 \to 3$, ..., $7 \to 8$. Then, because a gated SPECT sequence is a loop (the first

and last frames, 1 and 16, are actually immediate neighbors), we work backwards from

frame 1 to complete the sequence of mesh-modeled DVFs as follows: $1 \to 16$, $16 \to 15$,

$15 \to 14$, ..., $10 \to 9$. The purpose of working backwards for half of the sequence is to

avoid overly long sequences of deformations, which may produce cumulative errors. Such

an approach is standard in techniques like MPEG compression, which uses motion

compensation.
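
The ordering of deformation steps described above can be written down compactly. The following sketch generates the forward and backward steps for a looped sequence; the function name is hypothetical, and frames are numbered from 1 as in the text.

```python
def deformation_schedule(num_frames=16):
    """Return the list of (current, target) deformation steps for a gated loop."""
    half = num_frames // 2
    # Forward half: 1->2, 2->3, ..., (half-1)->half
    forward = [(k, k + 1) for k in range(1, half)]
    # Backward half: 1->num_frames, then num_frames->num_frames-1, ..., (half+2)->(half+1)
    backward = [(1, num_frames)]
    backward += [(k, k - 1) for k in range(num_frames, half + 1, -1)]
    return forward + backward

schedule = deformation_schedule(16)
print(schedule)
```

Every frame other than frame 1 appears exactly once as a target, and no frame is more than eight deformation steps away from the initial mesh.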

In our implementation, an additional step was required to eliminate a transient

response of the optimization process. We found that the initial deformation steps were

dominated by the deformation-energy term in (5.3), thus too little weight was given to the

matching term. To remedy this, we began the procedure by first working part of the way

through the sequence ($1 \to 2$, $2 \to 3$, $3 \to 4$) before starting again from the beginning.

This produced a more-regular starting mesh, and allowed the matching term to operate

more effectively. In the first few frames there is little motion, so this approach worked

well; however, in other types of sequences, a better way to eliminate this transient

response might be preferable.

Having obtained a mesh structure and associated node displacement vectors for every

image in the sequence, we then convert the mesh-modeled DVF (Figure 43, top row) into

a pixel-based DVF (Figure 43, bottom row) by using (5.2). Note that the interpolation

basis of the mesh representation, along with the regularity constraint in the matching

criterion, led to smooth DVFs in the pixel domain.

5.2.6. Motion-Compensated Spatio-Temporal Filtering.

The DVFs derived as proposed above can be used to achieve spatio-temporal

smoothing that follows the curved trajectories of object points in space-time. Motion

compensation permits this otherwise complicated filtering to be done as a simple

operation that is separable into spatial and temporal parts.

In our experiments the spatial smoothing was achieved by a spatial lowpass filter,

which was an order-5 Butterworth filter with cutoff frequency of 0.6 cycles/pixel. The

temporal filtering was implemented for each pixel $\mathbf{p}$ in frame k by circular convolution

with a finite-impulse-response (FIR) motion-compensated filter as follows:

$$\hat{f}_k(\mathbf{p}) = \sum_{l=1}^{K} h_{k-l}\, f_l\!\left(\mathbf{p} - \mathbf{d}^{(k,l)}(\mathbf{p})\right), \tag{5.6}$$

where the coordinates $\mathbf{p}$ are pixel locations and the coefficients $h_j$ of the temporal FIR filter

are given by

$$h_j = \frac{1}{C}\left(1 - \frac{2\gamma |j|}{K}\right), \qquad j = -K/2, \ldots, K/2. \tag{5.7}$$

In (5.6), the shift of the filter should be taken in a circular sense, since the gated

image sequence is a loop. In (5.7), $\gamma$ controls the degree of temporal smoothing and C is

a normalization constant defined so that the filter has unity DC response. Better-optimized

filters will be considered in future studies.
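
The temporal filter coefficients and the circular shift can be sketched as follows. This example deliberately omits the motion-compensation term, so it corresponds to plain temporal smoothing along the frame axis; the function names and the `(K, rows, cols)` array layout are assumptions made for illustration.

```python
import numpy as np

def temporal_fir(K, gamma):
    """Triangular FIR coefficients h_j, j = -K/2..K/2, normalized to unity DC gain."""
    j = np.arange(-(K // 2), K // 2 + 1)
    h = 1.0 - 2.0 * gamma * np.abs(j) / K
    return j, h / h.sum()          # dividing by the sum plays the role of 1/C

def circular_temporal_filter(frames, gamma):
    """Circular convolution along the frame axis (no motion compensation)."""
    K = frames.shape[0]
    j, h = temporal_fir(K, gamma)
    out = np.zeros_like(frames, dtype=float)
    for shift, weight in zip(j, h):
        # np.roll(frames, shift)[k] equals frames[(k - shift) mod K]
        out += weight * np.roll(frames, shift, axis=0)
    return out
```

In the motion-compensated version each shifted frame would first be resampled along the displacement-vector field before being weighted and summed. Note that for even K the taps at j = -K/2 and j = K/2 address the same frame of the loop.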


Data set

The algorithm was tested using the 4D gated mathematical cardiac-torso (gMCAT)

D1.01 phantom [91], and an imaging model incorporating depth-dependent blur, to

simulate gated SPECT perfusion imaging with Tc99m. The field of view was 36 cm; the

pixel size was 5.625 mm. Poisson noise was introduced, at a level of 4 million total

counts for the entire sequence. In our experiments we used a single slice of the phantom

(slice 70), which was assigned a Poisson mean of approximately $5.5 \times 10^4$

counts per time frame (for 16 frames). Attenuation was not included in the simulations.
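
A common way to introduce Poisson noise at a prescribed count level is to scale the noise-free projections so that their sum equals the desired mean count and then draw Poisson samples. The sketch below follows that recipe; the function name, the seed, and the uniform test array are hypothetical, and the actual simulation may have differed in detail.

```python
import numpy as np

def add_poisson_noise(noise_free, total_counts, seed=0):
    """Scale noise-free data to a target mean count, then draw Poisson samples."""
    rng = np.random.default_rng(seed)
    scaled = noise_free * (total_counts / noise_free.sum())
    return rng.poisson(scaled).astype(float)

# Hypothetical noise-free projection data for one slice and one time frame
clean = np.ones((64, 60))
noisy = add_poisson_noise(clean, 5.5e4)
print(noisy.sum())
```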

Initial mesh generation

The initial mesh structure was computed using the method described in Sect. 5.2.4. A

total of 389 mesh nodes were used to form the initial mesh, which is shown in Figure 41.

This is about one-tenth of the number of pixels typically used to represent the same image.

Results

In this section we present results obtained by post-processing of images reconstructed

using the maximum-likelihood expectation-maximization (MLEM) algorithm [21]. For

comparison, the following methods were considered: (1) spatial-only filtering (“Spatial”),

in which an order-5 Butterworth filter with a cutoff frequency of 0.25 cycles/pixel was

applied to the reconstructed images; (2) the proposed motion-compensated spatio-

temporal processing method (“MC-ST”) with the same spatial filtering as “Spatial”,

except for the cutoff frequency which was 0.6 cycles/pixel, plus temporal filtering with

$\gamma = 0.5$; (3) the same spatial and temporal filters as MC-ST, except that motion

compensation was omitted (“ST”); and (4) “Ideal MC-ST”, in which we obtained the

mesh structures and DVFs from the original phantom.

The purpose of evaluating the ST method was to demonstrate that, while temporal

smoothing without motion compensation reduces noise, it can yield a dramatic

degradation of the representation of cardiac motion. The purpose of considering the Ideal

MC-ST method was twofold: first, it provides a best-case scenario for the algorithm, and

second, it gives an indication of the quality of the results that might be obtainable from a

dual-modality imaging system, by using a mesh obtained from the anatomical image

(CT) to smooth the functional image (PET or SPECT). We will revisit this dual-modality

approach in future work.

In Figure 45 we present some image results for visual evaluation. The images labeled

“Original” are taken from the phantom degraded only by the system blur to represent an

approximate best case image for comparison. The image results suggest that ST, MC-ST,

and Ideal MC-ST can significantly reduce the noise level in the images. However, the

images from the ST method (produced without the benefit of motion compensation)

suffer from significant motion distortion. This is evident when viewing the images as a

cine loop (available at http://www.iit.edu/~branjov/3D01.htm), but it can also be

measured quantitatively.

In Figure 46 we plot time-activity curves (TACs) for a small region in the left

ventricular wall vs. the frame number, for images obtained by the four methods. The

TAC obtained from spatial-only smoothing (Spatial) is very noisy, as is the visual

appearance of the cine display of the sequence. The curve for the ST method is relatively

featureless, corresponding to the deadened appearance of cardiac motion in the ST cine

display. The curves for MC-ST and Ideal MC-ST, though not perfect, are good, and

correspond to the relatively faithful representation of motion evident in their cine displays

when compared with the Spatial and ST sequences.

5.2.8. Discussion.

The studies presented here show preliminary evidence that motion-compensated

spatio-temporal filtering can be a useful tool for post-processing of image sequences

exhibiting motion. More evaluation studies will be required to validate the algorithm

conclusively. Specifically, we plan to evaluate quantitatively the effect of the algorithms

on ejection fraction measurements, perfusion-defect detection, and apparent wall motion.

It appears from the results of this study that failure to compensate for motion (in the ST

method) reduces frame-to-frame variation in the left ventricular volume, which we expect

will distort measurements of ejection fraction [2]. We will test this hypothesis

quantitatively in future work.

The artificial experiment (Ideal MC-ST), in which we obtained the DVFs from the

smoothed original phantom, suggested that an approach based on dual-modality imaging

might prove effective in ensuring an accurate representation of motion. The Ideal MC-ST

method is crudely suggestive of a realistic situation in which the DVFs could be

calculated from a CT image sequence, then used to smooth a SPECT or PET image

sequence. We will pursue this approach further in future work.

Figure 41. Mesh structure for representing the displacement-vector field describing motion within a slice of torso, including the heart. The mesh structure is overlaid on the corresponding intensity image for visualization purposes. In this mesh structure there are 389 mesh nodes.

Figure 42. As the mesh deforms from frame to frame, the nodes trace out curved trajectories through the space-time coordinate system. In the proposed method, spatio-temporal smoothing follows these curved trajectories to avoid introducing motion blur.

Figure 43. Example displacement-vector fields in the vicinity of the heart, showing displacements from frame 9 to frame 8 ($9 \to 8$, left column) and from frame 4 to frame 8 ($4 \to 8$, right column). The displacement-vector fields in the top row are found by mesh modeling, yielding only vectors at the nodal positions. The pixel-domain motion vectors in the bottom row are obtained by transforming the mesh-domain motion vectors using equation (5.2). Note that the motion vectors for $4 \to 8$ are shown to scale; however, those for $9 \to 8$ have been lengthened by a factor of 2.7 for visibility in the illustration. There are some nonzero motion vectors outside the heart because the mesh regularity condition imposes smoothness on the displacement-vector field.

Figure 44. Steps of mesh-generation method: (a) feature map determines required spatial density of mesh nodes; (b) mesh node locations are determined by halftoning the feature map in (a); and (c) a mesh structure is obtained by connecting the mesh nodes in (b) into a graph by Delaunay triangulation.


Figure 45. Frames from image sequences processed by various methods. “Original” denotes the phantom degraded by the system blur to represent an approximate best-case image for comparison. “Spatial” denotes spatial smoothing only; “ST” denotes spatio-temporal smoothing without motion compensation; “MC-ST” denotes the proposed spatio-temporal smoothing with motion compensation achieved using motion estimation obtained from noisy data; “Ideal MC-ST” is the same as “MC-ST”, except it is based on motion estimation computed from the original phantom images. All the images obtained using spatio-temporal smoothing look essentially the same when viewed as still frames; however, it is evident in a cine display that failure to motion-compensate (ST method) dramatically deadens the motion of the heart. Spatial-only smoothing (Spatial), which is the standard clinical approach, produces inferior results in terms of both image quality and motion fidelity.

Figure 46. Time activity curves (TACs) for a small region in the left ventricular wall vs. the frame number. Note that spatio-temporal smoothing without motion compensation (ST) has a relatively featureless TAC, explaining the deadened appearance of cardiac motion when viewing the ST sequence as a cine. When using spatial smoothing alone (Spatial), the magnitude of the remaining noise makes for poor image quality, and is distracting when viewed as a cine. Motion-compensated spatio-temporal smoothing, whether achieved using a mesh obtained from noisy data (MC-ST) or from the original phantom (Ideal MC-ST), allows motion to be captured more accurately, particularly at frames 8 and 9.


5.3. Deformable Mesh Modeling for Partial Volume Affected Myocardium

In this section, we propose a variant of our previous deformable mesh model for

myocardial motion tracking presented in Sect. 5.2. Based on the new model, we propose

two reconstruction approaches. In the first, an inter-iteration mesh-based temporal filter is

embedded in an iterative reconstruction. In the second, a pixel-based motion-

compensated temporal filter is applied post-reconstruction, followed by a spatial filter.

Both methods are designed to preserve brightening of the myocardium as it thickens, a

partial volume effect in SPECT that is used in clinical practice as a diagnostic feature.

In the reconstruction algorithm proposed in this section, we represent the images and

account for motion in a gated SPECT sequence by way of a content-adaptive mesh model

(CAMM) (Figure 42), which is allowed to deform over time.

In this work we address a practical issue for clinical practice of gated cardiac SPECT.

While the intensity of the myocardial wall should ideally be constant in all frames, this is

not the case in actual SPECT images because of the partial volume effect, which leads to

the observed intensity varying linearly with wall thickness [119]. Since it is not possible

to overcome the partial-volume effect in this application, wall-thickening assessment is

usually done in clinical practice by looking for these intensity changes. 4D processing

can produce substantial improvements in image quality, but it can potentially wash out

the brightness variations used to assess wall thickening. In this work, we aim to maintain

these clinically useful indicators of wall motion, while reducing the noise level in the

images.

5.3.1. Mesh Modeling for an Image Sequence.

Let $f_k(\mathbf{p})$ denote the kth image frame of a gated study, defined over a domain D. In a

mesh model, the domain D is partitioned into M non-overlapping mesh elements,

denoted by $D_m$, $m = 1, 2, \ldots, M$. The image function is represented as

$$f_k(\mathbf{p}) = \sum_{n=1}^{N} f_k(\mathbf{p}_n)\, \varphi_k^n(\mathbf{p}) + e_k(\mathbf{p}), \tag{5.8}$$

where $\mathbf{p}_n$ is the nth mesh node, $\varphi_k^n(\mathbf{p})$ is the interpolation basis function associated with

$\mathbf{p}_n$, N is the total number of mesh nodes used, and $e_k(\mathbf{p})$ is the modeling error, which is

negligible in comparison to the imaging noise. Further details of the mesh representation

can be found in [90].

Let $\mathbf{n}_k$ denote a vector of nodal values of the mesh model, i.e.,

$$\mathbf{n}_k = \left[ f_k(\mathbf{p}_1), f_k(\mathbf{p}_2), \ldots, f_k(\mathbf{p}_N) \right]^T. \tag{5.9}$$

If $\mathbf{f}_k$ denotes the voxel representation of the image function $f_k(\mathbf{p})$ over D, then using

(5.8) and (5.9),

$$\mathbf{f}_k = \mathbf{\Phi}_k \mathbf{n}_k, \tag{5.10}$$

where $\mathbf{\Phi}_k$ is a matrix, composed from the interpolation functions $\varphi_k^n(\mathbf{p})$ in (5.8), that

forms the mesh-to-pixel interpolation operator. In the next section we will describe how

to estimate the frame-to-frame displacement of the mesh nodes, reflected in a different

$\mathbf{\Phi}_k$ for each frame.
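
To make the mesh-to-pixel operator concrete, the sketch below computes the interpolation weights of one pixel inside one triangular mesh element. It assumes linear (barycentric) basis functions over triangles, which this section does not spell out (see [90] for the actual representation); the function name and the example triangle are hypothetical. Under this assumption, the three weights are the only nonzero entries in the row of the interpolation matrix for that pixel.

```python
import numpy as np

def barycentric_weights(p, tri):
    """Linear-basis weights of point p inside a triangle.

    p   : (2,) pixel coordinates
    tri : (3, 2) array with the positions of the three nodes of the element
    """
    a, b, c = tri
    # Solve p - a = u (b - a) + v (c - a) for the local coordinates (u, v)
    T = np.column_stack((b - a, c - a))
    u, v = np.linalg.solve(T, p - a)
    return np.array([1.0 - u - v, u, v])

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
p = np.array([0.25, 0.25])
weights = barycentric_weights(p, tri)

# Interpolating nodal samples of a linear function reproduces it exactly
nodal_values = 2.0 * tri[:, 0] + 3.0 * tri[:, 1] + 1.0
print(weights, weights @ nodal_values)
```

Because the node positions move from frame to frame, the weights of a given pixel change as well, which is why a separate interpolation matrix is needed for every frame.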

For tomographic image-sequence reconstruction, the imaging equation can be written

in terms of the nodal representation $\mathbf{n}_k$ as:

$$E[\mathbf{g}_k] = \mathbf{H} \mathbf{\Phi}_k \mathbf{n}_k, \tag{5.11}$$

where $\mathbf{g}_k$ contains the measured data, $E[\cdot]$ is the expectation operator, and $\mathbf{H}$ is a matrix

describing the imaging system. The reconstruction problem becomes that of estimating

$\mathbf{n}_k$ from the observed data $\mathbf{g}_k$. The image $\mathbf{f}_k$ can then be obtained from (5.10). In this

work we use EM reconstruction in the mesh framework to perform the reconstruction [90].
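
A minimal sketch of how an EM update could look in the mesh domain is given below. It assumes the standard multiplicative MLEM iteration applied to the combined operator formed by the system matrix and the interpolation matrix; the details of the method in [90] may differ, and the function name, initialization, and iteration count are hypothetical.

```python
import numpy as np

def mesh_mlem(g, H, Phi, n_iter=50):
    """Standard MLEM iteration on nodal values, with H @ Phi as the projector."""
    A = H @ Phi                     # maps nodal values directly to expected data
    sens = A.sum(axis=0)            # sensitivity term (A^T applied to ones)
    n = np.ones(A.shape[1])         # uniform positive starting estimate
    for _ in range(n_iter):
        expected = A @ n
        ratio = np.divide(g, expected, out=np.zeros_like(expected),
                          where=expected > 0)
        n = n * (A.T @ ratio) / sens
    return n
```

Once the nodal values have been estimated, the pixel image follows from a single multiplication by the interpolation matrix, as in (5.10).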

5.3.2. Mesh Based Motion Estimation.

As we mentioned earlier, myocardial brightening is a useful clinical indicator of wall

thickening, and is routinely used to assess cardiac function. To preserve this effect, we

modified the mesh-element matching criterion of Sect. 5.2 [76], which is

used to track motion between frames. Specifically, we weighted it with the relative

change of the mesh-element size from frame to frame as follows:

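$$E_{k \to l} = \frac{1}{2} \sum_{m=1}^{M} \int_{D_m} \left( f_k\!\left(\mathbf{p} + \mathbf{d}_{k \to l}(\mathbf{p})\right) - f_l(\mathbf{p}) \cdot \frac{\operatorname{area}(D_m^k)}{\operatorname{area}(D_m^l)} \right)^2 d\mathbf{p}, \tag{5.12}$$

where $\mathbf{d}_{k \to l}(\mathbf{p})$ denotes the displacement vector at voxel $\mathbf{p}$ describing motion from

frame k to frame l. Based on the motion field computed using (5.12), we implemented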

two reconstruction methods, described as follows.

5.3.3. Mesh Model Temporal Filtering.

In the first method, we introduced temporal smoothing into an iterative (EM)

reconstruction (Sect. 3.3.1) by incorporating a temporal filtering step between iterations.

Each frame is updated independently, and a temporal filter is then applied.

The filter, which is also designed to preserve myocardial brightening, is defined as:

$$[\mathbf{n}_k]_n = \frac{1}{C} \sum_{l=1}^{K} \left(1 - \frac{2\gamma |k - l|}{K}\right) \frac{A_n^l}{A_n^k}\, [\mathbf{n}_l]_n, \tag{5.13}$$

where $[\cdot]_n$ denotes the nth vector element, $\gamma$ is a parameter used to control the degree of

temporal smoothing, C is a normalization constant defined so that the filter has unit DC

response, and $A_n^k = \sum_{m \in \Re_n} \operatorname{area}(D_m^k)$, with $\Re_n$ being the set of elements attached to node n.

5.3.4. Accurate Pixel Based Temporal Filter.

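In the second method, we apply the following motion-compensated pixel-based

temporal filter after reconstruction:

$$\hat{f}_k(\mathbf{p}) = \frac{1}{C} \sum_{l=1}^{K} \left(1 - \frac{2\gamma |k - l|}{K}\right) f_l\!\left(\mathbf{p} - \mathbf{d}_{k \to l}(\mathbf{p})\right) \cdot \left(1 - \operatorname{div}\!\left(\mathbf{d}_{k \to l}(\mathbf{p})\right)\right), \tag{5.14}$$

where $\operatorname{div}(\mathbf{d}_{k \to l}(\mathbf{p}))$ denotes the divergence of the motion vector field, with $\gamma$ and C

defined as in (5.13). Additional details can be found in Sect. 5.2.6.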
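
The divergence term can be evaluated on the pixel grid with finite differences. The sketch below is one possible way to do this, assuming the displacement field is stored as two arrays indexed `[row, column]`, with x along columns and y along rows; the function name and the example field are hypothetical.

```python
import numpy as np

def brightening_factor(dx, dy, spacing=1.0):
    """Return 1 - div(d) for a 2D displacement field given on the pixel grid.

    dx, dy : arrays indexed [row, col] holding the x and y components of d
    """
    # d(dx)/dx is taken along columns (axis 1), d(dy)/dy along rows (axis 0)
    div = np.gradient(dx, spacing, axis=1) + np.gradient(dy, spacing, axis=0)
    return 1.0 - div

# Hypothetical field: 2 percent stretch in x and 1 percent compression in y
yy, xx = np.mgrid[0:5, 0:6].astype(float)
factor = brightening_factor(0.02 * xx, -0.01 * yy)
print(factor[0, 0])
```

For small displacements, one minus the divergence approximates the ratio of local areas before and after the deformation, so the factor plays the same role in the pixel domain that the element-area ratio plays in the mesh domain.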


The proposed algorithms were tested using the 4D gated mathematical cardiac-torso

gMCAT D1.01 phantom [91]. To evaluate the proposed procedures, motion estimation

was performed on a noise-free data sequence to investigate the upper limit of the algorithm's

performance.

The reconstruction methods themselves were evaluated at a high noise level of 1M

counts per 3D frame. We compared the proposed reconstruction method with inter-

iteration smoothing (MCR+S+PVE) with a deformable mesh method that does not work

to preserve myocardial brightening (MCR+S). We also compared the proposed pixel-

based motion-compensated filter (ST-MC+PVE) with a version that does not preserve

myocardial brightening (ST-MC), and with a simple temporal smoothing that is not

motion-compensated (ST-NMC). The same spatial filter (order 5, cutoff=0.5) and

temporal filter (γ=1) were applied in each method.

4D reconstruction results for gated cardiac perfusion studies are difficult to

demonstrate in a printed paper. These images are used clinically for the express purpose

of wall motion and thickening assessments, which can only be seen in a cine display.

However, the results can be evaluated to some extent by quantifying the time activity

curves (TAC). The TAC for a small region in the left ventricular wall vs. the frame

number is shown in Figure 47(b). The MCR+S+PVE method produces the TAC closest to

the original. This is further demonstrated by the cross correlation of each TAC with the original, shown in

Table 3.

5.3.6. Discussion

The effect of our reconstruction algorithms on the complex task of wall-motion and

thickening assessment will require a human-observer study for absolute validation; however, a cine

display of the results clearly indicates the benefit of the proposed processing methods.

Figure 47. (a) Mesh structure used in our experiment. (b) Normalized time-activity curves for a small region in the left ventricular wall.

Table 3. TAC cross correlation

Method             ST-NMC   ST-MC    ST-MC+PVE   MCR+S    MCR+S+PVE
Cross correlation  0.7540   0.8240   0.8564      0.8734   0.9567

5.4. Similarity Clustering Analysis

Although at present this work represents purely a new clustering algorithm, its

development was motivated by the need for a method capable of identifying regions in an

image sequence that have similar temporal behavior. The identified regions can then be safely

processed in the same fashion (e.g., by a lower-order KL approximation) in order to reduce

noise.


In this section we present a new method to determine distinct time-activity curves

(TACs) existing in an image sequence, consisting of M frames. Two TACs that differ

only by a scalar multiplication (and thus do not differ in shape) are not considered

distinct. Therefore, each distinct TAC corresponds to a unique direction, independent of

amplitude, in an M-dimensional space. Many traditional clustering algorithms, e.g., k-

means [72] or Gaussian mixture models [24, 73] are dependent on the signal amplitude

and not suitable for our purpose. Recently, clustered component analysis (CCA) was

developed in [75] to partially avoid the amplitude dependency of the conventional

algorithms. In this work, we present a new method for analyzing and identifying distinct

TACs in the image sequence (unconstrained by amplitude) which we call similarity

component analysis (SCA). This method depends on an explicit model of the feature

vectors, the estimation of which is implemented through Bayesian parameter estimation.

The SCA method incorporates the expectation-maximization (EM) algorithm for

parameter estimation [24, 73]. Further, this approach can lead to a new reconstruction

method based on a distinct time-basis model in the temporal domain.

5.4.2. Models.

Let $\mathbf{Y}_n$ represent an M-dimensional random vector representing the observed TAC

for the nth pixel. Furthermore, let $E = [\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_K]$ be a set of K unique M-dimensional

vectors, each with unit norm.

The data model can be described as:

$$\alpha_n \mathbf{e}_{X_n} = E[\mathbf{Y}_n \mid X_n], \tag{5.15}$$

where $\alpha_n$ is the unknown amplitude for the nth pixel and $1 \le X_n \le K$ is the class label for

the TAC $\mathbf{Y}_n$. Also, it is assumed that class labels are independent and identically

distributed with $P\{X_n = k\} = p_k$ and $\sum_{k=1}^{K} p_k = 1$. Our objective is to estimate the parameters

of the proposed model: the class labels $X_n$, the prior probabilities $p_k$, and the distinct

directions $\mathbf{e}_k$.

Similar Component Analysis (SCA)

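By assuming an independent, identically distributed Gaussian noise model, we can

rewrite our basic model as

$$\mathbf{Y}_n = \alpha_n \mathbf{e}_{X_n} + \mathbf{W}_n, \tag{5.16}$$

where $E[\mathbf{W}_n \mathbf{W}_n^T] = \sigma^2 \mathbf{I}$ and $\|\mathbf{e}_k\| = 1$.

We define the similarity measurement as the cosine of the angle between two vectors,

i.e.,

$$\cos(\beta_n) = \frac{\mathbf{Y}_n^T \mathbf{e}_{X_n}}{\|\mathbf{Y}_n\|}. \tag{5.17}$$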
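
The key property of this similarity measure is that it ignores amplitude, which a short numerical check makes explicit. The function below is a minimal sketch; it also normalizes by the norm of the direction vector, which is harmless when that vector already has unit norm, and the example TAC is hypothetical.

```python
import numpy as np

def cosine_similarity(y, e):
    """Cosine of the angle between an observed TAC y and a direction e."""
    return float(y @ e / (np.linalg.norm(y) * np.linalg.norm(e)))

tac = np.array([1.0, 2.0, 3.0])            # hypothetical TAC with M = 3 frames
direction = tac / np.linalg.norm(tac)

print(cosine_similarity(tac, direction))         # same shape
print(cosine_similarity(10.0 * tac, direction))  # same shape, ten times the amplitude
```

Two TACs that differ only by a scale factor therefore receive the same similarity to every candidate direction, unlike a Euclidean distance.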

In 2D it can be shown that the distribution of the angle between $\mathbf{e}_{X_n}$ and the observed

vector $\mathbf{Y}_n$, denoted $\beta_n$, follows a Rician probability density function (PDF) [120] (more

details can be found in [121] and [122]). The Rician PDF is described as follows:

$$p_{rice}(\beta_n \mid \mathbf{X}; \theta) = \frac{1}{2\pi} e^{-A_n} \left[ 1 + \sqrt{4\pi A_n}\, \cos\beta_n\, e^{A_n \cos^2 \beta_n} \left( \frac{1}{2} + \frac{1}{2} \operatorname{erf}\!\left( \sqrt{A_n} \cos\beta_n \right) \right) \right], \tag{5.18}$$

where $A_n = \dfrac{\alpha_n^2}{2\sigma_n^2}$ and $\theta = [E, P, A]$, with $P = [p_1, \ldots, p_K]$ and $A = [A_1, \ldots, A_N]$.

In spite of the fact that a Rician PDF of the phase is shown to be valid in the 2D case,

in this work we assume its validity for the distribution of the angles defined by Eq. (5.17)

as well. We have not yet verified this conjecture, but it will be addressed in future work.

Figure 48. Rician probability density function (PDF) of the phase distribution, parameterized by the SNR.

Further, we assume that the noise power is much less than the signal power

($\alpha_n^2 \gg \sigma_n^2$), i.e., the signal-to-noise ratio (SNR) is large. By exploiting this assumption

one can simplify Eq. (5.18) (as has been shown in [120]) as follows:

$$p(\beta_n \mid \mathbf{X}; \theta) \sim \exp\left( -A_n \beta_n^2 \right). \tag{5.19}$$

By simple algebra and the assumption that $\alpha_n^2 \gg \sigma_n^2$, one can obtain an alternative

approximation, which is more suitable:

$$p(\beta_n \mid \mathbf{X}; \theta) = \frac{1}{\gamma_n} \exp\left( -2 A_n \left( 1 - \cos(\beta_n) \right) \right), \tag{5.20}$$

where $\gamma_n = \int_{-\pi}^{\pi} \exp\left( -2 A_n (1 - \cos\alpha) \right) d\alpha$.

Note that the PDF in (5.20) has the same form as the von Mises distribution, with $2A_n$

being the concentration parameter [123]. The von Mises distribution closely approximates a

normal distribution wrapped onto the circle.

The two graphs in Figure 49 assess how well Eq.(5.20) approximates Eq.(5.18).

Figure 49. (a) Approximation error vs. the SNR and the angle; (b) MSE of the PDF approximation.

Note that as the SNR increases, the approximation becomes more accurate. This

enables us to use the approximation instead of the complicated true Rician PDF.
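
The quality of the approximation can be examined numerically. The sketch below evaluates the classical closed-form phase PDF of a constant phasor in circular Gaussian noise, written in terms of the SNR parameter A (signal power over twice the noise variance), and compares it with a von Mises density of concentration 2A. The function names, the integration grid, and the two SNR values are hypothetical choices made for illustration.

```python
import math
import numpy as np

def trapezoid(values, grid):
    """Trapezoidal rule on a 1D grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def rician_phase_pdf(beta, A):
    """Phase PDF of a constant phasor plus circular Gaussian noise."""
    c = np.cos(beta)
    erf_term = np.array([math.erf(math.sqrt(A) * ci) for ci in c])
    inner = np.sqrt(4.0 * np.pi * A) * c * np.exp(A * c**2) * 0.5 * (1.0 + erf_term)
    return np.exp(-A) / (2.0 * np.pi) * (1.0 + inner)

def von_mises_approx(beta, A):
    """exp(-2A(1 - cos(beta))) normalized numerically over [-pi, pi]."""
    unnorm = np.exp(-2.0 * A * (1.0 - np.cos(beta)))
    return unnorm / trapezoid(unnorm, beta)

beta = np.linspace(-np.pi, np.pi, 20001)

def max_abs_error(A):
    return float(np.max(np.abs(rician_phase_pdf(beta, A) - von_mises_approx(beta, A))))

print(max_abs_error(2.0), max_abs_error(20.0))
```

The maximum absolute difference between the two densities shrinks as the SNR grows, in line with the behavior reported in Figure 49.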

In the EM framework one needs to define the complete data, a sufficient statistic that

uniquely determines all the model parameters. In our model these are

the observed angles $\beta$ and the class labels $\mathbf{X}$.

Now we can rewrite the PDF of the complete data $[\beta, \mathbf{X}]$ as:

$$p(\beta, \mathbf{X}; \theta^{i-1}) = p(\beta \mid \mathbf{X}; \theta^{i-1})\, P(\mathbf{X}; \theta^{i-1}) = \prod_{n=1}^{N} p(\beta_n \mid X_n; \theta^{i-1})\, P(X_n; \theta^{i-1}), \tag{5.21}$$

with $\beta = [\beta_1, \ldots, \beta_N]$.

The expected log-likelihood function required by the EM framework is given by:

$$Q(\theta^{i}; \theta^{i-1}) = \sum_{k=1}^{K} \sum_{n=1}^{N} p(X_n = k \mid \beta_n; \theta^{i-1}) \left[ -\log \gamma_n - 2 A_n \left( 1 - \frac{\mathbf{Y}_n^T \mathbf{e}_k}{\|\mathbf{Y}_n\|} \right) + \log P(X_n = k; \theta^{i-1}) \right]. \tag{5.22}$$

From Eq. (5.22) one can derive the E (expectation) and M (maximization) steps of the

EM algorithm as follows (the details are given in Appendix H):

E-STEP:

$$p(X_n = k \mid \beta_n; \theta^{i-1}) = \frac{p(\beta_n \mid X_n = k; \theta^{i-1}) \cdot P(X_n = k; \theta^{i-1})}{\sum_{k=1}^{K} P(X_n = k; \theta^{i-1})\, p(\beta_n \mid X_n = k; \theta^{i-1})}, \tag{5.23}$$

M-STEP:

$$P(X_n = k; \theta^{i-1}) = \frac{n_k}{N}, \qquad n_k = \sum_{n=1}^{N} p(X_n = k \mid \beta_n; \theta^{i-1}), \tag{5.24}$$

$$\mathbf{e}_k = \frac{1}{W} \sum_{n=1}^{N} p(X_n = k \mid \beta_n; \theta^{i-1})\, \mathbf{Y}_n, \tag{5.25}$$

where the value of W is derived explicitly in Appendix H, but in essence W is equivalent

to a normalization constant to ensure that $\|\mathbf{e}_k\| = 1$, and

Ap X k

p X k En

n ni

k

K

n k

n ni

k

K

n n k n n k

==

LNM

OQP

= − −

−

=

−

=

∑

∑

| ;

| ;

Y Y e

Y Y Y e Y Y e

T

T

θ

θ

1

1

2

1

1

2

c h

c h c h c h. (5.26)

The E and M steps are repeated until, between successive iterations, the change in

value of the increasing log-likelihood function (Eq.(5.22)) falls below a prespecified

threshold or a fixed number of iterations is reached. The latter approach was used in this

work.

After the algorithm was run for a fixed number of iterations (usually 30), the label

assignment was performed. Here, instead of classifying $\mathbf{Y}_n$ into class k stochastically,

according to the probability $p(X_n = k \mid \beta_n; \theta^{i-1})$, we classified according to a maximum

a posteriori (Bayesian) decision. That is, we let $X_n = \arg\max_{X_n} p(X_n \mid \beta_n; \theta^{i-1})$.

Throughout the rest of this thesis we denote this algorithm as SCA.

Winner-Take-All Similar Component Analysis (wtaSCA)

Under the assumption that $A_n \to \infty$, which is equivalent to $\sigma^2 \to 0$, and $p_k = 1/K$,

Eq. (5.23) becomes an indicator function defined as:

$$I(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \begin{cases} 1, & k = \arg\max_k p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \arg\max_k \left| \dfrac{\mathbf{Y}_n^T \mathbf{e}_k}{\|\mathbf{Y}_n\|} \right|^2 \\ 0, & \text{otherwise} \end{cases} \tag{5.27}$$

By using the indicator function of Eq. (5.27) in Eqs. (5.24)-(5.26) instead of Eq. (5.23),

one obtains the algorithm we refer to as Winner-Take-All similarity clustering,

denoted as wtaSCA.
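
The winner-take-all variant lends itself to a compact sketch: each TAC is hard-assigned to the direction with the largest squared cosine similarity, and each direction is then replaced by the normalized sum of its member TACs. The code below is an illustrative reading of these two steps, not a verified reproduction of the experiments; the function name, the initialization argument `E0`, and the handling of empty clusters are assumptions.

```python
import numpy as np

def wta_sca(Y, E0, n_iter=30):
    """Winner-take-all clustering of TAC directions.

    Y  : (N, M) array, one observed TAC per row (rows must be nonzero)
    E0 : (K, M) array of initial directions (hypothetical initialization)
    """
    E = E0 / np.linalg.norm(E0, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    for _ in range(n_iter):
        # Hard assignment: largest squared cosine between TAC and direction
        labels = np.argmax((Yn @ E.T) ** 2, axis=1)
        for k in range(E.shape[0]):
            members = Y[labels == k]
            if len(members):                      # keep old direction if cluster is empty
                s = members.sum(axis=0)
                E[k] = s / np.linalg.norm(s)      # normalization enforces unit norm
    labels = np.argmax((Yn @ E.T) ** 2, axis=1)
    return labels, E
```

For nonnegative TACs the squared cosine ranks the candidate directions in the same order as the cosine itself, so the assignment depends only on the shape of each curve and not on its amplitude.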

5.4.3. Similarity Component Analysis Experiments.

Evaluation data

In this study, to evaluate the performance of the proposed method, a single slice

(No. 70) of the Zubal brain phantom [92] was used to simulate a dynamic study of [11C]

carfentanil binding to µ-selective opiate receptors. Four-compartment and three-

compartment tracer kinetic models were used to produce time activity curves (TACs) for

various brain regions. The models used parameters derived from the data in [124] and an

input function (the blood concentration of the tracer) obtained in an actual study. We simulated

23 image frames with a total of four million counts. The pixel size was 4.36 mm and

the blur had a FWHM of 8 mm.

Figure 50. Six brain regions used in our simulation.

In our study we distinguish six different regions in the brain (Figure 50): Thalamus,

Caudate, Front Cortex, Temporal Cortex, White matter and Occipital Cortex. For each of

them the kinetic model was estimated using the real PET study data. The ratio of the activity in

each brain region to that in the thalamus was set as measured by Frost et al. [125], taking an

average for two given subjects (see Table 4). More details about the model used can be

found in [126].

Table 4. Regions/Thalamus Ratio

Brain region                            Three compartment   Four compartment
Thalamus                                -                   1
Caudate                                 -                   0.87
Front Cortex                            -                   0.805
Ant. Temporal + Sup. Temporal Cortex    -                   0.805
White matter                            -                   0.1
Occipital Cortex                        1                   -

Note that only the occipital cortex has a distinct time activity curve. It is desirable to

have an algorithm that can clearly identify that region.

Prior to the classification, in order to reduce the noise level, a low-pass filter was

applied. For that purpose we used a 3rd order spatial low-pass filter with cutoff frequency

of 0.5 cycles/pixel.

Other methods considered

In addition to the two proposed clustering algorithms, we also considered three well-

known clustering procedures for comparison purposes: a) Gaussian mixture model

(GMM) parameter estimation; b) k-means (also called the Linde-Buzo-Gray (LBG)

algorithm [127] or generalized Lloyd algorithm (GLA)); and c) clustered component

analysis (CCA) proposed by Bouman et al. [75].

Let us describe briefly each of these algorithms.

Gaussian mixture model parameter estimation

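In this model the data were modeled as a collection of random variables obtained

from K different Gaussian random processes:

$$\mathbf{Y}_n = \mathbf{m}_{X_n} + \mathbf{W}_n \quad \text{or} \quad p(\mathbf{Y}_n) = \sum_{k=1}^{K} p_k\, \mathcal{N}(\mathbf{m}_k, \mathbf{C}_k). \tag{5.28}$$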

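Under the given assumptions the E and M steps can be shown to be:

E-STEP:

$$p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \frac{p(\mathbf{Y}_n \mid X_n = k; \theta^{i-1}) \cdot P(X_n = k; \theta^{i-1})}{\sum_{k=1}^{K} P(X_n = k; \theta^{i-1})\, p(\mathbf{Y}_n \mid X_n = k; \theta^{i-1})}, \tag{5.29}$$

M-STEP:

$$P(X_n = k; \theta^{i-1}) = \frac{n_k}{N}, \quad \text{with} \quad n_k = \sum_{n=1}^{N} p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}), \tag{5.30}$$

$$\mathbf{m}_k = \frac{1}{n_k} \sum_{n=1}^{N} p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1})\, \mathbf{Y}_n, \tag{5.31}$$

$$\mathbf{C}_k = \frac{1}{n_k} \sum_{n=1}^{N} p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) \left( \mathbf{Y}_n - \mathbf{m}_k \right) \left( \mathbf{Y}_n - \mathbf{m}_k \right)^T. \tag{5.32}$$

As before, the E and M steps are repeated for 30 iterations, and then the label assignment

is performed as a maximum a posteriori decision, i.e., $X_n = \arg\max_{X_n} p(X_n \mid \mathbf{Y}_n; \theta^{i-1})$.

In the rest of this work we denote this algorithm as GMM.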

Winner-Take-All Gaussian mixture model classifier

If we assume $\mathbf{C}_k = \sigma^2 \mathbf{I} \to 0$ and $p_k = 1/K$, the algorithm derived above reduces to the

classical k-means algorithm. The indicator function, which is used instead of Eq. (5.29),

is given by:

$$I(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \begin{cases} 1, & k = \arg\max_k p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \arg\min_k \left\| \mathbf{Y}_n - \mathbf{m}_k \right\|^2 \\ 0, & \text{otherwise} \end{cases} \tag{5.33}$$

We will refer to this algorithm as k-means.


Clustered Component Analysis

Briefly, we review here the main steps of a method presented by Bouman et al. [75].

This method can be seen as a special case of probabilistic principal component analysis

(PPCA) mixture model parameter estimation [128].

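The assumed data model is:

\[ \mathbf{Y}_n = \alpha_n \mathbf{e}_{X_n} + \mathbf{W}_n \quad \text{or} \quad p\big( (\mathbf{I} - \mathbf{e}_k \mathbf{e}_k^T) \mathbf{Y} \big) = \sum_{k=1}^{K} p_k \, \mathcal{N}(0, 1), \tag{5.34} \]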

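with:

\[ p(\mathbf{Y}_n \mid X_n = k; \theta) = \frac{1}{(2\pi)^{M/2}} \exp\left( -\frac{\big( (\mathbf{I} - \mathbf{e}_k \mathbf{e}_k^T) \mathbf{Y}_n \big)^T \big( (\mathbf{I} - \mathbf{e}_k \mathbf{e}_k^T) \mathbf{Y}_n \big)}{2} \right). \tag{5.35} \]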

Under the given assumptions the E and M steps can be shown to be:

E-STEP:

\[ p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}) = \frac{p(\mathbf{Y}_n \mid X_n = k; \theta^{i-1}) \, P(X_n = k; \theta^{i-1})}{\sum_{l=1}^{K} p(\mathbf{Y}_n \mid X_n = l; \theta^{i-1}) \, P(X_n = l; \theta^{i-1})} \tag{5.36} \]

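M-STEP:

\[ P(X_n = k; \theta^{i-1}) = \frac{n_k}{N}, \quad \text{where} \quad n_k = \sum_{n=1}^{N} p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1}), \tag{5.37} \]

\[ \mathbf{R}_k = \sum_{n=1}^{N} p(X_n = k \mid \mathbf{Y}_n) \, \mathbf{Y}_n \mathbf{Y}_n^T, \tag{5.38} \]

\[ \mathbf{e}_k = \text{PrincipalEigvector}(\mathbf{R}_k). \tag{5.39} \]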

As before, the E and M steps are repeated for 30 iterations with subsequent label

assignment.

In the rest of this thesis we refer to this algorithm as CCA.
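The direction update of Eqs. (5.38) and (5.39) can be sketched as follows. The function name `cca_directions` and the sign convention applied to the eigenvector are assumptions made for this illustration; the posteriors `post` are taken as given from the E-step.

```python
import numpy as np

def cca_directions(Y, post):
    """Return the unit directions e_k, one row per class.

    Y    : (N, M) array of data vectors Y_n
    post : (N, K) array of posteriors p(X_n = k | Y_n)
    """
    Y = np.asarray(Y, dtype=float)
    post = np.asarray(post, dtype=float)
    K = post.shape[1]
    E = np.empty((K, Y.shape[1]))
    for k in range(K):
        # Eq. (5.38): posterior-weighted sum of outer products Y_n Y_n^T
        R_k = (post[:, k, None] * Y).T @ Y
        # Eq. (5.39): eigh returns eigenvalues in ascending order,
        # so the principal eigenvector is the last column
        _, V = np.linalg.eigh(R_k)
        e = V[:, -1]
        # assumption: resolve the sign ambiguity of the eigenvector
        E[k] = e if e.sum() >= 0 else -e
    return E
```

A full eigendecomposition per class and per iteration is what makes this step comparatively expensive.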


Results

To quantify the results of the first experiment, a table with the percentage of correct classification per class is provided (see Table 5). A good algorithm should classify all three clusters equally well; therefore the average percentage of correct classification over all classes is presented as well. One can see that the proposed wtaSCA and SCA outperform the other methods significantly. Their margin over the best of the other algorithms is more than 13%.

To visualize the obtained results we provide the class label assignments for the described algorithms, under the assumption of three existing classes (see Figure 51). We do not show the estimated distinct TACs, but it is clear that an algorithm with better class label identification should yield a better estimate of the TAC.

One can clearly observe the poor performance of the GMM and k-means algorithms for this application. Also, the algorithm denoted as CCA has difficulty in distinguishing the background region, which has a small intensity. Further results are given in Figure 52 in order to examine the properties of the proposed algorithms more closely. In this experiment four classes were assumed. One can make a similar observation as in the previous experiment. Clearly, the proposed algorithms have identified each region more accurately. Moreover, the wtaSCA algorithm does so with very low computational complexity. The SCA algorithm demands more computational power than wtaSCA; however, this requirement is still much lower than for the CCA and GMM algorithms. CCA requires the computation of the principal eigenvector of $\mathbf{R}_k$ (Eq. (5.38)) at each iteration step. Similarly, in the GMM estimation, a computational burden arises from computing $\det(\mathbf{C}_k)$ and $\mathbf{C}_k^{-1}$ at each iteration (Eq. (5.29)).


5.4.4. Discussion.

The results presented here are preliminary and are intended to demonstrate the feasibility of the proposed similar component analysis concept. The results obtained indicate the ability of the proposed algorithms to identify regions with distinct TACs. Among the tested methods, the proposed algorithms have the best accuracy and the lowest computational complexity.


Table 5. Percentage of correct classification.

Method     Occipital Cortex   Rest of the Brain   Background   Average
k-means         39.9%              55.6%             92.4%       62.6%
GMM              0.0%             100.0%             37.8%       45.9%
CCA            100.0%              44.9%             93.4%       79.4%
wtaSCA         100.0%              93.6%             87.6%       93.7%
SCA            100.0%              94.3%             87.6%       94.0%


Figure 51. Class labels obtained by different algorithms applied to the simulation data set. The number of assumed classes for all methods was three. Note the better class label identification by the proposed wtaSCA and SCA methods.

Figure 52. Class labels obtained by different algorithms. The number of assumed classes for all methods was four.


CHAPTER VI

6. EXTENSION TO OTHER IMAGE PROCESSING PROBLEMS

As we have shown, CAMM is a powerful tool in image processing. Its advantage stems mainly from the compactness of the representation: it performs a dimensionality reduction while maintaining the accuracy of the representation. A natural question is therefore whether it can be used for image de-noising; we address this in the next section. Further, the use of a deformable CAMM to correct for geometric bending attacks in digital watermarking is explored in Sect. 6.2.

6.1. Image Denoising by Mesh Model Filtering

In this section we demonstrate that the proposed mesh generation scheme also works well when the image is corrupted by noise. The additional implementation details were already addressed in Sect. 2.1. Shown in Figure 53(a) is the “Lena” image corrupted with additive white Gaussian noise, PSNR=25.91dB. In computing the feature map, the noisy image in Figure 53(a) was first processed by low-pass filtering (3 dB bandwidth of 0.8π). In addition, $\gamma = 2$ was used, which has the effect of de-emphasizing the influence of weak features (produced predominantly by the remaining noise in the image). Figure 53(b) shows the reconstructed image using an LS fit, for which $N = 4042$ mesh nodes were used, PSNR=28.69dB; similarly, Figure 53(c) shows the reconstructed image when $N = 2688$ mesh nodes were used, PSNR=28.08dB. In addition, we also show in Figure 53(d) the reconstructed image when the mesh structure in Figure 10(c) (from the original image) was used, PSNR=30.10dB.


From the above results it is interesting to note that the mesh representation has effectively removed the noise in the image. For comparison, we also show in Figure 53(e) and (f) the images obtained using adaptive Wiener filtering and median filtering of the noisy image, with PSNR=29.14dB and 27.61dB, respectively. It is noted that most of the high-frequency features (such as edges and textures) are better preserved in the mesh representation in Figure 53(b) and (c), even though the adaptive Wiener filtering produces a result with a higher PSNR.
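For reference, the PSNR figures quoted above can be reproduced with a computation of the following kind. This is a sketch that assumes the usual definition for 8-bit images (peak value 255, mean squared error over all pixels); the text does not state the exact definition used, and the function name `psnr` is introduced here for illustration only.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR is a global mean squared error measure, a method that preserves edges better can still score lower than a smoother result, as observed for the mesh representation and the Wiener filter.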

Figure 53. (a) A 128 × 128 section of the noisy Lena image; (b) mesh representation with N=4042 mesh nodes, PSNR=28.69dB; (c) mesh representation with N=2688 mesh nodes, PSNR=28.08dB; (d) mesh representation using the mesh in Figure 10(c), PSNR=30.10dB; (e) image obtained using Wiener filtering, PSNR=29.14dB; (f) image obtained using median filtering, PSNR=27.61dB.

6.2. Geometric Watermarking

Watermarking involves embedding a packet of additional digital data directly into the

image. The watermarked data can contain: copy or usage rules of the content, owner,

distributor etc.

The main requirements for watermarking are that the presence of the watermark has no effect on the image quality, and that the watermark data remain detectable in an image which has been subjected to a wide variety of distortions introduced by digital distribution (e.g. compression algorithms) and by image manipulation (rotation, translation, distortion). While compression does not affect the watermark significantly, a small geometric distortion, such as rotation, scaling, translation, shearing, random bending or a change of aspect ratio, can defeat most of the existing watermarking schemes.

All geometric attacks are challenging problems but random bending is probably the

most difficult to handle among all geometric attacks. In this section, we present a

watermarking scheme based on a new deformable mesh model to combat such attacks.

The distortion is corrected using the distortion field (DF) estimated by minimizing the

matching error between the meshes of the original and attacked image. A CDMA

watermarking method is used for testing the proposed method, which embeds a multi-bit

signature in the DCT domain and uses mesh model correction to achieve robustness.

Experiments show that the proposed scheme can survive a wide range of random bending attacks.


Geometric attacks are still one of the most difficult problems in watermarking. Such

attacks destroy the synchronization of the watermark signals embedded in cover images

by introducing global and local changes to the image coordinates. To extract the


watermark message correctly, resynchronization is usually a necessary pre-processing

step.

Many approaches have been reported to combat the global geometric attacks, such as

Fourier-Mellin transform based rotation, scale and translation (RST) invariant

watermarking [129, 130], and template based affine resistant watermarking [131]. Since

global geometric attacks can be modeled by transforms that can be described by a few

parameters, for example, the general affine transforms can be modeled by 6 parameters,

resynchronization is usually possible through parameter estimation or registration.

Random geometric distortions are much more difficult to correct than distortions

caused by RST or affine transforms. The freedom of random attacks can be unlimited

theoretically, or too large for any parameter estimation algorithms. One such attack

applied by StirMark [132] is demonstrated in Figure 54, where Figure 54(a) shows the

rectangular image grid, while Figure 54(b) is the distorted grid by StirMark. Such attacks

may cause almost unnoticeable perceptual distortion in the image, but can defeat most, if

not all, of the existing watermarking algorithms.

In this section, we present a watermarking scheme which is robust to random geometric distortions. According to this scheme, resynchronization is achieved through a new deformable mesh model for distortion correction. A similar watermarking scheme was proposed independently in [133]. However, the scheme in [133] estimates the distortion field (DF) using the objective function in [6]. Here we propose a new objective function to estimate the DF. This objective function consists of two terms. The first term captures the matching error between the original and the attacked watermarked image, and the second term captures the regularity of the DF. This new objective function enforces the smoothness of the DF, instead of mesh regularity, so that it can effectively capture the distortion in the attacked image. The estimated DF is then used for distortion

compensation. We apply a CDMA based multi-bit watermarking scheme [134] for the

embedder, by virtue of its property of high robustness to common signal processing

operations. We present numerical experiments that demonstrate the advantages of DF

based correction of geometric distortions for watermarking over a range of watermark

strengths and geometric distortions. Our numerical experiments also demonstrate that the proposed DF estimation approach yields better results than the approach used in [133].


Figure 54. (a) Rectangular grid, and (b) grid after attack by random bending.

6.2.2. Mesh Model Based Correction.

Mesh model of the Distortion field.

Let $\mathbf{d}(\mathbf{p})$ denote the distortion vector at location $\mathbf{p}$ in the image domain $D$. Let $f_{ref}(\mathbf{p})$ and $f_{targ}(\mathbf{p})$ denote, respectively, the image functions of a reference frame and a distorted frame, also called the target frame. If the distortion is a unique (one-to-one) mapping, a perfect match can be achieved, i.e. $f_{ref}(\mathbf{p}) = f_{targ}(\mathbf{p} + \mathbf{d}(\mathbf{p}))$. In deformation estimation the goal is to estimate $\mathbf{d}(\mathbf{p})$ so that the best match between the frames is achieved.

Next, we introduce the mesh modeling of the DF. The image domain $D$ is partitioned into $M$ non-overlapping mesh elements, denoted by $D_m$, $m = 1, 2, \ldots, M$. The DF is then derived from the inter-frame displacements of the associated mesh nodes. Over a particular element $D_m$, the DF can be described as:

\[ \mathbf{d}(\mathbf{p}) = \sum_{n=1}^{N} \varphi_n(\mathbf{p}) \, \mathbf{d}_n \tag{6.1} \]

where $\mathbf{d}_n$ and $\varphi_n(\mathbf{p})$ are the displacement vector and interpolation basis function associated with node $n$, respectively, and $N$ is the total number of mesh nodes. Note that the support of each basis function $\varphi_n(\mathbf{p})$ is limited only to those reference frame elements $D_m$ associated with node $n$.

Let the nodal positions in the mesh structure of the reference frame be denoted by $\mathbf{P}_{ref} = [\mathbf{p}^1_{ref}, \mathbf{p}^2_{ref}, \ldots, \mathbf{p}^N_{ref}]$, and similarly by $\mathbf{P}_{targ} = [\mathbf{p}^1_{targ}, \mathbf{p}^2_{targ}, \ldots, \mathbf{p}^N_{targ}]$ for the nodal positions in the target frame. Then the nodal values of the DF mesh model are given by:

\[ \mathbf{d}_n = \mathbf{p}^n_{targ} - \mathbf{p}^n_{ref}, \quad n = 1, 2, \ldots, N \tag{6.2} \]

In Figure 55 we show an example of a distorted mesh element from a reference to a target

frame.
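As an illustration of Eqs. (6.1) and (6.2) restricted to a single triangular element with linear basis functions, the distortion vector at an interior point can be evaluated as sketched below. The helper names `barycentric` and `distortion_at` are hypothetical, and the sketch assumes linear (barycentric) interpolation over a non-degenerate triangle.

```python
import numpy as np

def barycentric(p, tri):
    """Values of the three linear basis functions at point p.

    tri : (3, 2) array with the vertices of a non-degenerate triangle.
    """
    a, b, c = np.asarray(tri, dtype=float)
    T = np.column_stack((a - c, b - c))
    l1, l2 = np.linalg.solve(T, np.asarray(p, dtype=float) - c)
    return np.array([l1, l2, 1.0 - l1 - l2])

def distortion_at(p, tri_ref, tri_targ):
    """Distortion vector d(p) inside one element of the reference mesh."""
    tri_ref = np.asarray(tri_ref, dtype=float)
    tri_targ = np.asarray(tri_targ, dtype=float)
    d_nodes = tri_targ - tri_ref                 # Eq. (6.2)
    return barycentric(p, tri_ref) @ d_nodes     # Eq. (6.1), one element
```

At a vertex the interpolated value equals the nodal displacement of that vertex, and at the centroid it equals the mean of the three nodal displacements.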

Deformation of the mesh

In practice, the nodal vectors $\mathbf{d}_n$ in the deformation mesh model in (6.1) are unknown, and must be determined from the observed data. To estimate the distortion between image frames, a natural approach is to displace the mesh nodes so that the corresponding image intensity functions over mesh elements in the two frames achieve the best match in terms of their image values.

Figure 55. Inter frame displacement estimation using deformable mesh model.

As a matching criterion the following objective function is used:

\[ J = \frac{1}{2} \sum_{m=1}^{M} \int_{D_m} \Big( f_{targ}\big(\mathbf{p} + \mathbf{d}(\mathbf{p})\big) - f_{ref}(\mathbf{p}) \Big)^2 d\mathbf{p} + \beta E_d \tag{6.3} \]

where the first term is the matching error accumulated over all $M$ mesh elements between the two frames [6]. The second term $E_d$ is a measure of DF regularity, which we define as:

\[ E_d = \frac{1}{2} \sum_{n=1}^{N} \left\| \mathbf{d}_n - \bar{\mathbf{d}}_n \right\|^2 \tag{6.4} \]

where $N$ is the total number of mesh nodes in the image, and $\bar{\mathbf{d}}_n$ is the average displacement of the immediate neighboring mesh nodes connected to node $n$. The parameter $\beta$ in (6.3) controls the trade-off between mesh matching accuracy and the DF smoothness. This penalty term is different from the one in [6], which was used in [133]. The idea of our penalty term is to enforce a smooth DF rather than mesh regularity (evenly spaced nodes). In this application a regular mesh is used for the reference frame. Thus, for the uniform mesh model used in this work, the effect of this new penalty term is more pronounced towards the image boundaries.
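The regularity term of Eq. (6.4) can be evaluated on a regular grid of nodal displacements as sketched below. The 4-connected neighborhood and the function name `df_regularity` are assumptions made for this illustration; the actual node connectivity depends on the mesh used.

```python
import numpy as np

def df_regularity(d):
    """E_d = 0.5 * sum_n ||d_n - mean of neighbouring displacements||^2.

    d : (H, W, 2) array of nodal displacement vectors on a regular grid,
        with H * W >= 2 so that every node has at least one neighbour.
    """
    d = np.asarray(d, dtype=float)
    H, W, _ = d.shape
    total = 0.0
    for i in range(H):
        for j in range(W):
            # assumption: 4-connected neighbours that fall inside the grid
            nbrs = [d[a, b]
                    for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= a < H and 0 <= b < W]
            total += 0.5 * np.sum((d[i, j] - np.mean(nbrs, axis=0)) ** 2)
    return total
```

A uniform translation of all nodes gives a value of zero, so the term penalizes only local irregularity of the field and not the displacement itself.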

The values of the nodal vectors $\mathbf{d}_n$ are then found numerically by minimizing the objective function in (6.3) with the gradient descent algorithm. In this work the optimization algorithm in [6] was modified by applying a line search using quadratic interpolation along the gradient direction [135].
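One common form of a quadratic-interpolation line search is sketched below: the objective is sampled at three equally spaced step lengths along the descent direction, and the step is taken to the minimum of the parabola through these samples. The exact variant used in [135] is not specified here, so the sampling scheme, the fallback rule and the function name `quadratic_step` are assumptions.

```python
def quadratic_step(phi, t1=1.0):
    """Step length minimizing the parabola through phi(0), phi(t1), phi(2*t1).

    phi : callable giving the objective along the search direction,
          e.g. phi(t) = J(d - t * grad_J(d)).
    """
    f0, f1, f2 = phi(0.0), phi(t1), phi(2.0 * t1)
    curvature = f0 - 2.0 * f1 + f2
    if curvature <= 0.0:
        # the fitted parabola has no minimum: fall back to the best sample
        return min(((f0, 0.0), (f1, t1), (f2, 2.0 * t1)))[1]
    return t1 * (3.0 * f0 - 4.0 * f1 + f2) / (2.0 * curvature)
```

For an objective that is exactly quadratic along the search direction, this step reaches the one-dimensional minimum in a single evaluation of the formula.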

Mesh generation

In our application the DF was independent of the image content and varied smoothly

across the image domain. Thus the mesh structure generation was independent of the

image content. Specifically, mesh nodes were placed regularly 64 pixels apart in each

direction. In addition, to avoid boundary effects, the image was extended by 64 pixels in

each direction. The values of those pixels were set to the mean of the whole image.

Deformation compensation

After obtaining the DF estimate, the deformation compensation (DC) was performed. This can be represented by the following equation:

\[ f_{DC}(\mathbf{p}) = f_{targ}\big(\mathbf{p} + \mathbf{d}(\mathbf{p})\big) \tag{6.5} \]

6.2.3. Mesh Model Based Watermarking Scheme.

CDMA Watermark

Suppose the watermark message is a binary sequence denoted by $m = \{ b_i \mid b_i \in \{0, 1\},\ i = 1, 2, \ldots, L \}$, where $L$ is the length of the watermark sequence. Usually we use $m' = \{ b'_i \mid b'_i = 1 - 2 b_i,\ i = 1, \ldots, L \}$, so $m'$ is a binary polar sequence of $\{1, -1\}$. The pseudo-random noise pattern is defined as $P = \{ p_{i,j}(k) \mid k = 1, \ldots, L \}$, where $p_{i,j}(k)$ is a 2-D pseudo-random binary sequence of $\{1, -1\}$ with zero mean, generated using the key as the seed. The size of $p_{i,j}(k)$ is the same as that of the cover image.

The CDMA watermark is described as:

\[ w_{i,j} = \sum_{k=1}^{L} p_{i,j}(k) \, b'_k. \tag{6.6} \]

The CDMA watermark can be embedded into a cover image additively by

\[ \tilde{v}_{i,j} = v_{i,j} + \lambda \cdot w_{i,j} \tag{6.7} \]

where

$v_{i,j}$: pixel value of the cover image at location (i, j),

$\tilde{v}_{i,j}$: corresponding pixel value of the watermarked image at (i, j),

$\lambda$: a positive coefficient that defines the watermark strength.

A simple detection can be performed using a correlation detector. The correlation is calculated as:

\[ C_l = \sum_{i,j} p_{i,j}(l) \, \tilde{v}_{i,j} = \sum_{i,j} p_{i,j}(l) \, v_{i,j} + \lambda \sum_{i,j} p_{i,j}(l) \, w_{i,j} \approx \lambda \sum_{k=1}^{L} \sum_{i,j} b'_k \, p_{i,j}(k) \, p_{i,j}(l) \approx \lambda \cdot b'_l \sum_{i,j} | p_{i,j}(l) |^2. \tag{6.8} \]

In the above derivation, $p_{i,j}(l)$ is a zero mean pseudo-random binary sequence of $\{1, -1\}$, and it is independent of $v_{i,j}$. Therefore, the term $\sum_{i,j} p_{i,j}(l) \cdot v_{i,j}$ is close to zero. The other terms are also nearly zero when $k \neq l$, because the pseudo-random patterns $p_{i,j}(k)$ are uncorrelated with each other. So the watermark message can be estimated as:

\[ \hat{b}'_l = \operatorname{sign}(C_l) \tag{6.9} \]
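The embedding and correlation detection of Eqs. (6.6) to (6.9) can be sketched as follows. For simplicity this sketch works directly on pixel values, as in Eq. (6.7). The function names, the use of NumPy's random generator seeded with the key, and the removal of the image mean before correlation are assumptions of this illustration.

```python
import numpy as np

def make_patterns(shape, L, key):
    """L pseudo-random {-1, +1} patterns of the given 2-D shape, seeded by key."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=(L,) + tuple(shape))

def embed(cover, bits, key, lam):
    """Additive CDMA embedding, Eqs. (6.6) and (6.7)."""
    cover = np.asarray(cover, dtype=float)
    P = make_patterns(cover.shape, len(bits), key)
    polar = 1.0 - 2.0 * np.asarray(bits, dtype=float)   # b' = 1 - 2b
    w = np.tensordot(polar, P, axes=1)                  # Eq. (6.6)
    return cover + lam * w                              # Eq. (6.7)

def detect(marked, L, key):
    """Correlation detector, Eqs. (6.8) and (6.9); returns bits in {0, 1}."""
    marked = np.asarray(marked, dtype=float)
    P = make_patterns(marked.shape, L, key)
    # assumption: subtract the mean so the cover term stays close to zero
    C = np.tensordot(P, marked - marked.mean(), axes=([1, 2], [0, 1]))
    polar_hat = np.sign(C)                              # Eq. (6.9)
    return ((1.0 - polar_hat) // 2).astype(int)         # invert b' = 1 - 2b
```

Detection regenerates the same patterns from the key, which is why a geometric attack that misaligns the image with the patterns destroys the correlation.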

Watermarking scheme

The deformable mesh model based watermarking system is shown in Figure 56. To demonstrate the idea, we show some results in Figure 57 using an example, where the original image is shown in (a) and the watermarked image is shown in (b). The attacked image

is shown in (c). In (d) the difference between the original and the attacked image is

shown, which illustrates the distortion caused by bending using StirMark 3.1. In Figure

58(a) the regular mesh is overlaid with the mesh generated from the attacked image.

From these two meshes, the distortion field can be computed and is shown in Figure 59.

The distortion compensated image is shown in Figure 58(c). In Figure 58(d) we show the

difference between the original and the distortion compensated image. We can see that

most of the geometric distortion has been compensated for.


A watermark message of 200 bits is embedded into the mid-range DCT coefficients of the Lena image, shown in Figure 57(a), using the CDMA algorithm detailed in Sect. 3.1. A number of experiments were performed to test the proposed watermarking system. In all the experiments, $\beta = 10{,}000$ was used and the original non-watermarked image was used as a reference for the distortion correction. The bit error rate (BER) is calculated as the number of incorrectly decoded bits divided by the total number of bits in the watermark message. For comparison, the method proposed in [133], which uses a mesh regularity penalty, was also tested, with the value of the penalty term parameter optimized empirically. In addition, unbending by use of an optical flow (OF) [62] approach is considered as well.
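The BER defined above amounts to the following computation; the function name `bit_error_rate` is introduced here for illustration only.

```python
def bit_error_rate(sent, decoded):
    """Fraction of decoded bits that differ from the embedded bits."""
    if len(sent) != len(decoded) or len(sent) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(1 for s, d in zip(sent, decoded) if s != d)
    return errors / len(sent)
```

For a binary message, a BER near 0.5 means that decoding is no better than guessing.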

BER vs. bending strength

In this experiment, the watermark strength was fixed at λ=0.5. The test results are

shown in Figure 60. From these results we can see that as the bending strength increases,

the BER also increases. Our mesh model based correction is very effective for modest

amounts of bending. Furthermore, it is clear that the proposed DF estimation approach is more effective than both the approach used in [133] and the OF approach.

BER vs. watermarking strength

The bending strength was fixed at 5 in this experiment, and the watermarking strength λ was varied from 0.1 to 1.0. The test results are given in Figure 61. With the proposed

correction nearly error-free decoding can be achieved when the watermarking strength λ

is close to 1.0. Again, the proposed approach achieves the best performance.

6.2.5. Discussion.

In this section, a watermarking approach robust to geometric attacks was presented.

This approach is based on a deformable mesh model. Although only random bending

attacks were considered in our experiments, the proposed system can also be applied to

combat other geometric attacks.

Figure 56. Mesh model based watermarking system. [Block diagram; blocks: original image, watermark message, private key, watermark embedding, possible geometric attacks, mesh model based correction, watermark extraction, watermark message.]

Figure 57. Images to demonstrate the watermarking process: (a) original image; (b) watermarked image with PSNR=38.4dB; (c) attacked watermarked image; (d) difference between (b) and (c).

Figure 58. Images to demonstrate the watermarking process (cont.): (a) regular mesh and mesh generated from Figure 57(c); (b) meshes of (a) overlaid on the Lena image; (c) deformation compensated watermarked image; (d) difference between Figure 57(a) and (c).


Figure 59. The estimated distortion field.

Figure 60. BER vs. random bending strength. [Plot: bit error rate (0 to 0.5) versus StirMark bending strength (1 to 7); curves: no unbending, unbending as in [6], proposed, OF unbending.]

Figure 61. BER vs. watermark strength. [Plot: bit error rate (0 to 0.5) versus CDMA signal strength (0.1 to 0.9); curves: no unbending, unbending as in [6], proposed, OF unbending.]

CHAPTER VII

7. CONCLUSION

In this thesis, we propose new methods for image and image sequence processing, with a focus on nuclear medicine imaging modalities. The new methods are based on content-adaptive mesh modeling (CAMM).

In Chapter 2, we considered CAMM for image representation. We addressed a critical issue in mesh modeling: how to determine the best mesh structure for a given image function. Numerical results demonstrate that the new approach can yield a more compact, accurate representation of images, when compared with several other methods, at a very low computational cost.

In Chapters 3 and 4 we propose the use of a CAMM for tomographic image

reconstruction. In Chapter 3, a method is proposed in which the reconstructed image is

modeled by an efficient mesh representation, as described in Chapter 2, and then

reconstructed by estimation of the model parameters (nodal values) from the measured

data. In Chapter 4, we extend CAMM reconstruction to dual-modality reconstruction

where we incorporate an anatomical prior (e.g. obtained from CT or MRI).

The proposed methods for tomographic image reconstruction are tested using

simulated data images. Our results indicate that, among the methods tested, the proposed

approach achieves the best performance in terms of image quality and computation time.

In Chapter 5, we present a new 4D reconstruction and post-processing approach for reducing noise in gated SPECT perfusion images while preserving accurate cardiac motion. The method is based on motion-compensated temporal smoothing using a deformable content-adaptive mesh. The CAMM described in Chapter 2 was used for the initial mesh generation. This mesh is then deformed to track cardiac motion, and smoothing is performed along motion trajectories through the space-time coordinate system. Our results show that the proposed method is very promising.

In addition, in Chapter 5, a new method for decomposing the image sequence into a

set of distinct temporal basis functions is presented. The proposed method, i.e. the similar

component analysis, incorporates a data model in which class labeling, and accordingly

the temporal basis functions estimation, does not depend on the signal amplitude but

rather on distinct directions in the M dimensional space. Promising preliminary results

are shown.

In Chapter 6, we present extensions of the use of CAMM. Specifically, we propose a de-noising procedure which exploits the dimensionality reduction introduced by CAMM. Further, in the same chapter, an extension of deformable mesh modeling is proposed for obtaining a robust digital watermarking procedure. The robustness of the proposed method is reflected in the ability of the algorithm to compensate for random bending attacks.

The practical value of the proposed CAMM based methods for nuclear medicine imaging modalities should be tested in a clinical environment. The diagnostic performance of medical doctors (MD) would be the final evidence of their benefit. This is left for further research.


APPENDIX A

MESH MODEL ERROR BOUND FOR SCALAR 2D FUNCTIONREPRESENTATION


To facilitate the derivation of the results given in Theorem 1, we need first to review

an interesting result from Lagrange polynomial interpolation. For brevity we simply state

this result below without proof. A formal proof can be found in a standard text covering

function interpolation [77].

Lemma 1. Assume that $f(x)$ is a real-valued function defined on the interval $[a, b]$ which has a continuous 2nd derivative on $[a, b]$. Let $\hat{f}(x)$ be the linear function interpolating $f(x)$ at $a$ and $b$, i.e., $\hat{f}(a) = f(a)$ and $\hat{f}(b) = f(b)$. Then for each $x \in [a, b]$

\[ \left| f(x) - \hat{f}(x) \right| \leq \frac{M_0}{8} (b - a)^2, \tag{A-1} \]

where $M_0 \triangleq \max_{x \in [a, b]} | f''(x) |$.

Figure 62. Triangle $T$ with vertices $\mathbf{p}_i$, $i = 0, 1, 2$.

To begin with the proof of Theorem 1, let the vertices of $T$ be denoted by $\mathbf{p}_i$, $i = 0, 1, 2$ (as illustrated in Figure 62). For notational simplicity, let $g(\mathbf{p})$ denote the value of a function $g(x, y)$ at a point $\mathbf{p} = (x, y)$. Then, by assumption

\[ \hat{f}(\mathbf{p}_i) = f(\mathbf{p}_i), \quad i = 0, 1, 2 \tag{A-2} \]

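Let $\mathbf{p}_0\mathbf{p}_1$ denote the side of $T$ formed by $\mathbf{p}_0$ and $\mathbf{p}_1$. Clearly, along $\mathbf{p}_0\mathbf{p}_1$ the function $\hat{f}(x, y)$ coincides with the one-dimensional (1D) linear interpolation of $f(x, y)$ at $\mathbf{p}_0$ and $\mathbf{p}_1$. Thus, for a point $\mathbf{p} \in \mathbf{p}_0\mathbf{p}_1$ we have from Lemma 1

\[ \left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \leq \frac{M_{01}}{8} \left\| \mathbf{p}_0\mathbf{p}_1 \right\|^2, \tag{A-3} \]

where $\| \mathbf{p}_0\mathbf{p}_1 \|$ is the length of $\mathbf{p}_0\mathbf{p}_1$, and $M_{01}$ is the largest magnitude of the second derivative of $f(x, y)$ along $\mathbf{p}_0\mathbf{p}_1$. By assumption, $\| \mathbf{p}_0\mathbf{p}_1 \| \leq h$ and $M_{01} \leq M_2$. Thus

\[ \left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \leq \frac{M_2}{8} h^2. \tag{A-4} \]

Similarly, the same bound holds for any point $\mathbf{p}$ along the other two sides of $T$.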

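Next, consider a point $\mathbf{p}$ inside $T$. Without loss of generality, consider a line through $\mathbf{p}$ and intersecting $\mathbf{p}_0\mathbf{p}_1$ at $\mathbf{p}_1^*$ and $\mathbf{p}_0\mathbf{p}_2$ at $\mathbf{p}_2^*$ (as illustrated in Figure 62). Let $\bar{f}(x, y)$ be the 1-D linear interpolation of $f(x, y)$ along $\mathbf{p}_1^*\mathbf{p}_2^*$ at $\mathbf{p}_1^*$ and $\mathbf{p}_2^*$, i.e.,

\[ \bar{f}(\mathbf{p}) = (1 - \alpha) f(\mathbf{p}_1^*) + \alpha f(\mathbf{p}_2^*) \tag{A-5} \]

where $\alpha = \| \mathbf{p}_1^*\mathbf{p} \| / \| \mathbf{p}_1^*\mathbf{p}_2^* \|$. Since $\mathbf{p} \in \mathbf{p}_1^*\mathbf{p}_2^*$ we have from Lemma 1

\[ \left| f(\mathbf{p}) - \bar{f}(\mathbf{p}) \right| \leq \frac{M_2}{8} \left\| \mathbf{p}_1^*\mathbf{p}_2^* \right\|^2 \leq \frac{M_2}{8} h^2. \tag{A-6} \]

Note that $\hat{f}(x, y)$ is also linear along $\mathbf{p}_1^*\mathbf{p}_2^*$. Thus,

\[ \hat{f}(\mathbf{p}) = (1 - \alpha) \hat{f}(\mathbf{p}_1^*) + \alpha \hat{f}(\mathbf{p}_2^*). \tag{A-7} \]

Furthermore,

\[ \left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \leq \left| f(\mathbf{p}) - \bar{f}(\mathbf{p}) \right| + \left| \bar{f}(\mathbf{p}) - \hat{f}(\mathbf{p}) \right|. \tag{A-8} \]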

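Therefore from (A-5) and (A-7)

\[ \left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \leq \left| f(\mathbf{p}) - \bar{f}(\mathbf{p}) \right| + (1 - \alpha) \left| f(\mathbf{p}_1^*) - \hat{f}(\mathbf{p}_1^*) \right| + \alpha \left| f(\mathbf{p}_2^*) - \hat{f}(\mathbf{p}_2^*) \right|. \tag{A-9} \]

Observe that $\mathbf{p}_1^* \in \mathbf{p}_0\mathbf{p}_1$ and $\mathbf{p}_2^* \in \mathbf{p}_0\mathbf{p}_2$; from (A-4) and (A-6) we obtain

\[ \left| f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right| \leq \frac{M_2}{8} h^2 + (1 - \alpha) \frac{M_2}{8} h^2 + \alpha \frac{M_2}{8} h^2 = \frac{M_2}{4} h^2 \tag{A-10} \]

Theorem 1 is proven.

A derivation of the error bound in a more general form is given in [80].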


APPENDIX B

INTERPOLATION ERROR AND ITS GRADIENT WITHRESPECT TO NODAL POSITION


Here, the goal is to derive the gradient of the mesh interpolation error, to be used in the nodal position optimization procedure. This derivation is a more detailed extension of the one presented in [6, 76, 89].

The basic idea is to establish a method for calculating the gradient of the cost function with respect to nodal position by mapping to the master element, where numerical integration can be done.

Let $f(\mathbf{p})$ denote the image function defined over a domain $D$, with $\mathbf{p} = (x, y)$. In a mesh model, the domain $D$ is partitioned into a number, say $M$, of non-overlapping mesh elements denoted by $D_m$, $m = 1, 2, \ldots, M$. Then the image function is approximated as:

\[ \hat{f}(\mathbf{p}) = \sum_{n=1}^{N} f(\mathbf{p}_n) \, \varphi_n(\mathbf{p}), \tag{B-1} \]

where $\mathbf{p}_n$ is the $n$th mesh node defined by $\mathbf{p}_n = (x_n, y_n)$, $\varphi_n(\mathbf{p})$ is the interpolation basis function associated with $\mathbf{p}_n$, and $N$ is the total number of mesh nodes used. Note that the support of each basis function $\varphi_n(\mathbf{p})$ is limited only to those elements $D_m$ attached to the node $n$. Note that $\bigcup_{m=1}^{M} D_m = D$ and $\bigcap_{m=1}^{M} D_m = \emptyset$.

Within each mesh element (for $\mathbf{p} \in D_m$), the function is represented as:

\[ \hat{f}(\mathbf{p}) = \sum_{k \in K} f(\mathbf{p}_{n(m,k)}) \, \phi_{m,k}(\mathbf{p}), \quad \mathbf{p} \in D_m, \tag{B-2} \]

where $\phi_{m,k}(\mathbf{p})$ is the basis function contributed by node $k$ to element $D_m$, $K$ is the set of nodes that are the vertices of $D_m$, and $n(m, k)$ is a global indexing function which identifies the $n$th node as the $k$th node of the mesh element $D_m$. In our experiments, we chose the mesh elements to be triangular, so the cardinality is $\#K = 3$, and we chose the basis functions to be linear. In what follows we use a shorter notation for $f(\mathbf{p}_{n(m,k)})$, namely $f_{n(m,k)}$.

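Now we can define the function to be minimized:

\[ E_m = \frac{1}{2} \sum_{m=1}^{M} \int_{D_m} \left( f(\mathbf{p}) - \hat{f}(\mathbf{p}) \right)^2 d\mathbf{p}, \tag{B-3} \]

where $\hat{f}(\mathbf{p})$ is the linear interpolation of $f(\mathbf{p})$ at the vertices of the mesh representation.

Or equivalently:

\[ E_m = \frac{1}{2} \sum_{m=1}^{M} \int_{D_m} \Big( f(\mathbf{p}) - \sum_{k \in K} f_{n(m,k)} \, \phi_{m,k}(\mathbf{p}) \Big)^2 d\mathbf{p}. \tag{B-4} \]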

Since the evaluation of this integral in its original form is complicated by the irregular shape of the elements, a mapping to the so-called master element is performed. The main idea is presented in the next figure.

Figure 63. Mapping from an arbitrary element $D_m$ to the master element $\tilde{D}$.

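Now we can rewrite $E_m$ as:

\[ E_m = \frac{1}{2} \sum_{m=1}^{M} \int_{\tilde{D}} \Big( f\big(w_m(\mathbf{u})\big) - \sum_{k \in K} f_{n(m,k)} \, \tilde{\phi}_k(\mathbf{u}) \Big)^2 J_m(\mathbf{u}) \, d\mathbf{u}, \tag{B-5} \]

where $\mathbf{p} = [x_m, y_m] = w_m(\mathbf{u}) = \sum_{k=1}^{3} \tilde{\phi}_k(\mathbf{u}) \, \mathbf{p}_{n(m,k)}$ represents the mapping between the master element $\tilde{D}$ and $D_m$. Here $\mathbf{u} = (s, t)$ denotes the master element coordinates and $\tilde{\phi}_1(\mathbf{u}) = s$, $\tilde{\phi}_2(\mathbf{u}) = t$, $\tilde{\phi}_3(\mathbf{u}) = 1 - s - t$ are the interpolation functions over the master element.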

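Finally, $J_m(\mathbf{u})$, the Jacobian of the transform, is defined as:

\[ J_m(\mathbf{u}) = \det \begin{bmatrix} \dfrac{\partial x_m}{\partial s} & \dfrac{\partial y_m}{\partial s} \\[2mm] \dfrac{\partial x_m}{\partial t} & \dfrac{\partial y_m}{\partial t} \end{bmatrix}, \tag{B-6} \]

or equivalently:

\[ J_m(\mathbf{u}) = \Big( \sum_{k \in K} x_{n(m,k)} \frac{\partial \tilde{\phi}_k(s,t)}{\partial s} \Big) \Big( \sum_{k \in K} y_{n(m,k)} \frac{\partial \tilde{\phi}_k(s,t)}{\partial t} \Big) - \Big( \sum_{k \in K} x_{n(m,k)} \frac{\partial \tilde{\phi}_k(s,t)}{\partial t} \Big) \Big( \sum_{k \in K} y_{n(m,k)} \frac{\partial \tilde{\phi}_k(s,t)}{\partial s} \Big) \tag{B-7} \]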

Note that the nodes of every element should be ordered in the same direction, e.g. counterclockwise. In that case the sign of $J_m(\mathbf{u})$ is always positive. A change of the sign of $J_m(\mathbf{u})$ can be an indication of mesh structure irregularity.
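For a triangle with linear interpolation functions $s$, $t$ and $1 - s - t$, the Jacobian is constant over the element and can be computed directly from the vertex coordinates, which gives a simple orientation test. The function name `jacobian` is hypothetical; the formula follows from substituting the derivatives of the interpolation functions into the expression for $J_m$.

```python
def jacobian(p1, p2, p3):
    """Jacobian J_m of the map from the master element to the triangle
    (p1, p2, p3), using the basis functions s, t and 1 - s - t.

    With these basis functions, dx/ds = x1 - x3, dx/dt = x2 - x3, and
    likewise for y, so J_m is constant and equals twice the signed area.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return (x1 - x3) * (y2 - y3) - (x2 - x3) * (y1 - y3)

# counterclockwise ordering gives a positive value
j_ccw = jacobian((1.0, 0.0), (0.0, 1.0), (0.0, 0.0))
# swapping two vertices reverses the orientation and flips the sign
j_cw = jacobian((0.0, 1.0), (1.0, 0.0), (0.0, 0.0))
```

During nodal position optimization, a sign change of this value for any element signals that a node has crossed the opposite edge and the element has folded over.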

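Now we can derive the derivative of the interpolation error as:

\[ \frac{\partial E_m}{\partial \mathbf{p}_n} = \frac{1}{2} \sum_{m=1}^{M} \frac{\partial}{\partial \mathbf{p}_n} \int_{\tilde{D}} e_m^2(\mathbf{u}) \, J_m(\mathbf{u}) \, d\mathbf{u}, \tag{B-8} \]

where $e_m(\mathbf{u}) = f\big(w_m(\mathbf{u})\big) - \sum_{k \in K} f_{n(m,k)} \, \tilde{\phi}_k(\mathbf{u})$.

This can be further expanded as:

\[ \frac{\partial E_m}{\partial \mathbf{p}_n} = \sum_{m \in M_n} \int_{\tilde{D}} \left[ J_m(\mathbf{u}) \, e_m(\mathbf{u}) \, \frac{\partial e_m(\mathbf{u})}{\partial \mathbf{p}_n} + \frac{1}{2} e_m^2(\mathbf{u}) \, \frac{\partial J_m(\mathbf{u})}{\partial \mathbf{p}_n} \right] d\mathbf{u}, \tag{B-9} \]

where $M_n$ represents the set of elements, $D_m$, attached to the $n$th node.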

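Let us examine each term separately:

\[ \frac{\partial e_m(\mathbf{u})}{\partial \mathbf{p}_n} = \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \frac{\partial w_m(\mathbf{u})}{\partial \mathbf{p}_n} - \tilde{\phi}_{k(n,m)}(\mathbf{u}) \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \frac{\partial \mathbf{p}}{\partial \mathbf{p}_n}, \tag{B-10} \]

\[ \frac{\partial w_m(\mathbf{u})}{\partial \mathbf{p}_n} = \tilde{\phi}_{k(n,m)}(\mathbf{u}), \tag{B-11} \]

\[ \frac{\partial \mathbf{p}}{\partial \mathbf{p}_n} = \delta(\mathbf{p} - \mathbf{p}_n), \tag{B-12} \]

where $k(n, m) \in \{1, 2, 3\}$ identifies the local index in the element $\tilde{D}$ of the $n$th node.

Note that $n(m, k(n_0, m)) = n_0$ if the element $D_m$ contains the node $n_0$.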

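Now for $\partial J_m(\mathbf{u}) / \partial \mathbf{p}_n$ one can derive:

\[ \frac{\partial J_m(\mathbf{u})}{\partial x_{n(m,k)}} = \sum_{l \in K} y_{n(m,l)} \left( \frac{\partial \tilde{\phi}_k(s,t)}{\partial s} \frac{\partial \tilde{\phi}_l(s,t)}{\partial t} - \frac{\partial \tilde{\phi}_k(s,t)}{\partial t} \frac{\partial \tilde{\phi}_l(s,t)}{\partial s} \right) \tag{B-13} \]

\[ \frac{\partial J_m(\mathbf{u})}{\partial y_{n(m,k)}} = - \sum_{l \in K} x_{n(m,l)} \left( \frac{\partial \tilde{\phi}_k(s,t)}{\partial s} \frac{\partial \tilde{\phi}_l(s,t)}{\partial t} - \frac{\partial \tilde{\phi}_k(s,t)}{\partial t} \frac{\partial \tilde{\phi}_l(s,t)}{\partial s} \right), \tag{B-14} \]

the values of which can be found in the next table:

Values of $\dfrac{\partial \tilde{\phi}_k}{\partial s} \dfrac{\partial \tilde{\phi}_l}{\partial t} - \dfrac{\partial \tilde{\phi}_k}{\partial t} \dfrac{\partial \tilde{\phi}_l}{\partial s}$:

        k=1   k=2   k=3
l=1      0    -1     1
l=2      1     0    -1
l=3     -1     1     0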
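The entries of the table above follow directly from the derivatives of the master element interpolation functions $s$, $t$ and $1 - s - t$, and can be regenerated with a short computation. The names below are introduced only for this check.

```python
import numpy as np

# derivatives of the interpolation functions (s, t, 1 - s - t)
d_ds = np.array([1.0, 0.0, -1.0])   # with respect to s, for k = 1, 2, 3
d_dt = np.array([0.0, 1.0, -1.0])   # with respect to t, for k = 1, 2, 3

def jacobian_coefficients():
    """Entry [l-1, k-1] holds dphi_k/ds * dphi_l/dt - dphi_k/dt * dphi_l/ds."""
    return np.outer(d_dt, d_ds) - np.outer(d_ds, d_dt)

coeffs = jacobian_coefficients()
```

The matrix is antisymmetric, which reflects the fact that exchanging two vertices of an element changes the sign of the Jacobian.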

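Finally we have:

\[ \frac{\partial E_m}{\partial \mathbf{p}_n} = \sum_{m \in M_n} \int_{\tilde{D}} \left[ J_m(\mathbf{u}) \, e_m(\mathbf{u}) \, \tilde{\phi}_{k(n,m)}(\mathbf{u}) \left( - \left. \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \right|_{\mathbf{p}_n} + \left. \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \right|_{w_m(\mathbf{u})} \right) + \frac{1}{2} e_m^2(\mathbf{u}) \, \frac{\partial J_m(\mathbf{u})}{\partial \mathbf{p}_n} \right] d\mathbf{u} \tag{B-15} \]

where $\left. \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \right|_{\mathbf{p}_n}$ denotes the value of $\frac{\partial f(\mathbf{p})}{\partial \mathbf{p}}$ evaluated at location $\mathbf{p}_n$, and $\left. \frac{\partial f(\mathbf{p})}{\partial \mathbf{p}} \right|_{w_m(\mathbf{u})}$ denotes the set of values evaluated at the set of points $\mathbf{u}$ mapped by $w_m(\mathbf{u})$.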

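In order to maintain the mesh regularity, a deformation energy can be added in the form of:

\[ E_d = \frac{1}{2} \sum_{n=1}^{N} \left\| \mathbf{t}_n \right\|^2, \tag{B-16} \]

where $\mathbf{t}_n = \sum_{l \in \Im} \mathbf{g}_{nl}$ and $\mathbf{g}_{nl} = \mathbf{p}_n - \mathbf{p}_l$,

with gradient defined as:

\[ \frac{\partial E_d}{\partial \mathbf{p}_n} = \mathbf{t}_n - \sum_{l \in \Im} \mathbf{t}_l, \tag{B-17} \]

where $\Im$ represents the set of nodes neighboring the $n$th node.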


APPENDIX C

MESH MODEL ERROR BOUND FORSCALAR 3D FUNCTION REPRESENTATION


To begin the proof of Theorem 2, let the vertices of $T$ be denoted by $\mathbf{p}_i$, $i = 0,1,2,3$ (as illustrated in Figure 64). For notational simplicity, let $g(\mathbf{p})$ denote the value of a function $g(x,y,z)$ at a point $\mathbf{p} = (x,y,z)$. Then, by assumption,

$$\hat f(\mathbf{p}_i) = f(\mathbf{p}_i), \quad i = 0,1,2,3. \tag{C-1}$$

Figure 64. Tetrahedron $T$ with vertices $\mathbf{p}_i$, $i = 0,1,2,3$.

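Now, let $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ denote the triangular surface of $T$ formed by $\mathbf{p}_0$, $\mathbf{p}_1$ and $\mathbf{p}_2$. Clearly, along $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ the function $\hat f(x,y,z)$ coincides with the two-dimensional (2D) linear interpolation of $f(x,y,z)$ over $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ from its vertices $\mathbf{p}_0$, $\mathbf{p}_1$ and $\mathbf{p}_2$. Thus, for a point $\mathbf{p} \in \mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ we have, from Theorem 1,

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right| \le \frac{M_{012}}{4} h_{012}^2, \tag{C-2}$$

where $h_{012}$ is the length of the longest side of $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$, and $M_{012}$ is the largest magnitude of the second derivative of $f(x,y,z)$ along any direction within the plane defined by $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$. By assumption, $h_{012} \le h$ and $M_{012} \le M_2$. Thus

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right| \le \frac{M_2}{4} h^2. \tag{C-3}$$

Similarly, (C-3) also holds for any point $\mathbf{p}$ on the other three triangular faces of $T$.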

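Next, consider a point $\mathbf{p}$ inside $T$. Without loss of generality, consider a line through both $\mathbf{p}_3$ and $\mathbf{p}$, intersecting the triangular surface $\mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ at $\mathbf{p}^*$ (as illustrated in Figure 64). Let $\bar f(x,y,z)$ be the 1-D linear interpolation of $f(x,y,z)$ along $\mathbf{p}_3\mathbf{p}^*$ from its values at $\mathbf{p}_3$ and $\mathbf{p}^*$. That is,

$$\bar f(\mathbf{p}) = (1 - \alpha) f(\mathbf{p}_3) + \alpha f(\mathbf{p}^*), \tag{C-4}$$

where $\alpha = \|\mathbf{p}_3\mathbf{p}\| / \|\mathbf{p}_3\mathbf{p}^*\|$. Since $\mathbf{p} \in \mathbf{p}_3\mathbf{p}^*$ we have from Lemma 1 (Appendix A)

$$\left| f(\mathbf{p}) - \bar f(\mathbf{p}) \right| \le \frac{M_2}{8} \|\mathbf{p}_3\mathbf{p}^*\|^2 \le \frac{M_2}{8} h^2. \tag{C-5}$$

Note that $\hat f(x,y,z)$ is also linear along $\mathbf{p}_3\mathbf{p}^*$. Thus,

$$\hat f(\mathbf{p}) = (1 - \alpha) \hat f(\mathbf{p}_3) + \alpha \hat f(\mathbf{p}^*) = (1 - \alpha) f(\mathbf{p}_3) + \alpha \hat f(\mathbf{p}^*). \tag{C-6}$$

Furthermore,

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right| \le \left| f(\mathbf{p}) - \bar f(\mathbf{p}) \right| + \left| \bar f(\mathbf{p}) - \hat f(\mathbf{p}) \right|. \tag{C-7}$$

Therefore from (C-4) and (C-6)

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right| \le \left| f(\mathbf{p}) - \bar f(\mathbf{p}) \right| + \alpha \left| f(\mathbf{p}^*) - \hat f(\mathbf{p}^*) \right|. \tag{C-8}$$

Observe that $\mathbf{p}^* \in \mathbf{p}_0\mathbf{p}_1\mathbf{p}_2$ and $\mathbf{p} \in \mathbf{p}_3\mathbf{p}^*$, so that, since $\alpha \le 1$, from (C-3) and (C-5) we obtain

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right| \le \frac{M_2}{8} h^2 + \alpha \frac{M_2}{4} h^2 \le \frac{3 M_2}{8} h^2. \tag{C-9}$$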

From (C-9) we can obtain:

$$\left| f(\mathbf{p}) - \hat f(\mathbf{p}) \right|^{1.5} \le \left( \frac{3 M_2}{8} \right)^{1.5} h^3. \tag{C-10}$$

Now the error bound has a volumetric term $h^3$, which is inversely proportional to the density of mesh nodes.

Theorem 2 is proven.
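As an informal sanity check of the bound $\frac{3M_2}{8}h^2$, the sketch below samples random points inside one tetrahedron and compares the linear interpolation error with the bound. It assumes that $h$ is the longest edge of $T$ and that $M_2$ bounds the magnitude of the second directional derivative; the test function $x^2+y^2+z^2$ (for which $M_2 = 2$), the tetrahedron and all function names are illustrative choices, not taken from the thesis.

```python
import random


def f(p):
    """Test function x^2 + y^2 + z^2; its second directional derivative is 2."""
    return sum(c * c for c in p)


def dist(a, b):
    """Euclidean distance between two 3D points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def max_interp_error(vertices, samples=20000, seed=0):
    """Largest sampled |f - f_hat| for linear interpolation from the vertices."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # Random barycentric weights: non-negative and summing to one.
        w = [rng.expovariate(1.0) for _ in vertices]
        total = sum(w)
        w = [wi / total for wi in w]
        p = [sum(wi * v[d] for wi, v in zip(w, vertices)) for d in range(3)]
        f_hat = sum(wi * f(v) for wi, v in zip(w, vertices))
        worst = max(worst, abs(f(p) - f_hat))
    return worst


T = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
h = max(dist(a, b) for a in T for b in T)   # longest edge of T
M2 = 2.0                                    # bound on the second derivative of f
bound = 3.0 * M2 / 8.0 * h ** 2
worst = max_interp_error(T)
```

A sampled maximum below `bound` does not prove the theorem; it only illustrates that the bound is not violated for this particular function and element.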


APPENDIX D

MODIFICATION ON ERROR DIFFUSION KERNEL


Here we argue for a new error diffusion kernel that “pushes” more error in the direction of the larger second derivative of the function to be represented by mesh modeling. The use of this error diffusion kernel is shown to be critical in 3D mesh modeling. This knowledge was developed after the 2D mesh modeling evaluation was done; thus it is not included in 2D mesh modeling.

First we will show that the ideal sampling grid for a 1D quadratic function is uniform, and then we will argue that for a 2D separable quadratic function a near-ideal sampling should have different densities in each axial direction.

Let us consider the $L_2$ norm of the interpolation error of a 1D function on the interval $[x_1, x_N]$ containing $N$ samples:

$$e = \int_{x_1}^{x_N} \left( f(x) - \hat f(x) \right)^2 dx. \tag{D-1}$$

By assuming linear interpolation for $\hat f(x)$ we have:

$$e = \sum_{i=1}^{N-1} \int_{x_i}^{x_{i+1}} \left( f(x) - \left( f(x_i) \frac{x - x_{i+1}}{x_i - x_{i+1}} + f(x_{i+1}) \frac{x - x_i}{x_{i+1} - x_i} \right) \right)^2 dx. \tag{D-2}$$

Now, assuming a quadratic form $f(x) = Ax^2 + Bx + C$, with simple algebraic calculation one can obtain:

$$e = \frac{A^2}{30} \sum_{i=1}^{N-1} \left( x_{i+1} - x_i \right)^5. \tag{D-3}$$

Note that the error depends on $A$, but not on $B$ and $C$.


By finding the optimal sample distance, $d_i = x_{i+1} - x_i$, with the constraint $x_N - x_1 = \sum_{i=1}^{N-1} d_i$, one can derive that the optimal solution for a quadratic function is an equal-distance sampling strategy.
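The two claims above (the error of a quadratic depends only on $A$ and the sample spacings, and equal spacing minimizes it) can be illustrated numerically. This is a minimal sketch, assuming the closed form $\frac{A^2}{30}\sum_i d_i^5$ with $d_i = x_{i+1} - x_i$; the function names and the sample grids are hypothetical.

```python
def interp_error_closed_form(A, xs):
    """Squared L2 interpolation error of A*x^2 + B*x + C: (A^2/30) * sum(d_i^5)."""
    return A * A / 30.0 * sum((b - a) ** 5 for a, b in zip(xs, xs[1:]))


def interp_error_numeric(A, B, C, xs, steps=2000):
    """Same error computed by midpoint-rule integration on every interval."""
    def f(x):
        return A * x * x + B * x + C

    total = 0.0
    for a, b in zip(xs, xs[1:]):
        dx = (b - a) / steps
        for j in range(steps):
            x = a + (j + 0.5) * dx
            # Piecewise-linear interpolant between the samples at a and b.
            lin = f(a) * (b - x) / (b - a) + f(b) * (x - a) / (b - a)
            total += (f(x) - lin) ** 2 * dx
    return total


uniform = [0.0, 0.25, 0.5, 0.75, 1.0]
nonuniform = [0.0, 0.1, 0.5, 0.9, 1.0]   # same end points, unequal spacing

e_uniform = interp_error_closed_form(3.0, uniform)
e_nonuniform = interp_error_closed_form(3.0, nonuniform)
```

With the end points fixed, `e_uniform` is smaller than `e_nonuniform`, in line with the equal-distance result, and changing `B` or `C` in `interp_error_numeric` leaves the error unchanged.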

In addition, from Eq. (A-10) one can observe that the error is proportional to the second derivative of the function to be represented, $\frac{\partial^2 f(x,y)}{\partial x^2}$.

Now, if we assume a 2D function of the form $f(x,y) = Ax^2 + By^2 + Cx + Dy + F$, in order to achieve the minimum $L_2$ norm of the interpolation error, the samples in the $x$ direction should be placed proportionally with $\frac{\partial^2 f(x,y)}{\partial x^2}$, and in the $y$ direction proportionally with $\frac{\partial^2 f(x,y)}{\partial y^2}$.

Note that the 2D mesh modeling proposed in Sect. 2.1 will achieve uniform sampling with the same spatial density in each axial direction for a quadratic function.

Therefore we can modify the 2D error diffusion kernel by substituting $w_1, w_2, w_3, w_4$ with $\frac{\partial^2 f(x,y)}{\partial x^2}$, $\frac{\partial^2 f(x,y)}{\partial xy}$, $\frac{\partial^2 f(x,y)}{\partial y^2}$, and $\frac{\partial^2 f(x,y)}{\partial yx}$, respectively.

Basically, this new “error diffusion” kernel pushes the error in the direction of the edge. The use of this modified kernel is critical in 3D mesh modeling, where the dispersion of the edge caused by the old kernel is more pronounced than in 2D.

In the 3D case (Figure 15), we used a modified kernel $w_i = G_i h_i$, where:

$$\mathbf{G} = \left[ f''_{0\to1}, f''_{0\to2}, f''_{0\to3}, f''_{0\to4}, f''_{0\to5}, f''_{0\to6}, f''_{0\to7}, f''_{0\to8}, f''_{0\to9}, f''_{0\to10}, f''_{0\to11}, f''_{0\to12}, f''_{0\to13} \right], \tag{D-5}$$

where $f''_{0\to1}$ denotes the second directional derivative of $f$ in the direction of the vector connecting the current voxel with the voxel denoted by 1 in Figure 15, and

( )( )( ) ( )

( )( )

2 2 2 2 2 21 1 2 2 2 1 2 1 1 1 2 1 2, , , , , , , , , , , ,4 16 8 16 8 4 8 16 8 162 2 8 2 1 4 2 1 8 2 1

+ ++ = + + + +

h . (D-6)

The choice of weightings $h_i$ was made to achieve uniform error diffusion in each axis direction.


APPENDIX E

NUMERICAL EVALUATION OVER MESH ELEMENTS


In finite element methods a master element is often used to circumvent the difficulty

associated with computation over mesh elements with different shapes. A master element

typically has a simple geometric shape, over which function interpolation can be easily

computed. The computation over an arbitrary element is carried out by first mapping it

into the master element, over which the computation is greatly simplified.

To demonstrate the idea, let us consider the right triangle with vertices at $\mathbf{u}_1 = (1,0)$, $\mathbf{u}_2 = (0,1)$, and $\mathbf{u}_3 = (0,0)$. For convenience, denote this element by $\tilde D$. At a point $\mathbf{u} = (s,t) \in \tilde D$, a function can be linearly interpolated from the three vertices of $\tilde D$ as follows:

$$f(\mathbf{u}) = \sum_{k=1}^{3} f(\mathbf{u}_k)\, \varphi_k(\mathbf{u}), \tag{E-1}$$

where $\varphi_1(\mathbf{u}) = s$, $\varphi_2(\mathbf{u}) = t$, and $\varphi_3(\mathbf{u}) = 1 - s - t$.

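Now consider an arbitrary triangular element $D_m$, the vertices of which are denoted by $\mathbf{p}_i : (x_i, y_i)$, $i = 1,2,3$. The element $D_m$ can be mapped onto $\tilde D$ as follows:

$$\begin{bmatrix} s \\ t \end{bmatrix} = \frac{1}{J_m} \left( \begin{bmatrix} x_2 y_3 - x_3 y_2 \\ x_3 y_1 - x_1 y_3 \end{bmatrix} + \begin{bmatrix} y_2 - y_3 & x_3 - x_2 \\ y_3 - y_1 & x_1 - x_3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \right), \tag{E-2}$$

where $\mathbf{p} = (x,y) \in D_m$, $\mathbf{u} = (s,t) \in \tilde D$, and $J_m = \det \begin{bmatrix} x_1 - x_3 & y_1 - y_3 \\ x_2 - x_3 & y_2 - y_3 \end{bmatrix}$.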

Thus, the interpolation at a point $\mathbf{p} \in D_m$ over element $D_m$ can be conveniently computed by first mapping $\mathbf{p}$ into a point $\mathbf{u} \in \tilde D$ through (E-2), followed by interpolation using (E-1).
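The two-step procedure (map to the master element, then interpolate) can be sketched as follows. This is a minimal illustration of (E-1) and (E-2) for a single triangle; the function names `jacobian`, `to_master` and `interpolate` are hypothetical.

```python
def jacobian(p1, p2, p3):
    """J_m = det [[x1 - x3, y1 - y3], [x2 - x3, y2 - y3]]."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return (x1 - x3) * (y2 - y3) - (y1 - y3) * (x2 - x3)


def to_master(p, p1, p2, p3):
    """Map p = (x, y) in D_m to u = (s, t) in the master element, as in (E-2)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    x, y = p
    J = jacobian(p1, p2, p3)
    s = ((x2 * y3 - x3 * y2) + (y2 - y3) * x + (x3 - x2) * y) / J
    t = ((x3 * y1 - x1 * y3) + (y3 - y1) * x + (x1 - x3) * y) / J
    return s, t


def interpolate(p, p1, p2, p3, f1, f2, f3):
    """Linear interpolation (E-1) of the nodal values f1, f2, f3 at point p."""
    s, t = to_master(p, p1, p2, p3)
    return f1 * s + f2 * t + f3 * (1.0 - s - t)
```

Under this mapping the vertices $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ land on $(1,0)$, $(0,1)$ and $(0,0)$, and any function that is linear in $(x,y)$ is reproduced exactly.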

Next, consider the problem of integration of a function $r(x,y)$ over $D_m$, i.e.,

$$a = \int_{D_m} r(x,y)\, dx\, dy. \tag{E-3}$$

By change of variables through the mapping in (E-2), this integration can be computed over the master element $\tilde D$ instead as

$$a = J_m \int_{\tilde D} r'(s,t)\, ds\, dt, \tag{E-4}$$

where $r'(s,t)$ denotes the function $r(x,y)$ in the mapped coordinates.
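A short numerical sketch of (E-4) follows. It assumes a three-point edge-midpoint quadrature rule on the master triangle (exact for polynomials up to degree two), which is an illustrative choice and not necessarily the integration scheme used in the thesis. It also uses $|J_m|$ so that the result does not depend on vertex ordering; this coincides with $J_m$ when the vertices are ordered counter-clockwise. The function name is hypothetical.

```python
def integrate_over_element(r, p1, p2, p3):
    """Approximate the integral of r(x, y) over triangle (p1, p2, p3) via (E-4)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    J = (x1 - x3) * (y2 - y3) - (y1 - y3) * (x2 - x3)

    def r_mapped(s, t):
        # Inverse of (E-2): p = s*p1 + t*p2 + (1 - s - t)*p3.
        x = s * x1 + t * x2 + (1.0 - s - t) * x3
        y = s * y1 + t * y2 + (1.0 - s - t) * y3
        return r(x, y)

    # Edge midpoints of the master triangle, each with weight (area 1/2) / 3.
    midpoints = [(0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
    return abs(J) * sum(r_mapped(s, t) for s, t in midpoints) / 6.0
```

For example, integrating the constant function 1 returns the area of the element, $|J_m|/2$.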

Recall the elements $a_{i,n}$ of the imaging matrix $A$ in (32). They can be computed by using the master element $\tilde D$ as in (E-4). More importantly, when the analytical form of the response function $h_i(x)$ is available, we can pre-calculate the elements $a_{i,n}$ in a closed analytical form. This, of course, can greatly reduce the overhead associated with computing the matrix $A$.


APPENDIX F

MATCHING ERROR GRADIENT CALCULATION


Assuming a mesh model representation as described in Sect. 5.2, the deformation vector field (DVF) can be obtained for any point $\mathbf{p}$ within the mesh element $D_m$ as follows:

$$\mathbf{d}(\mathbf{p}) = \sum_{k \in K} \phi_{m,k}(\mathbf{p})\, \mathbf{d}_{n(m,k)}, \quad \mathbf{p} \in D_m, \tag{F-1}$$

where $\phi_{m,k}(\mathbf{p})$ is the basis function contributed by node $n(m,k)$ to element $D_m$, and $K$ is the set of nodes that are the vertices of $D_m$. In our experiments, we chose the mesh elements to be triangular, therefore $\# K = 3$, and we chose the basis functions to be linear.
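For triangular elements with linear basis functions, (F-1) reduces to barycentric interpolation of the three nodal displacement vectors. A minimal sketch, with hypothetical function names and made-up nodal values, is:

```python
def linear_basis(p, p1, p2, p3):
    """Values of the three linear basis functions of triangle (p1, p2, p3) at p."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    x, y = p
    det = (x1 - x3) * (y2 - y3) - (y1 - y3) * (x2 - x3)
    b1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    b2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return b1, b2, 1.0 - b1 - b2


def displacement(p, vertices, nodal_displacements):
    """DVF d(p) inside one element: sum_k phi_k(p) * d_k, in the form of (F-1)."""
    basis = linear_basis(p, *vertices)
    dx = sum(b * d[0] for b, d in zip(basis, nodal_displacements))
    dy = sum(b * d[1] for b, d in zip(basis, nodal_displacements))
    return dx, dy


# Illustrative element and nodal displacements (not data from the thesis).
vertices = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
nodal_d = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]
d_centroid = displacement((4.0 / 3.0, 1.0), vertices, nodal_d)
```

At a vertex the interpolated displacement equals the nodal displacement, and at the centroid it equals the average of the three nodal displacements.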

Deformation of the mesh from the current frame to the target frame is achieved by minimizing a matching criterion of the following form to find the target-frame nodal displacements $\mathbf{d}_n$:

$$E = \frac{1}{2} \sum_{m=1}^{M} \int_{D_m} \left\{ f_{\mathrm{targ}}(\mathbf{p} + \mathbf{d}(\mathbf{p})) - f_{\mathrm{ref}}(\mathbf{p}) \right\}^2 d\mathbf{p}, \tag{F-2}$$

in which the mesh representation $\mathbf{d}(\mathbf{p})$ of the DVF is defined as in (F-1), and $f_{\mathrm{ref}}(\mathbf{p})$ and $f_{\mathrm{targ}}(\mathbf{p})$ represent the reference and target frames, respectively. The matching error is small when the target-frame nodal positions provide a good match between the target frame and the motion-compensated reference frame.

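Now let us define the mappings from the master element to the reference and target frames:

$$w_{\mathrm{ref},m}(\mathbf{u}) = \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{p}_{\mathrm{ref},n(m,k)}, \tag{F-3}$$

$$w_{\mathrm{targ},m}(\mathbf{u}) = \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{p}_{\mathrm{targ},n(m,k)} = \sum_{k \in K} \tilde\phi_k(\mathbf{u}) \left( \mathbf{p}_{\mathrm{ref},n(m,k)} + \mathbf{d}_{n(m,k)} \right) \tag{F-4}$$

$$= w_{\mathrm{ref},m}(\mathbf{u}) + \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{d}_{n(m,k)} = \mathbf{p}_{\mathrm{targ}}. \tag{F-5}$$

This can be seen in Figure 65.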

Figure 65. Master element mapping

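Now the matching error can be rewritten as:

$$E = \frac{1}{2} \sum_{m=1}^{M} \int_{\tilde D} e_m^2(\mathbf{u})\, J_{\mathrm{targ},m}(\mathbf{u})\, d\mathbf{u}, \tag{F-6}$$

where:

$$e_m(\mathbf{u}) = f_{\mathrm{targ}}\!\left( w_{\mathrm{ref},m}(\mathbf{u}) + \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{d}_{n(m,k)} \right) - f_{\mathrm{ref}}\!\left( w_{\mathrm{ref},m}(\mathbf{u}) \right). \tag{F-7}$$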

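The matching criterion gradient can be found as:

$$\frac{\partial E}{\partial \mathbf{p}_{\mathrm{targ},n}} = \frac{\partial E}{\partial \mathbf{d}_n} = \sum_{m \in M_n} \int_{\tilde D} \left[ J_{\mathrm{targ},m}(\mathbf{u})\, e_m(\mathbf{u}) \frac{\partial e_m(\mathbf{u})}{\partial \mathbf{d}_n} + \frac{1}{2} e_m^2(\mathbf{u}) \frac{\partial J_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{d}_n} \right] d\mathbf{u}. \tag{F-8}$$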

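Examining each term separately:

$$\frac{\partial e_m(\mathbf{u})}{\partial \mathbf{d}_n} = \frac{\partial f_{\mathrm{targ}}(\mathbf{p})}{\partial \mathbf{p}} \frac{\partial w_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{d}_n}, \tag{F-9}$$

$$\frac{\partial w_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{d}_n} = \frac{\partial}{\partial \mathbf{d}_n} \left( w_{\mathrm{ref},m}(\mathbf{u}) + \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{d}_{n(m,k)} \right) = \tilde\phi_{k(n,m)}(\mathbf{u}), \tag{F-10}$$

$$\frac{\partial \mathbf{p}}{\partial \mathbf{p}_n} = \delta(\mathbf{p} - \mathbf{p}_n). \tag{F-11}$$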

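Finally we have:

$$\frac{\partial E}{\partial \mathbf{p}_{\mathrm{targ},n}} = \sum_{m \in M_n} \int_{\tilde D} \left[ J_{\mathrm{targ},m}(\mathbf{u})\, e_m(\mathbf{u}) \left.\frac{\partial f_{\mathrm{targ}}(\mathbf{p})}{\partial \mathbf{p}}\right|_{w_{\mathrm{targ},m}(\mathbf{u})} \tilde\phi_{k(m,n)}(\mathbf{u}) + \frac{1}{2} e_m^2(\mathbf{u}) \frac{\partial J_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{p}_{\mathrm{targ},n}} \right] d\mathbf{u}, \tag{F-12}$$

where $\left.\frac{\partial f_{\mathrm{targ}}(\mathbf{p})}{\partial \mathbf{p}}\right|_{w_{\mathrm{targ},m}(\mathbf{u})}$ denotes the set of values evaluated at the set of points $\mathbf{u}$ mapped by $w_{\mathrm{targ},m}(\mathbf{u})$, and $\frac{\partial J_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{p}_{\mathrm{targ},n}}$ is the same as derived in Appendix B.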


APPENDIX G

MODIFIED MATCHING CRITERIA GRADIENT CALCULATION


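Adopting the same notation as in Appendix F, one can define the matching criterion gradient as:

$$\frac{\partial E}{\partial \mathbf{p}_{\mathrm{targ},n}} = \frac{\partial E}{\partial \mathbf{d}_n} = \frac{\partial}{\partial \mathbf{d}_n} \left[ \frac{1}{2} \sum_{m=1}^{M} \int_{\tilde D} e_m^2(\mathbf{u})\, J_{\mathrm{targ},m}(\mathbf{u})\, d\mathbf{u} \right]. \tag{G-1}$$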

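The modification is in the error term, which is defined as:

$$e_m(\mathbf{u}) = f_{\mathrm{targ}}\!\left( w_{\mathrm{ref},m}(\mathbf{u}) + \sum_{k \in K} \tilde\phi_k(\mathbf{u})\, \mathbf{d}_{n(m,k)} \right) \frac{J_{\mathrm{targ},m}(\mathbf{u})}{J_{\mathrm{ref},m}(\mathbf{u})} - f_{\mathrm{ref}}\!\left( w_{\mathrm{ref},m}(\mathbf{u}) \right). \tag{G-2}$$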

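Now, using the results from Appendix F, it can be easily shown that the gradient can be written as:

$$\frac{\partial E}{\partial \mathbf{p}_{\mathrm{targ},n}} = \sum_{m \in M_n} \int_{\tilde D} \left[ J_{\mathrm{targ},m}(\mathbf{u})\, e_m(\mathbf{u}) \left( \left.\frac{\partial f_{\mathrm{targ}}(\mathbf{p})}{\partial \mathbf{p}}\right|_{w_{\mathrm{targ},m}(\mathbf{u})} \tilde\phi_{k(m,n)}(\mathbf{u}) \frac{J_{\mathrm{targ},m}(\mathbf{u})}{J_{\mathrm{ref},m}(\mathbf{u})} + \frac{1}{J_{\mathrm{ref},m}(\mathbf{u})} \left. f_{\mathrm{targ}}(\mathbf{p}) \right|_{w_{\mathrm{targ},m}(\mathbf{u})} \frac{\partial J_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{p}_{\mathrm{targ},n}} \right) + \frac{1}{2} e_m^2(\mathbf{u}) \frac{\partial J_{\mathrm{targ},m}(\mathbf{u})}{\partial \mathbf{p}_{\mathrm{targ},n}} \right] d\mathbf{u}. \tag{G-3}$$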


APPENDIX H

ML PARAMETER ESTIMATION FOR SIMILARITY CLUSTERING


The log-likelihood of the complete data is given by:

L p X P X kin n

in

i

n

N

β θ β θ θ, ; log | ; log ;X − − −

=

= + =∑1 1 1

1c h c hd i c hd i (H-1)

Given that, after observing the data $\mathbf{Y}_n$, $X_n$ is the only random variable, we obtain:

Q E Li i iθ θ β θ; , ;− −=1 1c h c hX (H-2)

= = − − −FHG

IKJ + =

LNMM

OQPP

− −

==∑∑ p X k A P X kn n

in n

n X

nn

i

n

N

k

Kn| ; log log ;β θ γ θ1 1

11

2 1c h b g c hd iY eY

T

(H-3)

Computing the conditional expectation of the log-likelihood function given the model parameters is the E-step of the EM optimization.

Now we maximize the expected log-likelihood function (M-step) with respect to the model parameters $\theta = \{E, P, A\}$.

- Maximization of $Q(\theta^i; \theta^{i-1})$ with respect to $P$

∂∂

=∂∂

= − −FHG

IKJ

FHG

IKJ

− −

==∑∑p

Qp

p X k p pk

i i

kn n

ik k

k

K

n

N

θ θ β θ λ; | ; log1 1

11

1c h c h b g , (H-4)

0 11

1

= = −FHG

IKJ

−

=∑ p X k

pn ni

kn

N

| ;β θ λc h , (H-5)

p p X kk n ni

n

N

= = −

=∑1 1

1λβ θ| ;c h, (H-6)

Now define

n p X kk n ni

n

N

= = −

=∑ | ;β θ 1

1c h, (H-7)

$$\sum_{k=1}^{K} p_k = 1 \;\Rightarrow\; \lambda = N, \tag{H-8}$$

and finally


P X k nNn

i k= =−;θ 1c h . (H-9)
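As read here, the update in (H-7) to (H-9) amounts to averaging the E-step posteriors over all $N$ pixels: $n_k$ is the sum of the posterior probabilities of class $k$, and the class probability is $n_k / N$. The sketch below illustrates only that step; the name `update_class_probabilities` and the layout of `resp` (one row of $K$ posteriors per pixel) are assumptions made for illustration.

```python
def update_class_probabilities(resp):
    """M-step for the class probabilities: p_k = n_k / N.

    resp[n][k] is the E-step posterior probability that pixel n belongs
    to class k (assumed layout); every row is expected to sum to one.
    """
    N = len(resp)
    K = len(resp[0])
    # n_k: sum of the posteriors of class k over all pixels.
    n_k = [sum(row[k] for row in resp) for k in range(K)]
    return [value / N for value in n_k]


# Illustrative posteriors for four pixels and two classes.
resp = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [1.0, 0.0]]
p = update_class_probabilities(resp)
```

Since each row of `resp` sums to one, the updated probabilities also sum to one, which is the constraint enforced through the Lagrange multiplier $\lambda$.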

- Maximization of $Q(\theta^i; \theta^{i-1})$ with respect to $E$

∂∂

− − =−

ee eT

k

i ik kQ θ θ λ; 1 1c h c hd i ,

= =∂∂

−∂∂

−−

=∑ p X k An n

i

kn

N

n kk

k k| ;β θ λ1

1

2 1c h c he

Y ee

e eT T (H-10)

2 01

1

A p X kn ni

n

N

n k= − =−

=∑ | ;β θ λc hY e , (H-11)

e e Y YT Tk k n n

i

l

N

n

N

l li

l nWA

p X k p X k= ⇒ = = = =−

==

−∑∑12

1

11

1λ β θ β θ| ; | ;c h c h (H-12)

and finally

e Yk n ni

n

N

nWp X k= = −

=∑1 1

1

| ;β θc h . (H-13)

- Maximization of $Q(\theta^i; \theta^{i-1})$ with respect to $A$

The derivative $\frac{\partial}{\partial A_n} Q(\theta^i; \theta^{i-1})$ is analytically intractable. However, given the probabilities from the E-step of each pixel being in the $k$th class, $p(X_n = k \mid \mathbf{Y}_n; \theta^{i-1})$, and assuming the Gaussian noise model with $\alpha_n \gg \sigma$, we can use the ML estimates of $\alpha_n$ and $\sigma$ separately, as in (5.31) and (5.32), i.e.,

$ | ;α θNk

n ni

k

K

n knp X k= = −

=∑1 1

1

Y Y ec h , (H-14)

$ | ;σ θ2 1

1

1= = − −−

=∑n

p X k Ek

n ni

k

K

n n kT

n n kY Y Y e Y Y ec h c h c h , (H-15)

to obtain finally

$$\hat A_n = \frac{\hat\alpha_n^2}{2 \hat\sigma^2}. \tag{H-16}$$


BIBLIOGRAPHY

[1] Cho, Z.-H., Jones, J. P., and Singh, M., Foundations of Medical Imaging.Singapore: John Wiley & Sons Inc., 1992.

[2] Nichols, K. and Depuey, E. G., “Regional and global ventricular function analysiswith SPECT perfusion imaging,” in Nuclear Cardiology: State of the Artand Future Direction, 2nd ed. St. Louis: Mosby, 1999, pp. 137-187.

[3] Aizawa, K. and Huang, T. S., “Model-based image coding: advanced video codingtechniques for very low bit-rate applications,” Proc. of IEEE, vol. 83, pp.259-271, 1995.

[4] Davoine, F., Antonini, M., Chassery, J., and Barlaud, M., “Fractal imagecompression based on Delaunay triangulation and vector quantization,”IEEE Trans. Image Proc., vol. 5, pp. 338-346, 1996.

[5] Benoit-Cattin, H., Joachimsmann, P., A. Planat, S. V., Baskurt, A., and Prost, R.,“Active mesh texture coding based on warping and DCT,” presented atIEEE Int. Conf. Image Proc., Kobe, Japan, 1999.

[6] Wang, Y. and Lee, O., “Active mesh-a feature seeking and tracking imagesequence representation scheme,” IEEE Trans. Image Proc., vol. 3, pp. 610-624, 1994.

[7] Altunbasak, Y. and Tekalp, A. M., “Closed-form connectivity-preserving solutionsfor motion compensation using 2-D meshes,” IEEE Trans. Image Proc., vol.6, pp. 1255-1269, 1997.

[8] Huang, C. L. and Hsu, C. Y., “A new motion compensation method for imagesequence coding using hierarchical grid interpolation,” IEEE Trans. CircuitsSyst. Video Tech., vol. 4, pp. 44-51, 1994.

[9] Gevers, T. and Smeulders, A. W., “Combining region splitting and edge detectionthrough guided Delaunay image subdivision,” presented at IEEE Int. Conf.Comp. Vision, Pattern Recog., Puerto Rico, June 1997.

[10] Garcia, M. A., Vintimilla, B. X., and Sappa, A. D., “Efficient approximation ofgray-scale image through bounded error triangular meshes,” presented atIEEE Int. Conf. Image Proc., Kobe, Japan, 1999.

[11] Terzopoulos, D. and Vasilescu, M., “Sampling and reconstruction with adaptivemeshes,” presented at IEEE Int. Conf. Comp. Vision, Pattern Recog., 1992.

195

[12] Lee, J., Yang, Y., and Wernick, M. N., “A new approach for image-contentadaptive mesh generation,” presented at IEEE Int. Conf. Image Proc.,Vancouver, Sep. 2000.

[13] Marks, R. J., Advanced Topics in Shannon Sampling and Interpolation Theory, IIed: Springer-Verlag, 1993.

[14] Barrett, H. H., The Radon transform and its applications Progress in Optics XXI.New York: Elsevier, 1984.

[15] Huang, S.-C., Mahoney, D. K., and Phelps, M. E., “Quantitation in positronemission tomography: 8. Effects of nonlinear parameter estimation onfunctional images,” J. Comput. Assist. Tomog., vol. 11, pp. 314-325, 1987.

[16] Brankov, J. G., Djordjevic, J., Galatsanos, N. P., and Wernick, M. N.,“Tomographic image reconstruction for systems with partially-known blur,”presented at IEEE Int. Conf. Image Proc., Kobe, Japan, 1999.

[17] Mesarovic, V. Z., Galatsanos, N. P., and Wernick, M. N., “Iterative maximum aposteriori (MAP) restoration from partially-known blur for tomographicreconstruction,” IEEE Int. Conf. Image Proc., vol. 2, pp. 512-515, Oct.1995.

[18] Mesarovic, V. Z., Galatsanos, N. P., and Wernick, M. N., “Sinogram restorationfrom partially-known blur for tomographic reconstruction,” presented atIEEE Nucl. Sci. Symp. Med. Imaging Conf., San Francisco, Oct. 1995.

[19] Deans, S. R., The Radon Transform and some of its Applications. New York:Wiley, 1983.

[20] Stark, H., Woods, J. W., Paul, I., and Hingorani, R., “Direct Fourier reconstructionin computer tomography,” IEEE Trans. Acoust., Speech, Signal Proc., vol.29, pp. 237-244, 1981.

[21] Lange, K. and Carson, R. E., “EM reconstruction for emission and transmissiontomography,” J. Comput. Assisted Tomogrphy, vol. 8, pp. 302-316, 1984.

[22] Shepp, L. A. and Vardi, Y., “Maximum likelihood reconstruction for emissiontomography,” IEEE Trans. Med. Imaging, vol. 1, pp. 113-122, 1982.

[23] Rockmore, A. J. and Macovski, A., “A maximum likelihood approach totransmission image reconstruction from projections,” IEEE Trans. Med.Imaging, vol. 14, pp. 146-150, 1977.

196

[24] Dempster, A. P., Laird, N. M., and Rubin, D. B., “Maximum likelihood fromincomplete data via the EM algorithm,” J. Roy. Statist. Sect., vol. 39, pp. 1-38, 1977.

[25] Snyder, D. L., Miller, M. I., Lewis, J., Thomas, J., and Politte, D. G., “Noise andedge artifacts in maximum-likelihood reconstructions for emissiontomography,” IEEE Trans. Med. Imaging, vol. 6, pp. 223-238, 1987.

[26] Bouman, C. and Sauer, K., “A generalized Gaussian image model for edge-preserving MAP estimation,” IEEE Tran. Image Proc., vol. 2, pp. 296-310,1993.

[27] Chen, C.-T., Johnson, V. E., Wong, W. H., Hu, X., and Metz, C. E., “Bayesianimage reconstruction in positron emission tomography,” IEEE Trans. Nucl.Sci., vol. 37, pp. 636-641, 1990.

[28] Geman, S. and Geman, D., “Stochastic relaxation, Gibbs distributions, andBayesian restoration of images,” IEEE Trans. Patt. Anal. Mach. Intell., vol.6, pp. 721-741, 1984.

[29] Geman, S. and McClure, D. E., “Bayesian image analysis: An application to singlephoton emission tomography,” presented at Proc. Statist. Comput. Sect.,Washington, DC, 1985.

[30] Green, P. J., “Bayesian reconstructions from emission tomography data using amodified EM algorithm,” IEEE Tran. Med. Imaging, vol. 9, pp. 84-93,1990.

[31] Hanson, K. M., Bayesian and related methods in image reconstruction fromincomplete data Image Recovery. San Diego, CA: Academic Press, 1987.

[32] Hart, H. and Liang, Z., “Bayesian image processing in two dimensions,” IEEETrans. Med. Imaging, vol. 6, pp. 201-208, 1988.

[33] Hebert, T. and Leahy, R., “A generalized EM algorithm for 3-D Bayesianreconstruction from Poisson data using Gibbs priors,” IEEE Trans. Med.Imaging, vol. 8, pp. 194-202, 1989.

[34] Lalush, D. S. and Tsui, B. M., “Simulation evaluation of Gibbs prior distributionsfor use in maximum a posteriori SPECT reconstructions,” IEEE Trans. Med.Imaging, vol. 11, pp. 267-275, 1992.

[35] Lee, S. J., Rangarajan, A., and Gindi, G., “Bayesian image reconstruction inSPECT using higher order mechanical models as priors,” IEEE Trans. Med.Imaging, vol. 14, pp. 669-680, 1995.

197

[36] Levitan, E. and Herman, G. T., “A maximum a posteriori probability expectationmaximization algorithm for image reconstruction in emission tomography,”IEEE Trans. Med. Imaging, vol. 6, pp. 185-192, 1988.

[37] Liang, Z., Jaszczak, R., and Greer, K., “On Bayesian image reconstruction fromprojections: uniform and nonuniform a priori source information,” IEEETrans. Med. Imaging, vol. 8, pp. 227-235, 1989.

[38] Leahy, R. and Yan, X., “Incorporation of anatomical MR data for improvedfunctional imaging with PET,” in Information Processing in Med. Imaging,Colchester, A. C. F. and Hawkes, D. J., Eds. New York: Springer-Verlag,1991, pp. 102-120.

[39] Gindi, G., Lee, M., Rangarajan, A., and Zubal, I. G., “Bayesian reconstruction offunctional images using anatomical information as priors,” IEEE Tran. Med.Imaging, vol. 12, pp. 670 -680, Dec. 1993.

[40] Fessler, J. A., “Penalized weighted least-squares reconstruction for positronemission tomography,” IEEE Trans. Med. Imaging, vol. 13, pp. 290-300,1994.

[41] Llacer, J. and Veklerov, E., “Feasible images and practical stopping rules foriterative algorithms in emission tomography,” IEEE Trans. Med. Imaging,vol. 8, pp. 186-193, 1989.

[42] Frieden, B. R. and Zoltani, C. K., “Maximum bounded entropy: application totomographic reconstruction,” Appl. Opt., vol. 24, pp. 201-207, 1984.

[43] Gull, S. F. and Newton, T. J., “Maximum entropy tomography,” Appl. Opt., vol.25, pp. 156-160, 1986.

[44] Wernecke, S. J. and D'Addario, L. R., “Maximum entropy image reconstruction,”IEEE Tran. Comp., vol. 26, pp. 351-364, 1977.

[45] Kaczmarz, S., “Angenäherte Auflösung von Systemen Linearer Gleichungen,”Bull. Acad. Polon. Sci. Lett., vol. A 35, pp. 355-357, 1937.

[46] Censor, Y., “Finite series-expansion reconstruction methods,” Proc. of the IEEE,vol. 71, pp. 409-419, 1983.

[47] Gordon, R., Bender, R., and Herman, G. T., “Algebraic reconstruction techniques(ART) for three-dimensional electron microscopy and x-ray photography,”J. Theor. Biol., vol. 29, pp. 471-481, 1970.

[48] Hanson, K. M., “POPART - Performance optimized algebraic reconstructiontechniques,” Proc SPIE, vol. 1001, 1988.

198

[49] Herman, G. T., Image Reconstruction from Projections: The Fundamentals ofComputerized Tomography. New York: Academic Press, 1980.

[50] Sezan, M. I. and Stark, H., “Image restoration by the method of convexprojections: Part 2 - Applications and numerical results,” IEEE Trans. Med.Imaging, vol. 1, pp. 95-101, 1982.

[51] Youla, D. C. and Webb, H., “Image restoration by the method of convexprojections: Part 1 - Theory,” IEEE Trans. Med. Imaging, vol. 1, pp. 81-94,1982.

[52] Oskoui-Fard, P. and Stark, H., “Tomographic image reconstruction using thetheory of convex projections,” IEEE Trans. Med. Imaging, vol. 7, pp. 45-58,1988.

[53] Wernick, M. N. and Chen, C.-T., “Superresolved tomography by convexprojections and detector motion,” J. Opt. Soc. Am., vol. 9, pp. 1547-1553,1992.

[54] Hudson, H. M. and Larkin, R. S., “Accelerated image reconstruction using orderedsubset of projection data,” IEEE Trans. Med. Imaging, vol. 13, pp. 601-609,1994.

[55] Byrne, C. L., “Block-iterative methods for image reconstruction from projections,”IEEE Trans. Image Proc., vol. 5, pp. 792-794, 1996.

[56] Lalush, D. S. and Tsui, B. M. W., “Block-iterative techniques for fast 4Dreconstruction using a priori motion models in gated cardiac SPECT,” Phys.Med. Biol., vol. 43, pp. 875-886, 1998.

[57] Bresler, Y., Fessler, J. A., and Macovski, A., “A Bayesian approach toreconstruction form incomplete projections of a multiple object 3Ddomains,” IEEE Trans. Patt. Anal. Mach. Intell., vol. 11, pp. 840-858, 1989.

[58] Cunningham, G. S., Hanson, K. M., and Battle, X. L., “Three dimensionalreconstruction from low-count SPECT data using deformable models,”presented at IEEE Nucl. Sci. Symp. Med. Imaging Conf., 1997.

[59] Jennings, G. R. and Wolf, D. R., “Tomographic reconstruction based on flexiblegeometric models,” presented at IEEE Int. Conf. on Image Proc., 1994.

[60] Wernick, M. N., Infusino, E. J., and Milosevic, M., “Fast spatio-temporal imagereconstruction for dynamic PET,” IEEE Tran. on Med. Imaging, vol. 18, pp.185-195, 1999.

199

[61] Lalush, D. S. and Tsui, B. M. W., “Space-time gibbs priors applied to gatedSPECT myocardial perfusion studies,” presented at Int. Meeting on 3DImage Rec. in Rad. and Nuc. Med., Dordrecht, 1996.

[62] Teklap, A. M., Digital Video Processing. NJ: Prentice-Hall, 1995.

[63] Klein, G. J., Reutter, B. W., and Huesman, R. H., “Non-rigid summing of gatedPET via optical flow,” IEEE Tran. Nucl. Sci., vol. 44, pp. 1509 -1512, Avg.1997.

[64] Vega-Riveros, J. F. and Jabbour, K., “Review of motion analysis techniques,”IEEE Comm. Speech and Vision, vol. 136, pp. 397-404, Dec. 1989.

[65] Barron, J. L., Fleet, D. J., and Beauchemin, S. S., “Performance of optical flowtechniques,” presented at Inter. Jour. of Comp. Vision, 1994.

[66] King, M. A., Schwinger, R. B., Doherty, P. W., and Penney, B. C., “Two-dimenstional filtering of SPECT images using Metz and Wiener filters,” J.Nucl. Medc., vol. 25, pp. 1234-1240, 1984.

[67] Kao, C.-M., YAP, J. T., Mukherjee, J., and Wernick, M. N., “Image reconstructionfor dynamic PET based on low-order approximation and restaration of thesinogram,” IEEE Trans. Med. Imaging, vol. 16, pp. 738-749, 1997.

[68] Narayanan, M. V., King, M. A., Soares, E. J., Byrne, H. L., Pretorius, P. H., andWernick, M. N., “Application of the Karhunen-Loève transform to 4Dreconstruction of cardiac gated SPECT images,” IEEE Tran. Nucl. Sci., vol.46, pp. 1001-1008, Aug. 1999.

[69] Brankov, J. G., Wernick, M. N., Yang, Y., and Narayanan, M. V., “Spatially-adaptive temporal smoothing for reconstruction of dynamic and gated imagesequences,” presented at IEEE Nucl. Sci. Symp. Med. Imaging Conf., Lyon,France, 2000.

[70] Brailean, J. C., Kleihorst, R. P., Efstratiadis, S., Katsaggelos, A. K., and Lagendijk,R. L., “Noise reduction filters for dynamic image sequences: a review,”IEEE Proc., vol. 83, pp. 1272-1292, 1995.

[71] Peter, J., Jaszczak, R. J., and Hutton, B. F., “Fully adaptive temporal regressionsmoothing in gated cardiac SPECT image reconstruction,” IEEE Trans.Nuc. Sci., vol. 48, pp. 16-23, 2001.

[72] Johnson, R. A. and Wichern, D. W., Applied Multivariate Statistical Analysis. NewJersey: Prentice-Hall, Inc., 1992.

200

[73] McLachlan, G. J. and Krishnan, T., The EM Algorithm and Extension: John Wiley& Sons, 1997.

[74] Lee, T.-W., Independent Component Analysis: Theory and Applications. London:Kluwer Academic Publishers, 1998.

[75] Bouman, C. A., Chen, S., and Lowe, M. J., “Clustered component analysis forfMRI signals estimation and classification,” presented at IEEE Int. Conf.Image Proc., 2000.

[76] Wang, Y. and Lee, O., “Use of two-dimensional deformable mesh structures forvideo coding .I. The synthesis problem: mesh-based function approximationand mapping,” IEEE Trans. on Circuits Syst. Video Tech., vol. 6, pp. 636 -646, 1996.

[77] Prenter, P. M., Spines and Variational Methods: John Wiley & Sons, 1975.

[78] Braess, D., Finite Elements: Cambridge University Press, 1997.

[79] Yang, Y., Wernick, M. N., and Brankov, J. G., “A computationally efficientapproach for accurate content-adaptive mesh generation,” , in review.

[80] Waldron, S., “The error in linear interpolation at the vertices of a simplexy,” IsraelInstitute of Technology, Department of Mathematics 1996.

[81] Floyd, R. and Steinberg, L., “An adaptive algorithm for spatial gray scale,” SIDInt. Sym. Digest of Tech. Papers, 1975.

[82] Ulichney, R., Digital Halftoning: The MIT Press, Cambridge Mass., 1987.

[83] Preparata, F. and Shamos, M., Computational Geometry-An Introduction:Springer-Verlag, New York, 1985.

[84] Dudgeon, D. E. and Mersereau, R. M., Multidimensional Digital SignalProcessing: Prentice-Hall, 1984.

[85] Lim, J. S., Two-dimensional Signal and Image Processing: Prentice-Hall, 1990.

[86] Papoulis, A., “A new algorithm in spectral analysis and bandlimitedextrapolation,” IEEE Trans. Circuits Syst., vol. CAS-22, pp. 735-742, Sept.1975.

[87] Cartan, H., Elementary Theory of Analytic Functions of One or Several ComplexVariables: Addison-Wesley Pub. Co., 1963.

201

[88] Chong, E. K. P. and Zak, S. H., An Introduction to Optimization. New York: JohnWiley & Sons, Inc., 1996.

[89] Wang, Y., Lee, O., and Vetro, A., “Use of two-dimensional deformable meshstructures for video coding. II. The analysis problem and a region-basedcoder employing an active mesh representation,” IEEE Trans. Circuits Syst.Video Tech., vol. 6, pp. 647 -659, 1996.

[90] Brankov, J. G., Yang, Y., Narayanan, M. V., and Wernick, M. N., “Content-adaptive mesh modeling for tomographic image reconstruction,” presentedat Int. Meeting on 3D Image Recon. in Radi. and Nucl. Med., Pacific Grove,California, USA, 2001.

[91] Pretorius, P. H., Xia, W., King, M. A., Tsui, B. M. W., Pan, T. S., and Villegas, B.J., “Evaluation of right and left ventricular volume and ejection fractionusing a mathematical cardiac torso phantom for gated pool SPECT,” J. ofNucl. Medi., vol. 38, pp. 1528-1534, 1997.

[92] Zubal, I. G., Harrell, C. R., Smith, E. O., Rattner, Z., Gindi, G. R., and Hoffer, P.B., “Computerized 3D-dimensional segmented human anatomy,” Med.Phys., vol. 21, pp. 299-302, 1994.

[93] Demaret, L., Robert, G., Laurent, N., and Buisson, A., “Scalable image codermixing DCT and triangular meshes,” presented at IEEE Inter. Conf. onImage Proc., Vancouver, Canada, Sep. 2000.

[94] Toklu, C., Tekalp, A. M., and Erdem, A. T., “Semi-automatic video objectsegmentation in the presence of occlusion,” IEEE Trans. Circuits andSystems for Video Tech., vol. 10, pp. 624-629, 2000.

[95] Nosratinia, A., “New kernels for fast mesh-based motion estimation,” IEEE Trans.Circuits and Systems for Video Tech., vol. 11, pp. 40-51, 2001.

[96] Hsu, P., Liu, K. J. R., and Chen, T., “A low bit-rate video codec based on two-dimensional mesh motion compensation with adaptive interpolation,” IEEETrans. Circuits and Systems for Video Tech., vol. 11, pp. 111-117, 2001.

[97] Garcia, M. A. and Vintimilla, B. X., “Acceleration of filtering and enhancementoperations through geometric processing of gray-level images,” presented atIEEE Inter. Conf. on Image Processing, Vancouver, Canada, Sep. 2000.

[98] Singh, A., Goldgof, D., and Terzopoulos, D., Deformable Models in MedicalImage Analysis: IEEE Computer Society Press, 1998.

[99] Yang, Y., Wernick, M. N., and Brankov, J. G., “A Fast algorithm for accuratecontent-adaptive mesh generation,” IEEE Int. Conf. Image Proc., 2001.

202

[100] Brankov, J. G., Yang, Y., and Wernick, M. N., “Tomographic imagereconstruction using content-adaptive mesh modeling,” presented at IEEEInter. Conf. on Image Proc., Thessaloniki, Greece, 2001.

[101] Brankov, J. G., Yang, Y., Wernick, M. N., and Narayanan, M. V., “4D processingof gated SPECT images using deformable mesh modeling,” presented at Int.Meeting on 3D Image Rec. in Rad. and Nuc. Med., Pacific Grove,California, 2001.

[102] Leahy, R. and Qi, J., “Statistical approaches in quantitative positron emissiontomography,” Statistics and Computing, vol. 10, pp. 147-165, 2000.

[103] Mohammad-Djafari, Sauer, K., Khayi, Y., and Cano, E., “Reconstruction of theshape of a compact object from few projections,” presented at IEEE Int.Conf. on Image Processing, Santa Barbara, CA, 1997.

[104] Hanson, K. M., Cunningham, G. S., Jennings, G. R., and Wolf, D. R.,“Tomographic reconstruction based on flexible geometric models,”presented at IEEE Int. Conf. on Image Proc., 1994.

[105] Rossi, D. J. and Wilsky, A. S., “Reconstruction from projections based ondetection and estimation of objects,” IEEE Trans. Acoust. Speech, SignalProc., vol. 32, pp. 886-906, 1984.

[106] Shmueli, K., Brody, W. R., and Macovski, A., “Estimation of blood vesselsboundaries in X-ray images,” presented at SPIE Conf. Digital Radiology,1981.

[107] Rissanen, “Modeling by shortest data description,” Automatica, vol. 13, pp. 465-471, 1978.

[108] Myers, J. and Barrett, H. H., “Addition of a channel mechanism to the ideal-observer model,” J. Opt. Soc. Am. A, vol. 4, pp. 2447-2457, 1987.

[109] Burgess, E., “Comparison of receiver operating characteristic and forced choiceobserver performance measurement methods,” Med. Phys., vol. 22, pp. 643-655, 1995.

[110] LacCroix, K. J., Tsui, B. M. W., Frey, E. C., and Jaszczak, R. J., “ReceiverOperating characteristic evaluation of iterative reconstruction withattenuation correction in Tc-99m Sestamibi myocardial SPECT images,” J.Nuc. Med., vol. 41, pp. 502-513, 2001.

[111] Brankov, J. G., Yang, Y., Leahy, R. M., and Wernick, M. N., “Multi-modality tomographic image reconstruction using mesh modeling,” presented at IEEE Inter. Symp. on Biomedical Imaging: Macro to Nano, Washington, DC, July 2002.

[112] Kay, S. M., Fundamentals of Statistical Signal Processing, vol. I. New Jersey: Prentice Hall, 1993.

[113] Bendriem, B. and Townsend, D. W., The Theory and Practice of 3D PET: Kluwer Academic Publishers, 1998.

[114] Shattuck, D. W., Sandor-Leahy, S. R., Schaper, K. A., Rottenberg, D. A., and Leahy, R. M., “Magnetic resonance image tissue classification using a partial volume model,” NeuroImage, vol. 13, pp. 856-876, May 2001.

[115] Sastry, S. and Carson, R. E., “Multimodality Bayesian algorithm for image reconstruction in positron emission tomography: a tissue composition model,” IEEE Trans. Med. Imaging, vol. 16, pp. 750-761, Dec. 1997.

[116] Galatsanos, N., Wernick, M., and Katsaggelos, A., “Multichannel Image Recovery,” in Handbook of Image and Video Processing, Bovik, A., Ed. San Diego: Academic Press, 2000, pp. 155-168.

[117] Narayanan, M. V., King, M. A., Wernick, M. N., Byrne, C. L., Soares, E. J., and Pretorius, P. H., “Improved image quality and computation reduction in 4D reconstruction of cardiac gated SPECT,” IEEE Trans. Med. Imaging, vol. 19, pp. 423-433, 2000.

[118] Asma, E., Nichols, T. E., Qi, J., and Leahy, R. M., “4D PET image reconstruction from list mode data,” presented at IEEE Nucl. Sci. Symp. and Med. Imaging Conf., 2000.

[119] Galt, J. R., Garcia, E. V., and Robbins, W. L., “Effects of myocardial wall thickness on SPECT quantification,” IEEE Trans. Med. Imaging, vol. 9, pp. 144-150, 1990.

[120] Stojanovic, I. S., Osnovi Telekomunikacija. Beograd: Naucna Knjiga, 1990.

[121] Middleton, D., An Introduction to Statistical Communication Theory. NY: McGraw-Hill, 1960.

[122] Thomas, B. J., An Introduction to Statistical Communication Theory. NY: John Wiley & Sons, 1969.

[123] Mardia, K. V. and Jupp, P. E., Directional Statistics: John Wiley & Sons, LTD, 1999.

[124] Frost, J. J., Douglass, K. H., Dannals, H. S., Links, J. M., Wilson, A. A., Ravert, H. T., Crozier, W. C., and Jr., H. N. W., “Multicompartmental analysis of [11C]-Carfentanil binding to opiate receptors in humans measured by positron emission tomography,” J. Cereb. Blood Flow Metab., vol. 9, pp. 398-409, 1989.

[125] Frost, J. J., Mayberg, H. S., Sandzot, B., Dannals, R. F., Lever, J. R., Ravert, H. T., Wilson, A. A., Jr., H. N. W., and Links, J. M., “Comparison of [11C]Carfentanil binding to opiate receptors in humans by Positron Emission Tomography,” J. Cereb. Blood Flow Metab., vol. 10, pp. 484-492, 1990.

[126] Brankov, J. G., “Tomographic Image Reconstruction for Partially-Known Systems and Image Sequences,” in M.S. Thesis ECE Dept. Chicago: Illinois Institute of Technology, 1999, pp. 30-41.

[127] Linde, Y., Buzo, A., and Gray, R. M., “An algorithm for vector quantiser design,” IEEE Trans. Com., vol. 28, pp. 84-95, 1980.

[128] Tipping, M. E. and Bishop, C. M., “Mixtures of probabilistic principal component analyzers,” Neural Computation, vol. 11, pp. 443-463, 1999.

[129] O'Ruanaidh, J. and Pun, T., “Rotation, scale and translation invariant spread spectrum digital image watermarking,” Signal Processing, vol. 66, pp. 303-317, 1998.

[130] Lin, C. Y. and Wu, M., “Rotation, Scale, and Translation Resilient Public watermarking for images,” IEEE Trans. on Image Proc., vol. 10, May 2001.

[131] Pereira, S. and Pun, T., “Robust template matching for affine resistant image watermarks,” IEEE Trans. on Image Proc., vol. 9, June 2000.

[132] Petitcolas, F. A. P., StirMark 3.1, http://www.cl.cam.ac.uk/fapp2/watermarking/stirmark/, 1999.

[133] Davoine, F., “Triangular Meshes: A solution to resist to geometric distortions based watermark-removal softwares,” presented at European Signal Processing Conference, Tampere, Finland, 2000.

[134] Cox, I. J., Kilian, J., Leighton, F. T., and Shamoon, T., “Secure spread spectrum watermarking for multimedia,” IEEE Trans. on Image Proc., vol. 6, 1997.

[135] Nash, S. and Sofer, A., Linear and Nonlinear Programming: McGraw Hill, 1996.
