Lec16: Medical Image Registration (Advanced): Deformable Registration

Post on 22-Jan-2018

MEDICAL IMAGE COMPUTING (CAP 5937)

LECTURE 16: Medical Image Registration II (Advanced): Deformable Registration

Dr. Ulas Bagci
HEC 221, Center for Research in Computer Vision (CRCV), University of Central Florida (UCF), Orlando, FL 32814.
bagci@ucf.edu or bagci@crcv.ucf.edu

SPRING 2017

1

Outline

•  Similarity Functions Revisited
–  MSE, NCC, MI, …
•  Deformable Image Registration
–  B-spline parameterization and Free Form Deformation
–  Diffeomorphic image registration (DARTEL, …)
•  Optimization

2

I. SIMILARITY MEASURES: REVISITED

3

Standard Similarity Measures

•  SSD-SAD
•  Cross correlation
•  Mutual information

4

Difference Measure

5

6

Credit: Itk.org

Limitations of SSD

7

Limitations of SSD

8

Read: Bagci, U., et al. The role of Intensity Standardization in Medical Image Registration, PRL 2010.

Normalized Cross Correlation (NCC)

9

NCC-Example

10

Multi-modal Image Registration

11

Information Theoretic Approaches

•  Mutual Information (MI)
•  Normalized Mutual Information (NMI)

12

Preliminary: Histogram Calculation

13

Preliminary: Joint Histogram Calculation

14

Information Theoretic Approach

15

Joint Histogram

16

Joint Histogram

17

Entropy

18

Mutual Information

19

Historical Note

20

Normalization of MI

21

Improvements to MI

22

Requirements on Similarity Measure

23

II. Deformable Registration

24

Non-Rigid Deformation

25

[Figure: source and target images, shown before and after registration]

Reasons for Deformable Registration

26

Deformable Registration

27

Non-Rigid Registration

28

Deformable Registration General Framework

29

Deformation Fields

30

[Figure: deformation field plotted on x and y axes]

Deformation Fields

31

[Figure: deformation field plotted on x and y axes]

Deformable Registration General Framework

32

Why do we need regularization?

33

Why do we need regularization?

34

Regularization Strategies

35

Example: Elastic Registration

•  Model the deformation as a physical process resembling the stretching of an elastic material
–  The physical process is governed by the internal force and the external force
–  It is described by the Navier linear elastic partial differential equation
•  The external force drives the registration process
–  The external force can be the gradient of a similarity measure, e.g. a local correlation measure based on intensities, intensity differences, or intensity features such as edge and curvature
–  Or the distance between the curves and surfaces of corresponding anatomical structures
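
For reference, the Navier linear elastic equation named above is usually written in the following form. The symbols are assumptions introduced here, not taken from the slide: u is the displacement field, f is the external force, and μ and λ are the Lamé elasticity constants of the material.

```latex
\mu \nabla^{2} \mathbf{u}(\mathbf{x})
  + (\lambda + \mu)\, \nabla \bigl( \nabla \cdot \mathbf{u}(\mathbf{x}) \bigr)
  + \mathbf{f}(\mathbf{x}) = \mathbf{0}
```

The first two terms are the internal (elastic) forces; registration is at equilibrium when they balance the external force f derived from the similarity measure.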

36

Example: Free-Form Deformation (FFD)

•  The general idea is to deform an image by manipulating a regular grid of control points that are distributed across the image at an arbitrary mesh resolution.
•  Control points can be moved, and the position of individual pixels between the control points is computed from the positions of surrounding control points.
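
The idea that a pixel's displacement is computed from the surrounding control points can be sketched in 1-D with cubic B-splines, which is one common choice for FFD. This is a minimal sketch, not the lecture's implementation: the function names, the one-point left padding of the control grid, and the restriction to 1-D are all assumptions made for illustration (2-D and 3-D use a tensor product of the same weights).

```python
import numpy as np

def bspline_basis(s):
    """Four cubic B-spline weights for a fractional offset s in [0, 1)."""
    return np.array([
        (1 - s) ** 3 / 6.0,
        (3 * s**3 - 6 * s**2 + 4) / 6.0,
        (-3 * s**3 + 3 * s**2 + 3 * s + 1) / 6.0,
        s**3 / 6.0,
    ])

def ffd_displacement_1d(x, control_disp, spacing):
    """Displacement at position x from the four surrounding control points.

    Assumption: control_disp[j] is the displacement of the control point
    located at (j - 1) * spacing, i.e. the grid is padded by one point on
    the left so that every x >= 0 has four neighbours.
    """
    cell = int(np.floor(x / spacing))      # index of the grid cell containing x
    s = x / spacing - cell                 # fractional position inside the cell
    weights = bspline_basis(s)
    return float(weights @ control_disp[cell:cell + 4])

# Move a single control point (the one located at x = 32) by one unit.
control = np.zeros(10)
control[5] = 1.0
near = ffd_displacement_1d(32.0, control, spacing=8.0)  # at the moved point
far = ffd_displacement_1d(4.0, control, spacing=8.0)    # outside its support
```

Because each weight vector sums to one and only four control points contribute, moving one control point changes the deformation only in its neighbourhood, which is what makes the model local.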

37

Free Form Deformation

38

Credits: Sederberg and Parry, SIGGRAPH (1986)

Global Motion Model

39

(12 degrees of freedom)

Local Motion Model

•  The affine transformation captures only the global motion.
•  An additional transformation is required, which models the local deformation

40

FFD with B-Splines

42

B-Spline / Math

43

44

Deformation with B-Splines

45

Original Lena

Deformation with B-Splines

46

Deformed with B-Spline - Lena

B-Spline Parameterization

47

B-Splines Practically

48

B-Splines Practically

49

Interpolation

•  For computation of the cost function, the new value is evaluated at non-voxel positions, for which intensity interpolation is needed.
–  Nearest neighbor
–  Tri-linear

52
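
The two interpolation schemes named on the slides can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, coordinates are given in voxel units as (z, y, x), and the query position is assumed to lie strictly inside the volume so that all eight neighbours exist.

```python
import numpy as np

def nearest_neighbor(volume, z, y, x):
    """Intensity of the voxel closest to the non-voxel position (z, y, x)."""
    idx = tuple(int(np.floor(c + 0.5)) for c in (z, y, x))
    return float(volume[idx])

def trilinear(volume, z, y, x):
    """Weighted average of the eight voxels surrounding (z, y, x)."""
    z0, y0, x0 = int(np.floor(z)), int(np.floor(y)), int(np.floor(x))
    dz, dy, dx = z - z0, y - y0, x - x0
    value = 0.0
    for kz, wz in ((0, 1 - dz), (1, dz)):
        for ky, wy in ((0, 1 - dy), (1, dy)):
            for kx, wx in ((0, 1 - dx), (1, dx)):
                value += wz * wy * wx * volume[z0 + kz, y0 + ky, x0 + kx]
    return float(value)

# Toy volume whose intensity is a linear ramp: I(z, y, x) = z + 2y + 3x.
volume = np.fromfunction(lambda z, y, x: z + 2 * y + 3 * x, (4, 4, 4))
value_linear = trilinear(volume, 1.5, 0.25, 2.75)
value_nearest = nearest_neighbor(volume, 1.2, 0.25, 2.75)
```

Tri-linear interpolation reproduces a linear ramp exactly, while nearest neighbor snaps to the closest voxel and therefore gives a piecewise-constant cost function.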

III. Optimization

53

Optimization

54

Iterative Optimization

55

Gradient Descent

56

Cost Function Derivative

57

58

59

60

61

62

Sampling Strategy

•  The most straightforward strategy is to use all voxels from the fixed image, which has the obvious downside that it is time-consuming for large images.
•  A common approach is to use a subset of voxels, selected on a uniform grid or sampled randomly.
•  Another strategy is to pick only those points that are located on striking image features, such as edges.
•  2K samples are often enough for good performance.
•  Masks can be used to select a region of interest or to avoid the undesired alignment of artificial edges in the images.
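
Random sampling restricted to a mask can be sketched as below. This is an illustrative sketch: the function name, its parameters, and the fixed random seed are assumptions, and the default of 2000 samples simply mirrors the "2K" rule of thumb above.

```python
import numpy as np

def sample_voxels(shape, n_samples=2000, mask=None, seed=0):
    """Draw voxel coordinates at random, optionally only inside a mask.

    Returns an (n, 3) integer array of (z, y, x) indices without repeats.
    """
    rng = np.random.default_rng(seed)
    if mask is None:
        candidates = np.arange(int(np.prod(shape)))   # every voxel is eligible
    else:
        candidates = np.flatnonzero(mask)             # only voxels inside the mask
    n = min(n_samples, candidates.size)
    chosen = rng.choice(candidates, size=n, replace=False)
    return np.stack(np.unravel_index(chosen, shape), axis=1)

# Region of interest: the central cube of a 32 x 32 x 32 fixed image.
shape = (32, 32, 32)
mask = np.zeros(shape, dtype=bool)
mask[8:24, 8:24, 8:24] = True
coords = sample_voxels(shape, n_samples=2000, mask=mask)
```

The cost function is then evaluated only at `coords`, about 6% of the voxels in this toy image, instead of at every voxel.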

67

IV. Advanced Models of Non-Rigid Deformations

68

Deformation Model

69

Small deformation models do not necessarily enforce a one-to-one mapping. If the inverse transformation is correct, composing the forward and inverse transformations should give the identity!

Diffeomorphic Registration

•  The large-deformation or diffeomorphic setting is a much more elegant framework. A diffeomorphism is a globally one-to-one (bijective), smooth and continuous mapping with derivatives that are invertible.
•  If the mapping is not diffeomorphic, then topology is not necessarily preserved.
•  It is easier to parameterize the diffeomorphism using a number of velocity fields corresponding to different time periods over the course of its evolution (consider u(t) as a velocity field at time t).
•  Diffeomorphisms are generated by initializing with an identity transform and integrating over unit time to obtain φ(1).
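
Written out, the construction in the last bullet takes the following standard form. The superscript notation and the symbol Id for the identity transform are assumptions chosen here to match the φ(1) used later in the lecture.

```latex
\frac{d\phi^{(t)}}{dt} = u^{(t)}\!\bigl(\phi^{(t)}\bigr),
\qquad \phi^{(0)} = \mathrm{Id},
\qquad \phi^{(1)} = \phi^{(0)} + \int_{0}^{1} u^{(t)}\!\bigl(\phi^{(t)}\bigr)\, dt
```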

73

Forward & Inverse Transform

74

DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)

Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra

75

DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)

•  The DARTEL model assumes a flow field (u) that remains constant over time. With this model, the differential equation describing the evolution of a deformation is dφ/dt = u(φ(t)).

•  The Euler method is a simple integration approach that extends from the identity (initial) transform by computing new solutions after many successive small time-steps (h).

•  Each of these Euler steps is equivalent to φ(t+h) = (x + hu) ∘ φ(t).

78

DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)

•  The use of a large number of small time steps will produce a more accurate solution, for instance (8 steps)
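
The 8-step Euler integration of a constant velocity field can be sketched in 1-D as below, next to the scaling-and-squaring shortcut that the lecture mentions later. This is a toy sketch under stated assumptions: the function names, the 1-D grid, the linear interpolation used for composition, and the example field u(x) = -0.5x are all chosen for illustration and are not DARTEL's actual implementation.

```python
import numpy as np

def compose(a, b, grid):
    """Sample (a ∘ b)(x) = a(b(x)) on the grid using linear interpolation."""
    return np.interp(b, grid, a)

def integrate_euler(u, grid, n_steps=8):
    """Integrate a stationary velocity field with n_steps Euler steps."""
    h = 1.0 / n_steps
    small = grid + h * u                    # one small step: x + h*u(x)
    phi = grid.copy()                       # start from the identity transform
    for _ in range(n_steps):
        phi = compose(small, phi, grid)     # phi <- (x + h*u) ∘ phi
    return phi

def integrate_scaling_squaring(u, grid, n_squarings=3):
    """Same result with n_squarings compositions instead of 2**n_squarings."""
    phi = grid + u / (2 ** n_squarings)     # phi(1/8) when n_squarings = 3
    for _ in range(n_squarings):
        phi = compose(phi, phi, grid)       # phi(2t) = phi(t) ∘ phi(t)
    return phi

grid = np.linspace(-1.0, 1.0, 201)
u = -0.5 * grid                             # toy contracting flow field
phi_euler = integrate_euler(u, grid, n_steps=8)
phi_squared = integrate_scaling_squaring(u, grid, n_squarings=3)
```

For a constant velocity field both routes give the same deformation, but squaring needs only 3 compositions where Euler needs 8. The result stays monotonic, so it is invertible on this grid.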

79

Optimization of DARTEL

•  Image registration procedures use a mathematical model to explain the data. Such a model will contain a number of unknown parameters that describe how an image is deformed.
•  A true diffeomorphism has an infinite number of dimensions and is infinitely differentiable.
•  The discrete parameterization of the velocity field, u(x), can be considered as a linear combination of basis functions: u(x) = ∑i vi ρi(x), where ρi(x) is the ith first degree B-spline basis function at position x and v is a vector of coefficients.

82
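
A 1-D sketch of this parameterization: first degree B-splines are "hat" functions, so the weighted sum reduces to linear interpolation between the coefficients. The function names, the unit node spacing and the coefficient values are assumptions for illustration.

```python
import numpy as np

def hat_basis(i, x, spacing=1.0):
    """First degree B-spline (hat function) centred on node i."""
    return np.maximum(0.0, 1.0 - np.abs(x / spacing - i))

def velocity(x, v, spacing=1.0):
    """u(x) = sum_i v_i * rho_i(x) for a coefficient vector v."""
    return sum(v_i * hat_basis(i, x, spacing) for i, v_i in enumerate(v))

v = np.array([0.0, 0.5, -0.25, 1.0])   # one coefficient per node
x = np.linspace(0.0, 3.0, 13)          # positions between and on the nodes
u = velocity(x, v)
```

Here `u` agrees with `np.interp(x, np.arange(4), v)`: between nodes the field is a linear blend of the two neighbouring coefficients, and at a node it equals that node's coefficient.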

Optimization of DARTEL

•  The aim is to estimate the single “best” set of values for these parameters (v). The objective function, which is the measure of “goodness”, is formulated as the most probable deformation, given the data (D).
•  The objective is to find the most probable parameter values and not the actual probability density, so the normalizing factor p(D) in Bayes' rule is ignored. The single most probable estimate of the parameters is known as the maximum a posteriori (MAP) estimate.
•  Equivalently, the MAP estimate minimizes the negative logarithm of p(D|v) p(v).
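
In symbols, the MAP estimate described above follows from Bayes' rule. The hat notation for the estimate is an assumption introduced here; the two terms of the final expression correspond to the likelihood and prior components discussed later in the lecture.

```latex
p(\mathbf{v} \mid D) = \frac{p(D \mid \mathbf{v})\, p(\mathbf{v})}{p(D)},
\qquad
\hat{\mathbf{v}}
  = \arg\max_{\mathbf{v}} \; p(D \mid \mathbf{v})\, p(\mathbf{v})
  = \arg\min_{\mathbf{v}} \; \bigl[ -\log p(D \mid \mathbf{v}) - \log p(\mathbf{v}) \bigr]
```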

85

Optimization of DARTEL

•  Many nonlinear registration approaches search for a maximum a posteriori (MAP) estimate of the parameters defining the warps, which corresponds to the mode of the probability density.
•  The Levenberg–Marquardt (LM) algorithm is a very good general purpose optimization strategy.

88

A: second order tensor field, H: concentration matrix, b: first derivative of the likelihood function.

Optimization of DARTEL

Simultaneously minimize the sum of
–  Likelihood component
•  Sum of squares difference
•  ½ ∑i ∑k (tk(xi) – μk(φ(1)(xi)))^2
•  φ(1) parameterized by u
–  Prior component
•  A measure of deformation roughness
•  ½ u^T H u

89

Optimization of DARTEL

PRIOR TERM
•  ½ u^T H u
•  DARTEL has three different models for H
–  Membrane energy
–  Linear elasticity
–  Bending energy
•  H is very sparse
•  H: deformation roughness

90

An example H for 2D registration of 6x6 images (linear elasticity)

Optimization of DARTEL

LIKELIHOOD TERM
•  Images are assumed to be partitioned into different tissue classes.
–  E.g., a 3-class registration simultaneously matches:
•  Grey matter with grey matter
•  White matter with white matter
•  Background (1 – GM – WM) with background

91

“Membrane Energy”

[Figure panels: convolution kernel, sparse matrix representation]

Penalizes first derivatives: the sum of squares of the elements of the Jacobian (matrices) of the flow field.

“Bending Energy”

[Figure panels: sparse matrix representation, convolution kernel]

Penalizes second derivatives.
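
A 1-D sketch shows how both penalties take the form ½ u^T H u with a sparse H. Everything here is a simplification assumed for illustration: the function names, the finite-difference construction H = D^T D, the dense NumPy storage, and the 1-D flow field (the slides show 2-D and 3-D kernels).

```python
import numpy as np

def difference_matrix(n, order):
    """Matrix D such that D @ u equals the order-th finite difference of u."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D

def roughness(u, order):
    """Return (0.5 * u^T H u, H) with H = D^T D for the chosen derivative order."""
    D = difference_matrix(u.size, order)
    H = D.T @ D
    return 0.5 * float(u @ H @ u), H

u = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0])
membrane_energy, H_membrane = roughness(u, order=1)   # penalizes first derivatives
bending_energy, H_bending = roughness(u, order=2)     # penalizes second derivatives
```

`H_membrane` is tridiagonal and `H_bending` has five non-zero diagonals, which is the sparsity the prior-term slide refers to. A linear ramp has zero bending energy but non-zero membrane energy.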

“Linear Elasticity”

•  Decompose the Jacobian of the flow field into
–  Symmetric component
•  ½(J + J^T)
•  Encodes non-rigid part.
–  Anti-symmetric component
•  ½(J – J^T)
•  Encodes rigid-body part.
•  Penalise sum of squares of symmetric part.
•  Trace of Jacobian encodes volume changes. Also penalized.
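
The decomposition can be checked on small 2 x 2 Jacobians. This is a sketch: the function names and the weights `mu` and `lam` are hypothetical parameters introduced here to combine the two penalized quantities, not values from the lecture.

```python
import numpy as np

def decompose_jacobian(J):
    """Split J into its symmetric (non-rigid) and anti-symmetric (rigid) parts."""
    sym = 0.5 * (J + J.T)
    anti = 0.5 * (J - J.T)
    return sym, anti

def linear_elastic_penalty(J, mu=1.0, lam=1.0):
    """Penalize the squared symmetric part and the squared trace (volume change)."""
    sym, _ = decompose_jacobian(J)
    return float(mu * np.sum(sym ** 2) + 0.5 * lam * np.trace(J) ** 2)

J_rotation = np.array([[0.0, -0.1], [0.1, 0.0]])   # small rotation of the flow field
J_scaling = np.array([[0.1, 0.0], [0.0, 0.1]])     # uniform expansion

penalty_rotation = linear_elastic_penalty(J_rotation)
penalty_scaling = linear_elastic_penalty(J_scaling)
```

The small rotation is purely anti-symmetric and is not penalized, whereas the uniform expansion is penalized through both the symmetric part and the trace.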

Gauss-Newton Optimization

•  Uses Gauss-Newton
–  Requires a matrix solution to a very large set of equations at each iteration

u(k+1) = u(k) – (H + A)^-1 b

–  b are the first derivatives of the objective function
–  A is a sparse matrix of second derivatives
–  Computed efficiently, making use of scaling and squaring
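
The update rule can be tried on a toy problem. In this sketch the objective ½ u^T H u + ½ ||u - t||^2, the target vector `t`, and the 1-D membrane-style H are assumptions chosen so that A and b have closed forms. In DARTEL they come from the image likelihood instead.

```python
import numpy as np

def gauss_newton_step(u, H, A, b):
    """One update u <- u - (H + A)^-1 b, solved without forming the inverse."""
    return u - np.linalg.solve(H + A, b)

n = 6
D = np.diff(np.eye(n), axis=0)
H = D.T @ D                                      # toy prior (membrane-like)
t = np.array([0.0, 1.0, 0.0, 2.0, 0.0, 1.0])     # toy data the field should match

u = np.zeros(n)
b = H @ u + (u - t)      # first derivatives of the full objective at u
A = np.eye(n)            # second derivatives of the toy likelihood term
u_new = gauss_newton_step(u, H, A, b)
gradient_after = H @ u_new + (u_new - t)
```

Because this toy objective is quadratic, a single step lands on the minimum and `gradient_after` is numerically zero. Real registration objectives are not quadratic, so the step is repeated until convergence.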

95

Example

96

Large Deformation Model

•  In the small deformation model, the displacements are stored at each voxel w.r.t. their initial position.
•  Since regularization acts on the displacement field, highly localized deformations cannot be modeled, because the deformation energy caused by stress increases proportionally with the strength of the deformation.
•  In the large deformation model, the displacement is generated via a time-dependent velocity field v

100

where u(x,y,z,0)=(x,y,z).

Many approaches are based on a pre-processing step!

101

102

103

Image Gradients

104

Image Gradients

105

Image Gradients

106

107

Entropy Images

108

Entropy Images

109

Entropy Images

110

Entropy Images

111

112

Phase-based Registration

113

114

Multi-Resolution Registration

115

116

Attribute Vectors

117

Recent Approaches

118

Example: Joint Segmentation and Registration

119

Example: CNN based Registration!

120

The CNN-based approach addresses the two major limitations of existing intensity-based 2-D/3-D registration technology: 1) slow computation and 2) small capture range. It directly estimates the transformation parameters, unlike iterative optimization.

Example: CNN based Registration!

1.  Parameter Space Partition (PSP)
–  partition the transformation parameter space into zones and train CNN regressors in each zone separately, to break down the complex regression task into multiple simpler sub-tasks.
2.  Local Image Residual (LIR)
–  simplify the underlying mapping to be captured by the regressors. LIR is calculated as the difference between the DRR rendered using transformation parameters t, and the X-ray image in local patches.
3.  Hierarchical Parameter Regression
–  decompose the transformation parameters and regress them in a hierarchical manner, to achieve highly accurate parameter estimation in each step.
4.  CNN Regression Model

121

Example: CNN based Registration!

122

Workflow of LIR feature extraction, demonstrated on X-ray Echo Fusion data. The local ROIs determined by the 3-D points and the transformation parameters are shown as red boxes. The blue box shows a large ROI that covers the entire object.

EVALUATION

123

124

Intensity Based Metrics

125

Landmark Based Metrics

•  Identify a set of anatomical landmarks, and measure the distance between them before/after registration
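
Measuring landmark distances before and after registration can be sketched as follows. The function name, the landmark coordinates and the pure-translation transform are hypothetical illustration values.

```python
import numpy as np

def landmark_errors(fixed_pts, moving_pts, transform=None):
    """Euclidean distance between corresponding landmarks.

    With transform=None this is the error before registration; otherwise
    the moving landmarks are first mapped by the estimated transform.
    """
    mapped = moving_pts if transform is None else transform(moving_pts)
    return np.linalg.norm(mapped - fixed_pts, axis=1)

fixed = np.array([[10.0, 10.0, 10.0], [20.0, 5.0, 7.0]])
moving = fixed + np.array([3.0, 4.0, 0.0])            # misaligned by a known shift

def estimated(points):
    """Toy registration result: undo the known shift."""
    return points - np.array([3.0, 4.0, 0.0])

errors_before = landmark_errors(fixed, moving)
errors_after = landmark_errors(fixed, moving, transform=estimated)
```

Reporting the mean and maximum of these per-landmark errors, before and after registration, gives a direct measure of accuracy at the landmark locations only.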

126

Landmark Based Metrics

•  Non-rigid registration error can be quantified with certainty only at the available landmarks and with increasing uncertainty as distance from these landmarks increases.
•  A very dense set of landmarks, which is generally not available, would thus be required to gain a complete, global understanding of registration accuracy.

128

Segmentation Based Metrics

•  Tissue overlaps (after segmentation)
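
Two widely used tissue-overlap measures, the Jaccard index and the Dice coefficient, can be sketched as below. The function names and the toy 1-D label masks are assumptions, and returning 1.0 for two empty masks is a convention chosen here.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return float(2.0 * np.logical_and(a, b).sum() / total) if total else 1.0

fixed_label = [1, 1, 1, 0, 0, 0]    # tissue mask in the fixed image
warped_label = [0, 1, 1, 1, 0, 0]   # same tissue in the warped moving image

jaccard_value = jaccard(fixed_label, warped_label)
dice_value = dice(fixed_label, warped_label)
```

The two measures rank results identically, since Dice = 2J / (1 + J). Which one is reported is a matter of convention.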

129

Jacobian Determinants

•  The Jacobian determinant is used to define a measure for the plausibility of the transformation by calculating the number of voxels with a negative value or a value of zero:

130

A value det(∇φ(x)) > 1 implies an expansion of volume, and 0 < det(∇φ(x)) < 1 indicates a contraction. φ(x): deformation field.
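
Counting voxels with a non-positive Jacobian determinant can be sketched in 2-D with finite differences. The function names, the unit pixel spacing and the three toy deformations are assumptions for illustration.

```python
import numpy as np

def jacobian_determinant_2d(phi_y, phi_x):
    """Determinant of the Jacobian of a 2-D deformation phi = (phi_y, phi_x).

    phi_y and phi_x hold the mapped row and column coordinate of every pixel.
    Derivatives are finite differences with unit pixel spacing.
    """
    dphiy_dy, dphiy_dx = np.gradient(phi_y)
    dphix_dy, dphix_dx = np.gradient(phi_x)
    return dphiy_dy * dphix_dx - dphiy_dx * dphix_dy

def count_folded_pixels(det):
    """Number of pixels whose determinant is negative or zero."""
    return int(np.count_nonzero(det <= 0))

yy, xx = np.meshgrid(np.arange(8.0), np.arange(8.0), indexing="ij")
det_identity = jacobian_determinant_2d(yy, xx)              # no deformation
det_expand = jacobian_determinant_2d(1.5 * yy, 1.5 * xx)    # uniform expansion
det_fold = jacobian_determinant_2d(yy, -xx)                 # mirrored, not plausible
```

The identity gives a determinant of 1 everywhere, the expansion gives 2.25 (area grows), and the mirrored field is negative at every pixel, so all 64 pixels are counted as implausible.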

However, …

•  Intensity, landmark, and segmentation based methods are highly unreliable!

132

CURT: Completely Useless Registration Tool

Tissue overlap measures (Jaccard index)

Inverse Consistency Error

•  For each pair of fixed and moving images, A and B, and for each registration algorithm, the inverse consistency error between the forward transformation, T_AB, and the backward transformation, T_BA, is computed as follows:
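
One common definition, assumed here, averages the distance between each point and its image under the round trip T_AB(T_BA(x)). The function name, the sample points and the two toy translation transforms are hypothetical.

```python
import numpy as np

def inverse_consistency_error(t_ab, t_ba, points):
    """Mean distance between each point x and its round trip T_AB(T_BA(x))."""
    round_trip = t_ab(t_ba(points))
    return float(np.mean(np.linalg.norm(round_trip - points, axis=1)))

points = np.array([[0.0, 0.0], [4.0, 2.0], [7.0, 9.0]])

def forward(p):
    """Toy forward transformation: a translation."""
    return p + np.array([1.0, 2.0])

def exact_backward(p):
    """Exact inverse of the forward translation."""
    return p - np.array([1.0, 2.0])

def biased_backward(p):
    """Backward estimate that is off by half a pixel in the second axis."""
    return p - np.array([1.0, 1.5])

ice_exact = inverse_consistency_error(forward, exact_backward, points)
ice_biased = inverse_consistency_error(forward, biased_backward, points)
```

An error of zero shows only that the two transformations are inverses of each other, not that either one is anatomically correct.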

133

Rohlfing’s TMI Paper (2012) suggests that

•  Tissue overlap, image similarity, and inverse consistency error are not reliable surrogates for registration accuracy, whether they are used in isolation or combined.
•  Ideally, actual registration errors measured at a large number of densely distributed landmarks (i.e., identifiable anatomical locations) should become the standard for reporting registration errors.
•  The evaluation of registration performance should be made as independent as possible from the registration itself. Some obvious dependencies exist between images used for registration and features derived from them for evaluation.

136

Summary

137

Slide Credits and References

•  Darko Zikic, MICCAI 2010 Tutorial
•  Stefan Klein, MICCAI 2010 Tutorial
•  Marius Staring, MICCAI 2010 Tutorial
•  J. Ashburner, NeuroImage 2007.
•  M. F. Beg, M. I. Miller, A. Trouvé and L. Younes. “Computing Large Deformation Metric Mappings via Geodesic Flows of Diffeomorphisms”. International Journal of Computer Vision 61(2):139–157 (2005).
•  M. Vaillant, M. I. Miller, L. Younes and A. Trouvé. “Statistics on diffeomorphisms via tangent space representations”. NeuroImage 23:S161–S169 (2004).
•  L. Younes, “Jacobi fields in groups of diffeomorphisms and applications”. Quart. Appl. Math. 65:113–134 (2007).
•  www.nirep.org
•  Rueckert and Schnabel’s Chapter 5, Medical Image Registration, Springer 2010. Biomedical Image Processing.

138