
Acceleration of Medical Image Registration using Graphics Process Units

in Computing Normalized Mutual Information

Wei-Hung Cheng, Cheng-Chang Lu

Department of Computer Science

Kent State University


Introduction

The computing capacities of graphics processing units (GPUs) have improved exponentially over the past decade.

NVIDIA released a CUDA programming model for GPUs.

The CUDA programming environment applies the parallel processing capabilities of the GPUs to medical image processing research.

September, 2009 Kent State University 2


Multi-resolution Approach Registration

Why Multi-resolution?

Optimization methods cannot guarantee that a global optimum will be found.

Time to evaluate the registration criterion is proportional to the number of voxels.

The result at coarser level is used as the starting point for the finer level.

Using multi-resolution in conjunction with maximization of mutual information has been proven very helpful in registration processes.

Multi-resolution is supplemented with binarization, giving improved accuracy.
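The coarse-to-fine idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles integer translations only, uses sub-sampling for the coarse level, and substitutes a sum of squared differences for the mutual information criterion to keep the example short. The function names `downsample`, `best_shift`, and `coarse_to_fine` are hypothetical.

```python
import numpy as np

def downsample(img, factor):
    """Sub-sample by keeping every `factor`-th pixel along each axis."""
    return img[::factor, ::factor]

def best_shift(fixed, moving, center, radius):
    """Exhaustive search over integer shifts within `radius` of `center`.

    Returns the (dy, dx) shift of `moving` with the lowest sum of squared
    differences against `fixed` (a stand-in for the NMI criterion).
    """
    best, best_cost = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            cost = np.sum((fixed - shifted) ** 2)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def coarse_to_fine(fixed, moving, factor=4, coarse_radius=4):
    """Two-level search: the coarse result seeds the full-resolution search."""
    coarse = best_shift(downsample(fixed, factor),
                        downsample(moving, factor), (0, 0), coarse_radius)
    # Scale the coarse estimate up and refine it within +/- `factor` pixels.
    start = (coarse[0] * factor, coarse[1] * factor)
    return best_shift(fixed, moving, start, factor)
```

With these defaults the coarse level evaluates 81 shifts on an image 16 times smaller, and the fine level evaluates another 81 shifts at full resolution, instead of the 1089 full-resolution evaluations a single-level search over the same range of plus or minus 16 pixels would need.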


A Binarization Registration Approach

Background segmentation.

Linear binning size.

Using sub-sampling multi-resolution.

For CT-MR registration.


Two stages:

1. Region Growing: Segmentation into background and foreground.

2. Two-level Registration: Binarized 2-bin images are input to the lower level.

Down-sampled binarized images are the input to the first level.

The result of the first level is the initial estimate for the second level.

The second level performs the registration of full images, using maximization of Normalized Mutual Information.


Normalized Mutual Information

Mutual information

It is applied to measure the statistical dependence between the image intensities of corresponding voxels in both images, which is assumed to be maximal if the images are geometrically aligned.

$$
\begin{aligned}
MI(A,B) &= \sum_{a}\sum_{b} P_{AB}(a,b)\,\log\frac{P_{AB}(a,b)}{P_A(a)\,P_B(b)} \\
        &= H(B) - H(B|A) \\
        &= H(A) - H(A|B) \\
        &= H(A) + H(B) - H(A,B)
\end{aligned}
$$
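As a small numerical check of the definitions above, the sketch below estimates mutual information from a joint histogram and compares it with the entropy form H(A) + H(B) - H(A,B). It is an illustration only: the 32-bin histogram, the natural logarithm, and the function names are assumptions made here, not details taken from the slides.

```python
import numpy as np

def joint_distribution(a, b, bins=32):
    """Estimate P_AB(a, b) from the joint intensity histogram of two images."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    return hist / hist.sum()

def entropy(p):
    """Shannon entropy (natural log) of a probability array, skipping zeros."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(p_ab):
    """MI computed directly from the double-sum definition."""
    p_a = p_ab.sum(axis=1)          # marginal P_A(a)
    p_b = p_ab.sum(axis=0)          # marginal P_B(b)
    product = np.outer(p_a, p_b)    # P_A(a) * P_B(b)
    nz = p_ab > 0                   # empty cells contribute nothing
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / product[nz])))
```

For any pair of images, `mutual_information(p_ab)` and `entropy(p_ab.sum(axis=1)) + entropy(p_ab.sum(axis=0)) - entropy(p_ab)` agree up to floating-point rounding, which mirrors the last line of the equation.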


Normalized Mutual Information

Studholme et al.:

$$NMI(A,B) = \frac{H(A) + H(B)}{H(A,B)}$$

where

$$H(A) = -\sum_{a} P_A(a)\,\log P_A(a)$$

and

$$H(A,B) = -\sum_{a}\sum_{b} P_{AB}(a,b)\,\log P_{AB}(a,b)$$

NMI compensates for the sensitivity of MI to changes in image overlap.
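The NMI ratio can be evaluated from the same joint histogram. The sketch below is a hedged illustration with an assumed 32-bin histogram and a hypothetical function name `nmi`; it is not the authors' code.

```python
import numpy as np

def nmi(a, b, bins=32):
    """Normalized mutual information (H(A) + H(B)) / H(A, B)."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = hist / hist.sum()

    def entropy(p):
        p = p[p > 0]                      # treat 0 * log(0) as 0
        return -np.sum(p * np.log(p))

    h_a = entropy(p_ab.sum(axis=1))       # marginal entropy H(A)
    h_b = entropy(p_ab.sum(axis=0))       # marginal entropy H(B)
    return float((h_a + h_b) / entropy(p_ab))
```

Two limiting cases help in reading the value: an image compared with itself has H(A,A) = H(A), so the ratio is 2, while statistically independent images have H(A,B) close to H(A) + H(B), so the ratio is close to 1.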


A Binarization Approach

Region Growing Implementation

Finding Starting Points: Easy to select a point as seed for background.

Similarity Criteria: Threshold T can be extracted from the histogram.

Figure: Typical histogram for a CT image (left) and an MR image (right).
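A minimal sketch of background region growing, assuming a 2D image, a seed known to lie in the background (for example a corner pixel), 4-connectivity, and a similarity criterion of "intensity below T". The slides do not specify these details, so they are illustrative choices, and `grow_background` is a hypothetical name.

```python
from collections import deque

import numpy as np

def grow_background(img, seed, threshold):
    """Return a boolean mask of pixels connected to `seed` with intensity < threshold."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    if img[seed] >= threshold:
        return mask                       # the seed itself is not background
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w
                    and not mask[ny, nx] and img[ny, nx] < threshold):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask
```

The binarized 2-bin image is then simply the complement of the mask. Unlike plain thresholding, dark regions enclosed by the object are not reachable from the seed and therefore stay in the foreground.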


A Binarization Approach

Median and maximum error between the prospective gold standard and several retrospective registration techniques. Ours is labeled as LO1.

[Figure panels: Median Error; Maximum Error]


Non-Linear Binning Approach

Background segmentation

Segmented image as input to K-means Clustering

1. Initially partition the image voxels into k bins, where k = 256.

1a. Put all the background voxels into bin 0.

1b. Calculate the step size for the other k-1 bins using

$$\text{step} = \frac{\text{MaxIntensity} - \text{MinIntensity}}{k - 1}$$

Each bin will be assigned all voxels whose intensity falls within the range of its boundary.

1c. Calculate the centroid of each bin.


Non-Linear Binning Approach

K-means Clustering (cont.)

2. For each voxel in the image, compute the distances to the centroids of its current, previous, and next bins (where they exist). If the voxel is not currently in the bin with the closest centroid, switch it to that bin and update the centroids of both bins.

3. Repeat step 2 until convergence is achieved; that is, continue until a pass through all the voxels in the image causes no new assignments, or until the maximum number of iterations (500) is reached.
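Steps 1 to 3 can be sketched as follows. This is an approximation under stated assumptions rather than the authors' code: intensities are passed as a flat array with a boolean background mask, centroids are recomputed once per pass instead of after every single switch, and an empty bin falls back to the midpoint of its initial range. The name `nonlinear_binning` is hypothetical.

```python
import numpy as np

def nonlinear_binning(intensities, background, k=256, max_iter=500):
    """Assign each voxel a bin label in 0..k-1 (bin 0 holds the background)."""
    intensities = np.asarray(intensities, dtype=float)
    background = np.asarray(background, dtype=bool)
    labels = np.zeros(intensities.shape, dtype=int)       # step 1a
    vals = intensities[~background]
    lo, hi = vals.min(), vals.max()
    step = (hi - lo) / (k - 1)                            # step 1b
    if step == 0:                                         # all foreground values equal
        labels[~background] = 1
        return labels
    lab = 1 + np.minimum(((vals - lo) / step).astype(int), k - 2)
    midpoints = lo + (np.arange(k) - 0.5) * step          # fallback for empty bins
    for _ in range(max_iter):                             # step 3
        counts = np.bincount(lab, minlength=k)
        sums = np.bincount(lab, weights=vals, minlength=k)
        cents = midpoints.copy()
        cents[counts > 0] = sums[counts > 0] / counts[counts > 0]   # step 1c
        best, best_d = lab.copy(), np.abs(vals - cents[lab])
        for off in (-1, 1):                               # step 2: previous / next bin
            cand = np.clip(lab + off, 1, k - 1)
            d = np.abs(vals - cents[cand])
            better = d < best_d
            best[better], best_d[better] = cand[better], d[better]
        if np.array_equal(best, lab):                     # no new assignments
            break
        lab = best
    labels[~background] = lab
    return labels
```

For example, with `k=4` and foreground intensities 10, 11, 36, 40, 41, 92, the value 36 starts in the first linear bin (centroid 19) but is closer to the centroid of the second bin (40.5), so it moves there on the first pass and the bin boundaries become non-linear.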

Two-level Registration


Non-Linear Binning Approach

Median and maximum error between the prospective gold standard and several retrospective registration techniques. Ours is labeled as LO2.

[Figure panels: Median Error; Maximum Error]


Wavelet-based Multi-resolution Approach

Description:

Multi-resolution: improves optimization speed and capture range.

The wavelet transform converts images into a multi-scale representation.

A wavelet decomposition can be computed by passing the image through a series of filter-bank stages.


Wavelet-based Multi-resolution Approach

Implementation:

Daubechies wavelet filter coefficients (DAUB4).

Four-level WT for the registration of 41 CT-MR pairs.

Three-level WT for the registration of 35 PET-MR pairs.
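To make the filter-bank idea concrete, here is a sketch of one DAUB4 stage and its repeated application. It works on a 1D signal with periodic wrap-around, which is a simplification chosen here for brevity; the slides apply the transform to image volumes and do not state the boundary handling. The function names are hypothetical, while the four coefficients are the standard DAUB4 values.

```python
import numpy as np

SQ3, SQ2 = np.sqrt(3.0), np.sqrt(2.0)
# Standard DAUB4 low-pass (smoothing) coefficients c0..c3.
C = np.array([1 + SQ3, 3 + SQ3, 3 - SQ3, 1 - SQ3]) / (4 * SQ2)
# Matching high-pass (detail) filter: (c3, -c2, c1, -c0).
G = np.array([C[3], -C[2], C[1], -C[0]])

def daub4_step(signal):
    """One filter-bank stage: returns (approximation, detail), each half as long."""
    n = len(signal)                       # n must be even and at least 4
    idx = (np.arange(0, n, 2)[:, None] + np.arange(4)[None, :]) % n
    windows = signal[idx]                 # 4-sample windows, stride 2, wrapped
    return windows @ C, windows @ G

def daub4_multilevel(signal, levels):
    """Apply the stage repeatedly to the approximation (coarser each level)."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = daub4_step(approx)
        details.append(d)
    return approx, details
```

Each level halves the length of the approximation, so a four-level transform of a 64-sample signal leaves a 4-sample coarse version plus detail bands of 32, 16, 8, and 4 samples. The coarse versions are what a multi-resolution registration would use as its lower-resolution inputs.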


Wavelet-based Multi-resolution Approach

Result (cont.)

Figure: A typical superposition of PET-MR images. Left: before registration. Right: after registration.


Wavelet-based Multi-resolution Approach

Median and maximum error between the prospective gold standard and several retrospective registration techniques. Ours is labeled as LO3.

[Figure panels: Median Error; Maximum Error]


CUDA Programming Model

A parallel programming model and software environment designed to handle parallel computing tasks.

Similar to the traditional single instruction, multiple data (SIMD) parallel model.

Major abstractions:

– a hierarchy of thread groups

– shared memories

– barrier synchronization

It provides a programming model for data parallelism, thread parallelism, and task parallelism.


Computation paradigm of CUDA

A program is divided into blocks.

– A block is a group of threads mapped to a single multiprocessor by the programmer so that they can share memory.

The data is also divided among all threads in a SIMD fashion by the programmer.

All threads are organized into warps.

– Each warp is a group of 32 parallel scalar threads, which can run concurrently on a multiprocessor.

A collection of warps is known as a thread block.

Figure 3.2. An Example of CUDA Thread Organization. The host launches Kernel 1 and Kernel 2; each kernel runs on the device as a grid (Grid 1, Grid 2) of thread blocks, Block(0, 0) through Block(1, 1), and each block, such as Block (1, 1), contains threads indexed in three dimensions, for example Thread(0,0,0) through Thread(3,1,0). Courtesy: NVIDIA.
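The index arithmetic behind this organization can be written out in a few lines. The sketch below only emulates on the CPU how a thread's block index, block size, and thread index combine; it is not CUDA code, and the function names are hypothetical. The formulas follow the thread-ID convention described in the CUDA Programming Guide.

```python
def global_thread_index(block_idx, block_dim, thread_idx):
    """Per-axis global index: blockIdx * blockDim + threadIdx."""
    return tuple(b * d + t for b, d, t in zip(block_idx, block_dim, thread_idx))

def linear_thread_id(thread_idx, block_dim):
    """Thread ID inside a block of size (Dx, Dy, Dz): x + y*Dx + z*Dx*Dy."""
    x, y, z = thread_idx
    dx, dy, _ = block_dim
    return x + y * dx + z * dx * dy

def warp_of(thread_idx, block_dim, warp_size=32):
    """Warps group consecutive thread IDs, 32 at a time."""
    return linear_thread_id(thread_idx, block_dim) // warp_size
```

Taking the block in the figure to be 4 x 2 x 2 threads, Thread(2,1,0) of Block(1, 1) has the global index (6, 3, 0), which is the kind of mapping a kernel uses to pick the voxel it is responsible for.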


The memory hierarchy consists of registers, constant memory, global memory, and texture memory.

– registers: the fastest level in the hierarchy, with a limited amount of space.

– constant memory: a subset of device memory that cannot be modified at run-time by the device.

– global memory: permits read and write operations from all threads, but is uncached and has long latencies.

– texture memory: a subset of device memory that is read-only on the device, offers faster cached reads, and allows addressing through a specialized texture unit.


CUDA Algorithm for NMI registration

The registration procedure spent 90-95% of its run-time on mutual information computation.

Medical image registration has a high level of data parallelism, and image data can be mapped onto the GPU platform.

Based on our normalized mutual information method, the tasks are classified into four CUDA kernels as follows:

Transformation – This group performs the coordinate transform, affine transform, and mapping matrix to establish spatial correspondence between the two images.

Interpolation – This group involves iteratively transforming image A with respect to image B while optimizing the MI measure, which is calculated from corresponding voxel values.

Histogram – This group computes a joint histogram of the pair of images to evaluate the mutual information.

Optimization – This group searches for the optimal transformation estimate by evaluating its similarity measure.
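The data flow through the first three groups can be sketched on the CPU as below. This is a serial NumPy illustration under assumptions made here (2D images, nearest-neighbour interpolation, integer intensities already reduced to `bins` levels, hypothetical function names); it does not reproduce the authors' CUDA kernels, and the optimization group is left out.

```python
import numpy as np

def transform_coords(shape, matrix, offset):
    """Transformation: map every reference pixel (row, col) into the floating image."""
    rows, cols = np.indices(shape)
    grid = np.stack([rows.ravel(), cols.ravel()]).astype(float)   # 2 x N
    return matrix @ grid + np.asarray(offset, dtype=float)[:, None]

def interpolate_nearest(image, coords):
    """Interpolation: sample the floating image at the transformed positions."""
    idx = np.rint(coords).astype(int)
    inside = ((idx[0] >= 0) & (idx[0] < image.shape[0]) &
              (idx[1] >= 0) & (idx[1] < image.shape[1]))
    values = np.zeros(coords.shape[1], dtype=image.dtype)
    values[inside] = image[idx[0, inside], idx[1, inside]]
    return values, inside

def joint_histogram(ref_values, flt_values, bins):
    """Histogram: count co-occurring intensity pairs in the overlap region."""
    hist = np.zeros((bins, bins), dtype=np.int64)
    np.add.at(hist, (ref_values, flt_values), 1)   # handles repeated index pairs
    return hist
```

Every output position is computed independently in the first two functions, which is the data parallelism the slide refers to. The histogram step is different in kind because many positions update the same counter, so a parallel version generally needs atomic updates or per-block partial histograms that are merged afterwards.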


The CUDA implementation consists of four stages:

1) Allocate data memory on the device and transfer the data from the host to the device.

2) Set up the kernel execution configuration.

3) Launch the kernel(s) and store the results in device memory.

4) Transfer the results from device memory to host memory.


CUDA Implementation Experimental Results

• The experiments involved the data sets of 7 patients, each consisting of a Computed Tomography (CT) volume and six Magnetic Resonance (MR) volumes.

• The experiments ran on a PC with a 2.40 GHz Intel® Core™ 2 Quad CPU, 4 GB of DDR2 memory, and an NVIDIA GeForce 9600 GT graphics card.

• All CT images were registered to the MR images, using the MR image as the reference image.

• The registration procedure was run on both the CPU-based platform (C program) and the GPU platform (CUDA program).

• Experimental results showed that the GPU implementation improves the computational performance of registration, with a speedup factor of 23.4×.


Table: Data Set Image Characteristics

Image        Image Size (voxels)    Voxel Size (mm)
CT           512² × (28-34)         0.653595² × 4.0
MR PD        256² × (20-26)         1.25² × 4.0
MR PD Re.    256² × (20-26)         (1.25-1.26)² × (4.04-4.11)
MR T1        256² × (20-26)         1.25² × 4.0
MR T1 Re.    256² × (20-26)         (1.25-1.26)² × (4.04-4.12)
MR T2        256² × (20-26)         1.25² × 4.0
MR T2 Re.    256² × (20-26)         (1.25-1.27)² × (4.04-4.16)


Comparison of GPU and CPU-based implementation for the registration procedure runtimes

Run Time Range for the 41-pair data set on CPU and GPU

Architecture    Run Time (mins.)    Average (mins.)    Speedup
CPU-based       7.41 ~ 18.34        12.20              1
GPU-based       0.33 ~ 0.633        0.5                23.4
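The speedup column is the ratio of average run times. The short calculation below uses only the numbers in the table; the interpretation in the comments is an inference made here, not something the slides state.

```python
def speedup(cpu_minutes, gpu_minutes):
    """Speedup factor = CPU run time / GPU run time."""
    return cpu_minutes / gpu_minutes

cpu_avg = 12.20          # average CPU run time from the table (minutes)
gpu_avg_rounded = 0.5    # average GPU run time as printed in the table
reported = 23.4          # speedup as printed in the table

# Ratio of the two printed averages.
from_rounded = speedup(cpu_avg, gpu_avg_rounded)

# GPU average that would reproduce the reported factor exactly.
implied_gpu_avg = cpu_avg / reported
```

The printed averages give a ratio of 24.4, while the reported factor of 23.4 corresponds to a GPU average of roughly 0.52 minutes. One plausible reading is that the GPU average was rounded to one decimal place in the table.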


References

• A. Collignon, F. Maes, D. Delaere, D. Vandermeulen, P. Suetens, and G. Marchal. Automated multi-modality image registration based on information theory. Information Process. Med. Imaging, pages 263–274, 1995.

• R. Gonzalez and R. Woods. Digital Image Processing. Prentice Hall Press, 2nd edition, 2002.

• M. Harris. Mapping computational concepts to GPUs. GPU Gems 2, chapter 31. Addison Wesley, Mar. 2005.

• F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens. Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imaging, 16(2):187–198, 1997.

• J. B. A. Maintz and M. A. Viergever. A survey of medical image registration. Medical Image Analysis, 2(1):1–36, 1998.

• NVIDIA. NVIDIA CUDA Compute Unified Device Architecture. Programming Guide, Version 2.0. NVIDIA, 2008.

• J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, and J. C. Phillips. GPU computing. Proceedings of the IEEE, 96(5):879–899, 2008.

• D. L. Pham, C. Xu, and J. L. Prince. Current methods in medical image segmentation. Annual Review of Biomedical Engineering, 2, 2000.

• W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes in C: The Art of Scientific Computing. 2nd ed. Cambridge, U.K.: Cambridge University Press, pages 394–455, 1999.

• C. Studholme, D. L. G. Hill, and D. J. Hawkes. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition, 32(1):71–86, 1999.

• W. M. Wells III, P. Viola, H. Atsumi, S. Nakajima, and R. Kikinis. Multi-modal volume registration by maximization of mutual information. Medical Image Analysis,