
An Algorithm to Compute Independent Sets of Voxels for Parallelization of ICD-based Statistical Iterative Reconstruction

Sungsoo Ha and Klaus Mueller

Department of Computer Science

Visual Analytics and Imaging (VAI) Lab

Stony Brook University and SUNY Korea

Motivation

• Statistical Iterative Reconstruction Algorithm

[Figure: side-by-side comparison of FBP and SIR]

• Weighted Least Squares (WLS) cost function

\[ \hat{\mathbf{x}} = \arg\min_{\mathbf{x} \ge 0} \left\{ \frac{1}{2} (\mathbf{y} - \mathbf{A}\mathbf{x})^T \mathbf{W} (\mathbf{y} - \mathbf{A}\mathbf{x}) + R(\mathbf{x}) \right\} \]

y     Measured projection data
x     Attenuation coefficients of the object to be reconstructed
A     System matrix of size M x N (M line integrals, N voxels)
W     Diagonal matrix for statistical weighting
R(x)  Regularization
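As a rough illustration of the cost function above, the sketch below evaluates it for a tiny dense problem in plain Python. The function name `wls_cost`, the list-of-lists layout for A, storing W as a list of its diagonal entries, and the toy numbers are all assumptions made for this example; they are not from the slides.

```python
def wls_cost(A, W_diag, y, x, regularizer=lambda x: 0.0):
    """Evaluate 0.5 * (y - A x)^T W (y - A x) + R(x) for small dense inputs.

    A is a list of rows, W_diag holds the diagonal of W (assumed layout).
    """
    # Residual r = y - A x, one entry per line integral.
    residual = [
        y_i - sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        for row, y_i in zip(A, y)
    ]
    # Because W is diagonal, r^T W r reduces to a weighted sum of squares.
    data_term = 0.5 * sum(w_i * r_i * r_i for w_i, r_i in zip(W_diag, residual))
    return data_term + regularizer(x)


# Toy problem: 3 line integrals, 2 voxels (values are illustrative only).
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
W_diag = [1.0, 2.0, 1.0]
x = [1.0, 2.0]
y = [1.0, 3.0, 5.0]

cost = wls_cost(A, W_diag, y, x)
cost_with_penalty = wls_cost(A, W_diag, y, x, regularizer=lambda v: sum(t * t for t in v))
```

Only the last measurement disagrees with A x here, so the data term is driven by a single weighted residual.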

• High cost of forward and back projections
• The iterative nature of the algorithm

Motivation: optimization

                   ICD-based   CG-based
Convergence rate   FAST        SLOW
Parallelization    HARD        EASY

GCD (Fessler et al. 1997)

B-ICD (Benson et al. 2010)

ABCD (Fessler et al. 2011)

• Devise an algorithm
– Find voxels that are “fully” independent of each other
– No additional algorithmic & computational complexity
– More accurate (also more complicated) pattern
– Applicable to all CT geometries


Independence among voxels

• Single voxel update scheme
– Minimizing along one direction at a time

[Equation: single-voxel update, with terms labeled correction, weighting, and update]
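The update equation on this slide did not survive extraction, so the sketch below is not the authors' formula. It shows the textbook coordinate-wise minimizer of the WLS data term alone, with the regularizer R(x) omitted and the non-negativity constraint applied by clipping. The function name `icd_update_voxel` and the toy data are assumptions for illustration.

```python
def icd_update_voxel(A, W_diag, residual, x, j):
    """Update voxel j in place. On entry, residual must equal y - A x.

    Assumption: only the quadratic data term is minimized (no regularizer).
    """
    column = [row[j] for row in A]
    # Weighted correlation between the residual and column j.
    numerator = sum(a * w * r for a, w, r in zip(column, W_diag, residual))
    # Weighted squared norm of column j.
    denominator = sum(a * w * a for a, w in zip(column, W_diag))
    if denominator == 0.0:
        return  # voxel j is not seen by any line integral
    new_value = max(0.0, x[j] + numerator / denominator)
    step = new_value - x[j]
    x[j] = new_value
    # Keep the residual consistent with the new x; only rows where
    # column j is non-zero actually change.
    for i, a in enumerate(column):
        residual[i] -= a * step


# Toy problem whose exact solution is x = [1, 2] (illustrative values).
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
W_diag = [1.0, 1.0, 1.0]
y = [1.0, 3.0, 4.0]
x = [0.0, 0.0]
residual = list(y)  # y - A x with x = 0
for sweep in range(50):
    for j in range(len(x)):
        icd_update_voxel(A, W_diag, residual, x, j)
```

The residual refresh at the end touches only the line integrals that voxel j contributes to, which is why two voxels sharing no line integral can be updated at the same time.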

Single voxel update

[Figure: x-ray source, object, and flat detector, showing voxel A and the detector region related to voxel A]

Independent voxel

[Figure: the same setup with voxel A and voxel B, showing the detector region related to each]

CT system matrix view

• System matrix
– M: # of line integrals
– N: # of voxels
• Independent
– A, B
• Dependent
– A, C
– B, C

[Figure: system matrix view of voxels A, B, and C, marking the overlap between A & C and between B & C]
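One way to read the slide above is that two voxels are independent when their columns of the system matrix share no non-zero row. The sketch below checks that on a toy matrix. The 5 x 3 matrix, the helper names `support` and `are_independent`, and the mapping of columns to voxels A, B, C are made up for illustration.

```python
def support(A, j):
    """Row indices (line integrals) where column j is non-zero."""
    return {i for i, row in enumerate(A) if row[j] != 0}


def are_independent(A, j, k):
    """True when columns j and k have no non-zero row in common."""
    return support(A, j).isdisjoint(support(A, k))


# Toy 5 x 3 system matrix; columns stand for voxels A, B, C (hypothetical).
A_mat = [
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
]
VOXEL_A, VOXEL_B, VOXEL_C = 0, 1, 2

a_b = are_independent(A_mat, VOXEL_A, VOXEL_B)  # no shared row
a_c = are_independent(A_mat, VOXEL_A, VOXEL_C)  # share row 1
b_c = are_independent(A_mat, VOXEL_B, VOXEL_C)  # share row 3
```

In this toy case A and B come out independent while C depends on both, mirroring the A, B, C relationships listed on the slide.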

Finding a set of independent voxels

• Knapsack problem:

\[ \min\ \mathrm{ZERO}\Big\{ \sum_{g \in G} g \Big\} \quad \text{s.t.} \quad G = \{ a_k \mid 1 \le k \le N \}, \quad a_m \cap a_n = \mathbf{0} \;\; \forall\, a_m, a_n \in G,\ m \ne n \]

• Combinatorial NP-hard problem

[Figure: candidate voxels A to F and the selected set G, with A and B accepted and C crossed out]

• First-Fit Decreasing algorithm

1. Sort voxels in descending order of the number of non-zero elements

2. Pick the voxel that contains the largest number of non-zero elements

3. Invalidate all voxels that depend on the selected voxel
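The three steps above can be sketched as a greedy grouping routine. This is a minimal reading of the slide, not the authors' implementation: it assumes each voxel is represented by the set of line-integral indices where its column is non-zero, and that the steps are repeated on the leftover voxels until every voxel belongs to a group. The function name and the toy supports are hypothetical.

```python
def first_fit_decreasing_groups(supports):
    """Partition voxels into groups whose supports are pairwise disjoint.

    supports[j] is the set of line-integral indices touched by voxel j.
    """
    # Step 1: sort voxels by descending number of non-zero elements.
    remaining = sorted(
        range(len(supports)), key=lambda j: len(supports[j]), reverse=True
    )
    groups = []
    while remaining:
        covered = set()  # rows already used by the current group
        group, leftover = [], []
        for j in remaining:
            if covered.isdisjoint(supports[j]):
                # Step 2: take the largest voxel that is still valid.
                group.append(j)
                covered |= supports[j]
            else:
                # Step 3: a voxel that overlaps the group is invalid for it
                # and waits for a later group.
                leftover.append(j)
        groups.append(group)
        remaining = leftover  # assumed repetition over the leftover voxels
    return groups


# Toy supports for four voxels (illustrative only).
supports = [{0, 1}, {3, 4}, {1, 2, 3}, {5}]
groups = first_fit_decreasing_groups(supports)
```

Tracking the union of rows covered by the current group makes the invalidation step a single disjointness test per voxel instead of a pairwise comparison.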

Experiment settings

• Cone-beam CT geometry
• Volume: 128 x 128 x 128 (1 x 1 x 1 mm)
• Flat detector: 512 x 512 (1 x 1 mm)
• SAD: 600 mm
• SID: 1000 mm
• The number of projections
– Varying from 1 to 360
– Uniformly distributed over 360 degrees

Extreme case study

# views   # independent groups   Max. size of independent group   Avg. size of independent group
1         187                    16,186                           11,214
360       13,569                 449                              154

• ABCD (Axial Block Coordinate Descent) algorithm
– Along z-direction: 128
• More parallelism
• No additional complexity

Theoretical parallelism


• Expected speed-up (theoretical parallelism) with ideal GPU implementation

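Estimated gain of GPU-accelerated OS-SIR

\[ \mathrm{gain}^{\mathrm{GPU}}_{\mathrm{OS\text{-}SIR}} = \mathrm{parallelism}\,\big(360 \,/\, \#\text{ of views per subset}\big) \]

\[ \mathrm{parallelism} = \frac{128^3}{\#\text{ updates}} \]

[Figure: plot of the estimated gain against the number of views per subset]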
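As a quick arithmetic check of the parallelism formula, the sketch below plugs in the group counts reported in the extreme case study. It assumes that "# updates" means the number of independent groups, an interpretation that is not stated on the slide but reproduces the reported average group sizes. The comparison against ABCD uses the 128 voxels along z quoted earlier. Variable names are illustrative.

```python
VOLUME_VOXELS = 128 ** 3   # 128 x 128 x 128 volume from the experiment settings
ABCD_PARALLELISM = 128     # voxels along z, as quoted for ABCD

# (# views) -> # independent groups, taken from the extreme case study table.
independent_groups = {1: 187, 360: 13_569}


def theoretical_parallelism(num_updates):
    """Voxels handled per sequential update: 128^3 / # updates."""
    return VOLUME_VOXELS / num_updates


parallelism = {
    views: theoretical_parallelism(count)
    for views, count in independent_groups.items()
}
ratio_vs_abcd = {views: p / ABCD_PARALLELISM for views, p in parallelism.items()}
```

Truncated to integers, the two values are 11,214 and 154, which agree with the average group sizes in the table; at 360 views the ratio over ABCD is a little above 1.2.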

Independence visualization

[Figure: two sets of panels labeled 32 (bottom), 64 (middle), 96 (top)]

• At 360 views

[Figure: panels labeled 32 (bottom) and 96 (top)]

• A clue for optimism

[Figure: panels labeled 32 (bottom) and 96 (top), for 1 view and for 360 views]

Conclusion & Future work

• More parallelism than existing methods
– No additional complexity
– One-time computation
– Applicable to all CT geometries

• Hints for GPU implementation of SIR

• Apply to an actual GPU-accelerated SIR framework
– Determine optimal computational performance
– Convergence rate

Thanks!

• Q&A

• This research was partially supported by NSF grant IIS-11732 and the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ‘IT Consilience Creative Program (ITCCP)’ (NIPA-2013-H0203-13-1001) supervised by NIPA (National IT Industry Promotion Agency).
