Softassign and EM-ICP on GPU

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda, 19th Nov. 2010

description

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda: "Softassign and EM-ICP on GPU", Proc. of UPDAS2010; The 2nd Workshop on Ultra Performance and Dependable Acceleration Systems, In Proc. of ICNC'10, pp. 179-183 (Nov. 2010), Higashi Hiroshima, Japan, November 17-19, 2010.

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda, Marcos Slomp: "CUDA-based implementations of Softassign and EM-ICP," Demonstration presented at CVPR2010; IEEE Conference on Computer Vision and Pattern Recognition, June 15-17, 2010, Hyatt Regency San Francisco, San Francisco, USA.

Transcript of Softassign and EM-ICP on GPU

Page 1: Softassign and EM-ICP on GPU

Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda

19th Nov. 2010

Page 2: Softassign and EM-ICP on GPU

Contribution of this talk

Fast GPU implementations of registration algorithms for 3D point sets.

Softassign [Gold et al., 1998]

EM-ICP [Granger et al., 2002]

(Weighted) Horn’s method [Horn, 1987]

So, what is “registration”?

Page 3: Softassign and EM-ICP on GPU

What is “Registration” or “Alignment” ?

A set of images

Image registration

Page 4: Softassign and EM-ICP on GPU

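Registration of 3D point sets

Takeshi Oishi, Tomohito Masuda, Ryo Kurazume, Katsushi Ikeuchi, "Digital Restoration of the Original Great Buddha of Nara and Its Great Buddha Hall," Transactions of the Virtual Reality Society of Japan, Vol. 10, No. 3, pp. 429-436, Oct. 2005 (in Japanese).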

A statue

Range data from one view

Range data from another view

Aligned (registered) 3D point cloud

An example of rendered CG image of the statue

Page 5: Softassign and EM-ICP on GPU

3D registration algorithm

Input

Two point sets: 𝑋 and 𝑌

Output

Rotation matrix 𝑅

Translation vector 𝒕

X Y

𝑅 and 𝒕
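Stated as an equation, the input/output description above can be read as follows. This is a sketch under the assumption, consistent with the relation 𝒕 = 𝒙 − 𝑅𝒚 used for Horn’s method in this talk, that 𝑌 is moved onto 𝑋; the index $j(i)$ for the (possibly unknown) partner of $\boldsymbol{x}_i$ is notation introduced here for illustration.

```latex
\boldsymbol{x}_i \approx R\,\boldsymbol{y}_{j(i)} + \boldsymbol{t},
\qquad R^\top R = I, \quad \det R = 1
```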

Page 6: Softassign and EM-ICP on GPU

Algorithms for registration

Horn’s method

• Corresponding point sets are given.

• Estimate R and t.

ICP (Iterative closest point)

• Unknown correspondence.

• Fast, standard.

• Easily fails due to local minima.

• Many variants have followed.

Softassign

• Unknown correspondence.

• Robust.

• Very slow because of iterations.

EM-ICP

• Unknown correspondence.

• Robust.

• Very slow because of iterations.

Registration algorithm

Page 7: Softassign and EM-ICP on GPU

Algorithms for registration

Page 8: Softassign and EM-ICP on GPU

Horn’s method: correspondence is known.

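Unknown correspondence: it is not known which point of $X$ matches which point of $Y$.

Known correspondence: $\boldsymbol{x}_1 \leftrightarrow \boldsymbol{y}_1$, $\boldsymbol{x}_2 \leftrightarrow \boldsymbol{y}_2$, …

The point sets are stored as matrices whose rows are the points:

$X = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots)^T$, $Y = (\boldsymbol{y}_1, \boldsymbol{y}_2, \dots)^T$, where $\boldsymbol{x}_1 = (x_{1x}, x_{1y}, x_{1z})^T$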

Page 9: Softassign and EM-ICP on GPU

Horn’s method: correspondence is known.

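$X = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots)^T$, $Y = (\boldsymbol{y}_1, \boldsymbol{y}_2, \dots)^T$

1. Compute centers $\bar{\boldsymbol{x}}$ and $\bar{\boldsymbol{y}}$.

2. Centering: $X - \bar{\boldsymbol{x}}$, $Y - \bar{\boldsymbol{y}}$.

3. $S = X^T Y$ (using the centered $X$ and $Y$).

4. Build $K$ from $S$, compute the first eigenvector $\boldsymbol{q}$ of $K$ (a quaternion $q$), and convert $q$ to $R$.

5. $\boldsymbol{t} = \bar{\boldsymbol{x}} - R\bar{\boldsymbol{y}}$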
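The five steps of Horn’s method can be sketched in plain Python as follows. This is an illustrative sketch, not the authors’ code: the explicit entries of the 4 × 4 matrix K follow Horn’s 1987 closed-form solution, the power iteration with a diagonal shift is a stand-in for a proper eigen-solver, and the function name `horn` and its parameters are hypothetical.

```python
import math

def horn(X, Y, iters=500):
    """Estimate R, t with x_i ~ R y_i + t for corresponding 3D points."""
    n = len(X)
    # Steps 1-2: compute the centers and center both point sets.
    cx = [sum(p[a] for p in X) / n for a in range(3)]
    cy = [sum(p[a] for p in Y) / n for a in range(3)]
    Xc = [[p[a] - cx[a] for a in range(3)] for p in X]
    Yc = [[p[a] - cy[a] for a in range(3)] for p in Y]
    # Step 3: S = X^T Y (3x3) from the centered points.
    S = [[sum(x[a] * y[b] for x, y in zip(Xc, Yc)) for b in range(3)]
         for a in range(3)]
    # Horn's formulas rotate the "left" set onto the "right" set; here Y is
    # rotated onto X, so the entries are read from S^T = Y^T X.
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = (
        [S[b][a] for b in range(3)] for a in range(3))
    # Step 4: symmetric 4x4 matrix K; its top eigenvector is the quaternion.
    K = [[Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
         [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
         [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
         [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz]]
    # Power iteration on K + shift*I; the shift (a bound on |eigenvalues|)
    # makes the largest eigenvalue of K the dominant one.
    shift = max(sum(abs(v) for v in row) for row in K)
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iters):
        q = [sum(K[r][c] * q[c] for c in range(4)) + shift * q[r]
             for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    # Convert the unit quaternion (q0, qx, qy, qz) to a rotation matrix.
    q0, qx, qy, qz = q
    R = [[q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
         [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
         [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz]]
    # Step 5: t = x_bar - R y_bar.
    t = [cx[a] - sum(R[a][b] * cy[b] for b in range(3)) for a in range(3)]
    return R, t

# Usage: Y rotated by 90 degrees about the z axis and shifted by (1, 2, 3).
Y_demo = [[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]]
X_demo = [[-y + 1, x + 2, z + 3] for x, y, z in Y_demo]
R_demo, t_demo = horn(X_demo, Y_demo)
```

The sketch assumes the points are not degenerate (for example, not all identical), since the power iteration would otherwise divide by zero.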

Page 10: Softassign and EM-ICP on GPU

Algorithms for registration

Page 11: Softassign and EM-ICP on GPU

ICP: correspondence is unknown.

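$X = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots)^T$, $Y = (\boldsymbol{y}_1, \boldsymbol{y}_2, \dots)^T$

Find the closest (nearest) point to $\boldsymbol{x}_1$ in $Y$, say $\boldsymbol{y}_i$.

Put the point $\boldsymbol{y}_i$ into $Y^*$.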

Page 12: Softassign and EM-ICP on GPU

ICP: correspondence is unknown.

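$X = (\boldsymbol{x}_1, \boldsymbol{x}_2, \dots)^T$, $Y = (\boldsymbol{y}_1, \boldsymbol{y}_2, \dots)^T$

Find the closest (nearest) point in $Y$ to each point of $X$: $\boldsymbol{y}_i$ for $\boldsymbol{x}_1$, $\boldsymbol{y}_j$ for $\boldsymbol{x}_2$, …

Put the points into $Y^* = (\boldsymbol{y}_i, \boldsymbol{y}_j, \dots)^T$.

Horn’s method with $X$ and $Y^*$: estimate $R$ and $\boldsymbol{t}$.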

Page 13: Softassign and EM-ICP on GPU

ICP: correspondence is unknown.

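Transform $Y$ with the current estimate: $RY + \boldsymbol{t}$.

Find the closest (nearest) point in the transformed $Y$ to each point of $X$: $\boldsymbol{y}_i$ for $\boldsymbol{x}_1$, $\boldsymbol{y}_j$ for $\boldsymbol{x}_2$, …

Put the points into $Y^* = (\boldsymbol{y}_i, \boldsymbol{y}_j, \dots)^T$.

Horn’s method with $X$ and $Y^*$: estimate $R$ and $\boldsymbol{t}$.

Repeat.

Fast, but prone to failure because of the hard correspondence.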
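The hard-correspondence step of ICP can be sketched as a brute-force nearest-neighbor search. This is an illustrative sketch only: the names `transform` and `build_y_star` are hypothetical, and storing the original (untransformed) points in Y* is an assumption about how the estimate is refreshed in each iteration.

```python
def transform(R, t, y):
    """Apply y -> R y + t to one 3D point."""
    return [sum(R[a][b] * y[b] for b in range(3)) + t[a] for a in range(3)]

def build_y_star(X, Y, R, t):
    """For each x in X, pick the y in Y whose transformed position is nearest."""
    moved = [transform(R, t, y) for y in Y]
    Y_star = []
    for x in X:
        # Squared Euclidean distance is enough for finding the minimum.
        j = min(range(len(Y)),
                key=lambda k: sum((x[a] - moved[k][a]) ** 2 for a in range(3)))
        Y_star.append(Y[j])
    return Y_star

# Usage with the identity transform: each x picks exactly one partner.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
X_demo = [[0, 0, 0], [5, 5, 5]]
Y_demo = [[4.9, 5, 5], [0.1, 0, 0], [10, 10, 10]]
Y_star_demo = build_y_star(X_demo, Y_demo, identity, [0, 0, 0])
```

Each point of X receives exactly one partner, which is what makes the correspondence "hard"; a poor initial pose can therefore lock in wrong pairs.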

Page 14: Softassign and EM-ICP on GPU

Algorithms for registration

Page 15: Softassign and EM-ICP on GPU

Softassign: soft correspondence.

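For each pair of points $\boldsymbol{x}_i$ in $X$ and $\boldsymbol{y}_j$ in $Y$, a soft correspondence weight $m_{ij}$ is computed from the distance $\lVert \boldsymbol{x}_i - (R\boldsymbol{y}_j + \boldsymbol{t}) \rVert$. The weights form the matrix $M$.

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations.

Weighted Horn’s method with $X$ and $Y$, weighted by $M$: estimate $R$ and $\boldsymbol{t}$.

Repeat.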

Page 16: Softassign and EM-ICP on GPU

Sinkhorn iterations

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations.

Row normalization: the entries $m_{ij}$ of each row sum up to 1.

Repeat row and column normalization until convergence.

Page 17: Softassign and EM-ICP on GPU

Sinkhorn iterations

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations.

Column normalization: the entries $m_{ij}$ of each column sum up to 1.

Repeat row and column normalization until convergence.
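The alternating row and column normalization can be sketched in a few lines of Python. This is a minimal sketch, assuming a square matrix with strictly positive entries and a fixed number of iterations instead of a convergence test; the function name `sinkhorn` is illustrative and not taken from the authors’ code.

```python
def sinkhorn(M, iters=100):
    """Alternately normalize rows and columns of a positive matrix."""
    M = [row[:] for row in M]  # work on a copy
    for _ in range(iters):
        # Row normalization: divide each row by its sum.
        M = [[v / sum(row) for v in row] for row in M]
        # Column normalization: divide each column by its sum.
        col_sums = [sum(row[j] for row in M) for j in range(len(M[0]))]
        M = [[v / col_sums[j] for j, v in enumerate(row)] for row in M]
    return M

# Usage: rows and columns of the result both sum to (approximately) 1.
M_demo = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
```

After the last pass the column sums are exact and the row sums are approximate; for well-conditioned positive matrices the remaining error shrinks quickly with the number of iterations.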

Page 18: Softassign and EM-ICP on GPU

Sinkhorn.GPU (row normalization)

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations.

Row sums: $\boldsymbol{R}_M = M\boldsymbol{1}$, where $\boldsymbol{1} = (1, 1, 1, \dots)^T$

Using sgemv of CUBLAS

Page 19: Softassign and EM-ICP on GPU

Sinkhorn.GPU (row normalization)

Each row and column of $M$ should be normalized to 1 by Sinkhorn iterations.

Row-wise division of $M$ by $\boldsymbol{R}_M$

Using CUDA kernel

Column normalization is done in the same way.
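Written out, the two GPU steps amount to a matrix-vector product followed by an element-wise division. The reading of the row-sum vector as $\boldsymbol{R}_M$ and the symbol $\boldsymbol{C}_M$ for the column sums are assumptions made here for illustration.

```latex
\boldsymbol{R}_M = M\boldsymbol{1}, \qquad
m_{ij} \leftarrow \frac{m_{ij}}{(\boldsymbol{R}_M)_i}
\qquad\text{(rows)}
\\[4pt]
\boldsymbol{C}_M = M^\top\boldsymbol{1}, \qquad
m_{ij} \leftarrow \frac{m_{ij}}{(\boldsymbol{C}_M)_j}
\qquad\text{(columns)}
```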

Page 20: Softassign and EM-ICP on GPU

Weighted Horn’s method

Step 3 of Horn’s method:

Normal version: $S = X^T Y$

Weighted version: $S = X^T M Y$

Using CUBLAS sgemv twice.
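Expanding the weighted product shows how each soft correspondence contributes to the 3 × 3 matrix. This is a sketch that assumes $\boldsymbol{x}_i$ and $\boldsymbol{y}_j$ denote the centered points stored as rows of $X$ and $Y$.

```latex
S = X^\top M\, Y
  = \sum_i \sum_j m_{ij}\, \boldsymbol{x}_i \boldsymbol{y}_j^\top
```

With $M = I$ (and point sets of equal size) this reduces to the normal version, $S = \sum_i \boldsymbol{x}_i \boldsymbol{y}_i^\top$. Evaluating it as $(X^\top M)\,Y$ takes two successive products, which may be what "twice" refers to.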

Page 21: Softassign and EM-ICP on GPU

Centering.GPU (weighted version)

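Weights: $\boldsymbol{R}_M = M\boldsymbol{1}$, where $\boldsymbol{1} = (1, 1, 1, \dots)^T$

Weighted sum: the rows of $X$ are multiplied by the weights $\boldsymbol{R}_M$ (CUDA kernel) and summed (CUBLAS sasum).

Sum of weights: the entries of $\boldsymbol{R}_M$ are summed (CUBLAS sasum).

Weighted center $\bar{\boldsymbol{x}}$: the weighted sum divided by the sum of weights.

Same for $\bar{\boldsymbol{y}}$.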
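The weighted center can be sketched on the CPU as follows. This is a minimal sketch, assuming the weight of each point of X is the corresponding row sum of M (and, for Y, the column sum); the function names are illustrative and not from the authors’ code.

```python
def row_sums(M):
    """Weights for the points of X: one row sum of M per point."""
    return [sum(row) for row in M]

def col_sums(M):
    """Weights for the points of Y: one column sum of M per point."""
    return [sum(row[j] for row in M) for j in range(len(M[0]))]

def weighted_center(points, weights):
    """Weighted sum of the points divided by the sum of the weights."""
    total = sum(weights)
    return [sum(w * p[a] for w, p in zip(weights, points)) / total
            for a in range(3)]

# Usage: the second point carries three times the weight of the first.
center_demo = weighted_center([[0, 0, 0], [2, 4, 6]], [1, 3])
```

With all weights equal, this reduces to the ordinary center used in the unweighted Horn’s method.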

Page 22: Softassign and EM-ICP on GPU

Pipeline of Softassign.GPU

Input: $X$, $Y$

Compute $M$ with CUDA kernel

Sinkhorn.GPU

Centering.GPU: $\bar{\boldsymbol{x}}$, $\bar{\boldsymbol{y}}$

Weighted Horn’s method: $S = X^T M Y$, then $K$

Solve eigenvalue problem

Output: $R$ and $\boldsymbol{t}$

Page 23: Softassign and EM-ICP on GPU

Algorithms for registration

Page 24: Softassign and EM-ICP on GPU

EM-ICP: soft correspondence.

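For each pair of points $\boldsymbol{y}_i$ in $Y$ and $\boldsymbol{x}_j$ in $X$, the distance is $d_{ij} = \lVert \boldsymbol{x}_j - (R\boldsymbol{y}_i + \boldsymbol{t}) \rVert$.

The matrix $A$ is computed from the distances $d_{ij}$. Each row is normalized once.

Pseudo correspondence $X'$: one point $\boldsymbol{x}'_i$ for each $\boldsymbol{y}_i$.

Weighted Horn’s method with $X'$ and $Y$: estimate $R$ and $\boldsymbol{t}$.

Repeat.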

Page 25: Softassign and EM-ICP on GPU

Row normalization on GPU

Row sums: $\boldsymbol{C} = A\boldsymbol{1}$, where $\boldsymbol{1} = (1, 1, 1, \dots)^T$

Using sgemv of CUBLAS

$A$ is not normalized yet.

Page 26: Softassign and EM-ICP on GPU

Row normalization on GPU

Row-wise division of $A$ by $\boldsymbol{C}$, + sqrt

Using CUDA kernel

$A$ is now normalized.

Page 27: Softassign and EM-ICP on GPU

Computing weights

Weights $\boldsymbol{\lambda}$: computed from $A$ and $\boldsymbol{1} = (1, 1, 1, \dots)^T$

Using sgemv of CUBLAS

($A$ is now normalized.)

Page 28: Softassign and EM-ICP on GPU

Pseudo correspondence

$X' = AX$

Using CUBLAS sgemv

Centering: same as in Softassign.GPU

($A$ is now normalized.)
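The pseudo correspondence can be sketched as a weighted average of the points of X. This is an illustrative sketch, assuming A has one row per point of Y and one column per point of X and that its rows already hold normalized weights; the function name is hypothetical.

```python
def pseudo_correspondence(A, X):
    """Return X' = A X: row i is the weighted combination sum_j a_ij x_j."""
    return [[sum(a * x[c] for a, x in zip(row, X)) for c in range(3)]
            for row in A]

# Usage: the first y is matched half-and-half to both points of X,
# the second y is matched entirely to the first point.
A_demo = [[0.5, 0.5], [1.0, 0.0]]
X_demo = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
X_prime_demo = pseudo_correspondence(A_demo, X_demo)
```

Each row of X' is a virtual partner for the corresponding point of Y, so Horn’s method can then be applied to X' and Y as if the correspondence were known.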

Page 29: Softassign and EM-ICP on GPU

Weighted Horn’s method

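Step 3 of Horn’s method, weighted version:

$S = X'^T \operatorname{diag}(\lambda_1, \lambda_2, \dots)\, Y$

Weighted version (2 steps):

1. Scale $X'$ by $\boldsymbol{\lambda}$ with a CUDA kernel.

2. $S = X'^T Y$ with the scaled $X'$, using CUBLAS sgemm.

(not efficient)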

Page 30: Softassign and EM-ICP on GPU

Pipeline of EM-ICP.GPU

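Input: $X$, $Y$

Compute $A$ with CUDA kernel

Row normalization on GPU

Compute $\boldsymbol{\lambda}$ and $X'$ from $A$ and $X$

Centering.GPU: $\bar{\boldsymbol{x}}$, $\bar{\boldsymbol{y}}$

2 step weighted Horn’s method: $S$ from $X'$, $\boldsymbol{\lambda}$ and $Y$, then $K$

Solve eigenvalue problem

Output: $R$ and $\boldsymbol{t}$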

Page 31: Softassign and EM-ICP on GPU

Computing time for different numbers of points

Successfully aligned 5000 points in less than 7 seconds.

Slightly faster, but failed.

GPU: GeForce8800GT

CPU: Intel Core2 Quad + OpenMP (4 cores)

Page 32: Softassign and EM-ICP on GPU

Summary

3D registration algorithms implemented on a GPU:

• Softassign

• EM-ICP

• Weighted Horn’s method

EM-ICP.GPU is able to align 5000 points within 7 seconds,

• 60 times faster than EM-ICP.CPU,

• more robust than ICP.CPU.

Code, binary, and movies are available at: http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/

Page 33: Softassign and EM-ICP on GPU

Limitations

Number of points

Should be less than 8000 for GeForce8800GT with 512MB memory.

More memory, more points.
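A rough calculation shows why the point count is tied to GPU memory. This is a back-of-the-envelope sketch, assuming one single-precision value per pair of points and the same number of points in both sets; how many such matrices the implementation actually keeps in memory is not stated here.

```python
def matrix_bytes(n_x, n_y, bytes_per_value=4):
    """Memory needed for a dense n_x-by-n_y matrix of float32 values."""
    return n_x * n_y * bytes_per_value

size = matrix_bytes(8000, 8000)
print(f"{size / 2**20:.1f} MiB")  # prints "244.1 MiB"
```

One such matrix for 8000 points already takes about half of a 512MB card, and the size grows quadratically with the number of points.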

Stopping condition

Requires storing the whole matrix 𝑀 or 𝐴 and comparing it with the previous one: inefficient.

Hence, currently, number of iterations is fixed.

Page 34: Softassign and EM-ICP on GPU

Algorithms for registration
