Extreme Enumeration on GPU and in Clouds

Po-Chun Kuo 1, Michael Schneider 2, Özgür Dagdelen 3, Jan Reichelt 3, Johannes Buchmann 2,3, Chen-Mou Cheng 1, and Bo-Yin Yang 4

1 National Taiwan University, Taipei, Taiwan
2 Technische Universität Darmstadt, Germany
3 Center for Advanced Security Research Darmstadt (CASED), Germany
4 Academia Sinica, Taipei, Taiwan

CHES, Nara, Japan, 2011.9.30


Outline

• Introduction
  – Problem Description
  – Known algorithm
  – Application
• Algorithm
• Implementation
• Results

Problem Description

[Figure: a lattice in dimension n with origin O; the target vector is labeled "Find it!"]

Problem Property

• NP-hard
• Worst-case/average-case equivalence property
  – In 1997, Ajtai showed the equivalence between worst-case and average-case lattice problems [Ajtai'97]
• Many problems can be reduced to SVP
  – Knapsack, Subset Sum, factoring, approximate GCD problem, etc.
• The first Fully Homomorphic Encryption scheme, proposed in 2009, is based on lattice problems [Gentry'09]

Known Algorithm

• Approximation Algorithm
  – Lattice Basis Reduction Algorithm
  – LLL, ratio ~ 1.02^n in average case [Nguyen and Stehlé, 2006]
  – BKZ, ratio ~ 1.01^n in average case
    • with a parameter "block size"
  – the approximation ratios are all exponential in the input dimension
• Exact Algorithm
  – Lattice Enumeration
    • Super-exponential time and polynomial space
    • But seems to take exponential time in practice
  – Sieve
    • both exponential time and space
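
To make the "exponential in the dimension" remark concrete, the short sketch below evaluates the two average-case ratios quoted on the slide (1.02^n for LLL, 1.01^n for BKZ) at a few dimensions. The function name `approx_ratio` and the chosen dimensions are illustrative assumptions, not part of the talk.

```python
def approx_ratio(base: float, n: int) -> float:
    """Approximation ratio base**n in dimension n (bases taken from the slide)."""
    return base ** n


# Compare the LLL-like (1.02^n) and BKZ-like (1.01^n) ratios as n grows.
for n in (50, 100, 200, 400):
    lll = approx_ratio(1.02, n)
    bkz = approx_ratio(1.01, n)
    print(f"n={n:3d}  LLL ~ {lll:10.2f}  BKZ ~ {bkz:8.2f}")
```

Both ratios grow without bound, but the smaller base keeps the BKZ ratio far lower at the dimensions discussed later in the talk.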

Application

• Factoring polynomials over the integers or the rational numbers
  – E.g., given x^2-1, return x-1, x+1
• Finding the minimal polynomial of an algebraic number given to a good enough approximation
  – E.g., given 1.618033, return x^2-x-1=0
• Factoring numbers when some bits are known
• Approximate GCD problem
• Integer Programming
• Knapsack problem
• Subset Sum problem
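
The minimal-polynomial example can be checked numerically. The sketch below is not the lattice-based method the slide refers to: it is a plain brute-force search over small integer coefficients, used only to confirm that x^2-x-1 is the quadratic that 1.618033 (an approximation of the golden ratio) nearly satisfies. The helper name `best_quadratic` and the coefficient range are assumptions made for illustration; a lattice reduction would replace this search when the degree and coefficients are larger.

```python
from itertools import product


def best_quadratic(x: float, max_coeff: int = 3):
    """Return ((a, b, c), error) with a >= 1 minimizing |a*x^2 + b*x + c|.

    Brute force over integer coefficients in [-max_coeff, max_coeff];
    this stands in for the lattice-based search, it does not implement it.
    """
    best, best_err = None, float("inf")
    for a, b, c in product(range(1, max_coeff + 1),
                           range(-max_coeff, max_coeff + 1),
                           range(-max_coeff, max_coeff + 1)):
        err = abs(a * x * x + b * x + c)
        if err < best_err:
            best, best_err = (a, b, c), err
    return best, best_err


coeffs, err = best_quadratic(1.618033)
print(coeffs, err)
```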

Our Contributions

• Parallelize and implement GNR'10 extreme pruning lattice reduction on GPU and in Cloud
  – Highly parallel, about 90% parallel benefit
  – One GTX 480 is about 12 times faster than one i7 core.
  – Extend the implementation to multiple GPUs and run on Amazon's EC2 cloud services.
• Extend the pruning idea to pre-computing and flexible bounding function
  – Get the same output quality and speed up more than 10x in pre-computing
• New security level measure
  – Traditional: Dollar-day (e.g. buy 10 machines and run 5 days)
  – New: Dollar (e.g. rent computation power from Amazon)
• Estimate the cost of solving ASVP instances of SVP Challenge in higher dimensions

Algorithm

• Lattice Enumeration
• Lattice Enumeration using Extreme Pruning

Algorithm

• Lattice Enumeration
  – Exhaustively search all possible solutions

[Figure: Gram-Schmidt process]
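
The Gram-Schmidt process named on this slide can be sketched as follows. This is a minimal textbook version in exact rational arithmetic, not the authors' implementation; the function name `gram_schmidt` and the example basis are assumptions. It returns the orthogonalized vectors b_i* (not normalized) and the coefficients mu[i][j] that enumeration relies on.

```python
from fractions import Fraction


def gram_schmidt(basis):
    """Return (bstar, mu) for an integer basis given as a list of rows.

    bstar[i] is b_i minus its projections onto bstar[0..i-1];
    mu[i][j] = <b_i, bstar[j]> / <bstar[j], bstar[j]> for j < i.
    """
    n = len(basis)
    bstar = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in basis[i]]
        for j in range(i):
            denom = sum(x * x for x in bstar[j])
            mu[i][j] = sum(Fraction(a) * b for a, b in zip(basis[i], bstar[j])) / denom
            v = [vi - mu[i][j] * bj for vi, bj in zip(v, bstar[j])]
        mu[i][i] = Fraction(1)
        bstar.append(v)
    return bstar, mu


bstar, mu = gram_schmidt([[3, 1], [2, 2]])
print(bstar, mu[1][0])
```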

• Build the search tree according to the "coefficient vector"
• Guess bound = ||b1|| or Gauss prediction
• Run DFS to find the shortest vector
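
The three steps above can be sketched as a small depth-first enumeration. This is a simplified illustration under stated assumptions, not the GPU or cloud code from the talk: it uses floating-point Gram-Schmidt data, starts from the guess bound ||b1||^2, and renews the bound whenever a shorter vector is found. All names (`gso`, `shortest_vector`) are hypothetical, and no pruning is applied.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gso(basis):
    """Gram-Schmidt coefficients mu[i][j] and squared norms ||b_i*||^2 (floats)."""
    n = len(basis)
    bstar = [[float(x) for x in b] for b in basis]
    mu = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            mu[i][j] = dot(basis[i], bstar[j]) / dot(bstar[j], bstar[j])
            bstar[i] = [a - mu[i][j] * b for a, b in zip(bstar[i], bstar[j])]
    return mu, [dot(b, b) for b in bstar]


def shortest_vector(basis):
    """Enumerate coefficient vectors by DFS and return a shortest nonzero vector."""
    n = len(basis)
    mu, norms = gso(basis)
    # Guess bound: the squared norm of the first basis vector.
    best = {"norm2": float(dot(basis[0], basis[0])), "coeffs": [1] + [0] * (n - 1)}
    x = [0] * n

    def dfs(k, partial):
        if k < 0:  # leaf: a full coefficient vector has been chosen
            if any(x) and partial < best["norm2"] - 1e-9:
                best["norm2"], best["coeffs"] = partial, x[:]  # renew bound
            return
        center = -sum(mu[j][k] * x[j] for j in range(k + 1, n))
        radius = math.sqrt(max(0.0, best["norm2"] - partial) / norms[k])
        for xk in range(math.ceil(center - radius), math.floor(center + radius) + 1):
            x[k] = xk
            new_partial = partial + (xk - center) ** 2 * norms[k]
            if new_partial <= best["norm2"]:  # abort subtrees outside the bound
                dfs(k - 1, new_partial)
        x[k] = 0

    dfs(n - 1, 0.0)
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(best["coeffs"], basis)) for i in range(dim)]


print(shortest_vector([[4, 1], [7, 2]]))
```

The example basis has determinant 1, so it generates the integer grid and any returned vector should have squared norm 1.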

[Figure: enumeration search tree over the coefficients c_n, c_{n-1}, ..., c_2, c_1, annotated with: guess bound, new bound, early abort, renew bound, Gauss prediction, extreme pruning [GNR'10], parallel]

Algorithm

• Lattice Enumeration
• Lattice Enumeration using Extreme Pruning

Idea of Extreme Pruning

• Proposed by Nicolas Gama, Phong Q. Nguyen and Oded Regev
• Goal: maximize the reward per operation
• Only search the part of the space where a solution is most likely to be found
• If the search fails, randomize the basis and search again.

Lattice Enumeration using Extreme Pruning

• Guess a very tight bound for each level
• If the search fails, randomize the basis, then run enumeration again.
• E.g., if searching each tree is 1000 times faster and the probability of finding the vector is 10%, => gain ~100x

[Figure: search tree with one bound per level: bound_n for c_n, bound_{n-1} for c_{n-1}, ..., bound_2 for c_2, bound_1 for c_1]
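
The ~100x figure follows from a short expected-value argument: if each randomized attempt succeeds independently with probability p, about 1/p attempts are needed on average, so a tree that is s times cheaper yields an overall gain of roughly s·p. The sketch below just carries out that arithmetic; the function name is an assumption, and the cost of re-randomizing and re-reducing the basis is deliberately ignored.

```python
def expected_speedup(tree_speedup: float, success_prob: float) -> float:
    """Overall gain when each pruned tree is tree_speedup times cheaper and
    each attempt succeeds independently with probability success_prob.

    Assumption: randomization and pre-computation costs are not counted.
    """
    expected_trials = 1.0 / success_prob  # mean of a geometric distribution
    return tree_speedup / expected_trials


print(expected_speedup(1000, 0.10))
```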

How to decide the bound for each level?

• Bounding vector (R1, R2, …, Rn) ∈ [0,1]^n, with R1 ≦ R2 ≦ … ≦ Rn
• Bounding function for level i is Ri · Guess Bound
• Linear bounding function
  – Ri = i / n
  – Theoretical analysis -> the success probability is 1/n [GNR'10]
  – In practice -> the success probability is much higher
  – But the search space is still too large for high dimensions…
• Polynomial: fitted to the numerically optimized function in [GNR'10]
  – No theoretical guarantee on the probability
  – But it performs well in practice
  – Success probability is about 10% by experiment
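
A minimal sketch of the linear bounding vector and the resulting per-level bounds is given below. It assumes, following the linear pruning of [GNR'10], that R_i scales the squared guess bound; the slide itself does not say whether norms are squared. The polynomial bounding function is not reproduced because its coefficients are not given here, and all function names are hypothetical.

```python
def linear_bounding_vector(n: int):
    """R_i = i / n for i = 1..n (the linear bounding function from the slide)."""
    return [i / n for i in range(1, n + 1)]


def is_valid_bounding_vector(r) -> bool:
    """Check that every R_i lies in [0, 1] and that R_1 <= R_2 <= ... <= R_n."""
    in_range = all(0.0 <= a <= 1.0 for a in r)
    monotone = all(a <= b for a, b in zip(r, r[1:]))
    return in_range and monotone


def level_bounds(r, guess_bound_sq: float):
    """Per-level bounds R_i * guess_bound_sq (assumption: squared norms)."""
    return [ri * guess_bound_sq for ri in r]


r = linear_bounding_vector(4)
print(r, level_bounds(r, 100.0))
```

In a pruned enumeration, the single bound used at every depth would be replaced by the entry of `level_bounds` for that depth, so subtrees are aborted much earlier near the root.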

Bounding function

[Figure: linear and polynomial bounding functions; vertical axis from 0 to 1, horizontal axis from 0 to 45]

Parallel

• Parallel for Enumeration
  – Parallel for one search tree
• Parallel for Extreme Enumeration
  – Parallel for many search trees

Parallel for Enumeration

• Parallel DFS: DFS of the upper tree + DFS of the lower tree
• DFS of the upper tree: one computer, pre-computing
• DFS of the lower tree: parallel part, parallel computing
• Output the shortest vector: collect the sub-tree results

Parallel for Extreme Enumeration

[Figure: Input basis -> Randomized Basis with Seed_i -> (in parallel, one per seed) Pre-Computation (LLL, BKZ) -> Enumeration -> Output]
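
The "Randomized Basis with Seed_i" step can be sketched as multiplying the basis by a seed-dependent unimodular transformation, so that every worker enumerates a different basis of the same lattice. The version below is an assumption about how such a step might look, not the authors' code: it applies random row swaps and row additions, which keep the lattice (and the absolute determinant) unchanged. The names `randomize_basis` and `det` are hypothetical.

```python
import random


def randomize_basis(basis, seed, rounds=None):
    """Return a new basis of the same lattice, chosen deterministically from seed.

    Uses elementary unimodular row operations: swap two rows, or add/subtract
    one row to/from another. Requires at least two basis vectors.
    """
    rng = random.Random(seed)
    b = [row[:] for row in basis]
    n = len(b)
    for _ in range(rounds if rounds is not None else 4 * n):
        i, j = rng.sample(range(n), 2)
        op = rng.choice(("swap", "add", "sub"))
        if op == "swap":
            b[i], b[j] = b[j], b[i]
        else:
            sign = 1 if op == "add" else -1
            b[i] = [x + sign * y for x, y in zip(b[i], b[j])]
    return b


def det(m):
    """Integer determinant by cofactor expansion (only for tiny examples)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


basis = [[2, 0, 0], [1, 3, 0], [0, 1, 5]]
# One randomized copy per seed; each could be handed to a separate worker.
copies = [randomize_basis(basis, seed) for seed in range(4)]
print([abs(det(c)) for c in copies])
```

Each copy would then go through pre-computation (LLL, BKZ) and pruned enumeration independently, which is what makes this level of parallelism straightforward.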

Implementation

• We have Cloud and GPU version implementations.
• Cloud version: C++ and Hadoop Streaming
• GPU version: CUDA
• Memory used:
  – For dimension n, needs 8n^2+40n+164 bytes
  – n=100, needs 84 KB
  – n=800, needs 5.1 MB
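
The memory formula can be evaluated directly to confirm the two quoted figures. The function name below is an assumption; the formula 8n^2+40n+164 is taken from the slide, and sizes are read in decimal units (1 KB = 1000 bytes).

```python
def enum_memory_bytes(n: int) -> int:
    """Memory needed for dimension n, per the slide: 8n^2 + 40n + 164 bytes."""
    return 8 * n * n + 40 * n + 164


print(enum_memory_bytes(100))  # 84164 bytes, about 84 KB
print(enum_memory_bytes(800))  # 5152164 bytes, about 5.1 MB
```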

Implementation overview

[Figure: Input basis -> Randomized Basis with Seed_i -> Pre-Computation (LLL, BKZ) on CPU cloud computing -> GPU ENUM and Cloud ENUM on GPU and CPU cloud computing -> Output]

• One GTX 480 is about 12 times faster than one i7 core.

Results

• We launched our CPU Cloud at IIS, Taiwan
  – Machines 0-1: Intel Xeon E5430
    • 2.66 (GHz) * 2 (cpus) * 4 (cores) * 1 (thread)
  – Machines 2-3: Intel Xeon E5520
    • 2.27 (GHz) * 2 (cpus) * 4 (cores) * 2 (threads)
  – Machines 4-8: Intel Xeon E5620
    • 2.40 (GHz) * 2 (cpus) * 4 (cores) * 2 (threads)
  – Total:
    • 9 nodes, 72 physical cores, 128 virtual cores
    • 306 GHz
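
The totals can be re-derived from the per-machine figures. The sketch below sums nodes, physical cores, virtual cores (threads) and aggregate clock rate; the tuple layout and function name are assumptions. With the clock speeds as listed, the aggregate over virtual cores comes to about 307 GHz, in line with the 306 GHz on the slide; the small gap presumably comes from rounding of the listed clock speeds.

```python
# (machine count, GHz, cpus per machine, cores per cpu, threads per core)
MACHINE_GROUPS = [
    (2, 2.66, 2, 4, 1),  # machines 0-1, Intel Xeon E5430
    (2, 2.27, 2, 4, 2),  # machines 2-3, Intel Xeon E5520
    (5, 2.40, 2, 4, 2),  # machines 4-8, Intel Xeon E5620
]


def cluster_totals(groups):
    """Return (nodes, physical cores, virtual cores, aggregate GHz over virtual cores)."""
    nodes = sum(count for count, _, _, _, _ in groups)
    physical = sum(count * cpus * cores for count, _, cpus, cores, _ in groups)
    virtual = sum(count * cpus * cores * thr for count, _, cpus, cores, thr in groups)
    ghz = sum(count * f * cpus * cores * thr for count, f, cpus, cores, thr in groups)
    return nodes, physical, virtual, ghz


print(cluster_totals(MACHINE_GROUPS))
```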

Performance compared to single-core

[Figure: Cloud Enum performance with pre-computing BKZ40P15; Cost Time (log scale, 1 to 100000) vs. Dimension (82 to 91); series: Single-core, GiantCloud (120x frequency); annotated speed-ups: 43x, 56x, 80x]

• The speed-up ratio is increasing

Prune BKZ in Pre-Computing

[Figure, two panels: success probability vs. Dimension (0 to 50), and time (sec) vs. Dimension (30 to 40); legend entries: LLL, BKZ 10, BKZ 20, BKZ 30, BKZ 10 Pruning 15, BKZ 20 Pruning 15, BKZ 30 Pruning 15, BKZ 40 Pruning 15]

• Almost the same quality as the non-pruning version
• But the cost time is much less
  – in dimensions 80-120, the speed-up is 10x

Total running time

[Figure: Time (second, log scale, 10 to 100000) vs. Dimension (80 to 110); series: BKZ 30 Pruning 15, BKZ 35 Pruning 15, BKZ 40 Pruning 15; curves labeled by BKZ block size (30, 35, 40)]

• Seems to increase exponentially
• Tradeoff between the pre-computing (basis quality) and extreme enumeration

BKZ / (BKZ + ENUM)

[Figure: Ratio (0 to 1) vs. Dimension (80 to 110); series: BKZ 30 Pruning 15, BKZ 35 Pruning 15, BKZ 40 Pruning 15]

• Combined with "total running time": the minimum total running time is reached when BKZ takes about 40-60% of the total time

Estimate Cost Time

[Figure: Cost Time (day, log scale, 0.0001 to 1E+09) vs. Dimension (100 to 130); series: BKZ 35 Pruning 15, BKZ 40 Pruning 15, BKZ 55 Pruning 15]

• This shows the optimal pre-computing is BKZ 40 Pruning 15 in dimensions 104 to 130.

http://www.latticechallenge.org/svp-challenge/

Our Record

• No. 1, dimension 120, seed 0
  – 2300 US dollars paid to Amazon
    • Rent 64 machines for about 14.4 hours
  – Machine type
    • Cluster GPU Quadruple Extra Large Instance
    • 2 x Intel Xeon X5570 CPUs
    • 22 GB of memory
    • 2 x NVIDIA Tesla "Fermi" M2050 GPUs
    • 1690 GB of instance storage
    • costs 2.5 US dollars per hour
    • http://aws.amazon.com/ec2/instance-types/
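
The quoted bill is consistent with the rental figures on the slide: 64 machines for about 14.4 hours at 2.5 US dollars per machine-hour. The sketch below only does that multiplication; the function name is an assumption, and the price is the one stated on the slide rather than a current Amazon rate.

```python
def rental_cost(machines: int, hours: float, dollars_per_hour: float) -> float:
    """Total rental cost in US dollars: machines * hours * hourly price."""
    return machines * hours * dollars_per_hour


cost = rental_cost(64, 14.4, 2.5)
print(round(cost))  # close to the 2300 US dollars quoted on the slide
```

This is the "Dollar" security measure from the contributions slide in action: the cost of an attack is stated as a rental bill rather than as machines bought times days run.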

Concluding Remarks

• Empirical validation of GNR'10 extreme pruning
• Parallelize and implement GNR'10 extreme pruning on GPU and in Cloud
• Extend the pruning idea to pre-computing and flexible bounding function
• New security level measure
• Estimate the cost of solving ASVP instances of SVP Challenge in higher dimensions

• Thank you

• Q & A?