1 Building Java Programs Chapter 5 Lecture 5-2: Random Numbers reading: 5.1, 5.6.
java random
-
Upload
katherine-jackson -
Category
Documents
-
view
22 -
download
3
description
Transcript of java random
![Page 1: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/1.jpg)
Session ID:
Session Classification:
Kai Michaelis, Chris Meyer, Jörg Schwenk
Horst Görtz Institute for IT-Security (HGI)
Chair for Network and Data Security
Ruhr-University Bochum, Germany
CRYP-W25
Advanced
Randomly Failed!
The State of Randomness in Current Java Implementations
![Page 2: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/2.jpg)
How Random are Java PRNGs?
Source: http://www.xkcd.com
![Page 3: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/3.jpg)
Spoiler Slide
Source: http://www.memegenerator.net
► At least, thankfully not!
► Inspected PRNGs
► Apache Harmony
► GNU Classpath
► OpenJDK
► The Legion of Bouncy Castle
► Methodology
► Code/algorithm inspection
► Blackbox Tests (Dieharder, STS, …)
► Broad range of code/algorithm quality
► The good
► The bad
► And the ugly
![Page 4: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/4.jpg)
Operation Sketch of a PRNG
Seed
Entropy Source
C
Entropy Source
B
Entropy Source
A
Seeding
(normally invoked only once)
Random number
Generating pseudo-random numbers
(invoked multiple times)
Focus of inspection
![Page 5: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/5.jpg)
► PRNG
► SHA-1 based algorithm: Seed, Counter, Internal State
► Backup seeding facility (Entropy Collector)
► (under Unix): Unix-Time, Processor Time, Pointer value (Heap)
► Weaknesses
► Self-seeding SecureRandom suffers from implementation bug
► Pointer into state buffer not properly adjusted
► Entropy reduced to 64 bits
► Directly affects the Android platform
► Backup seeding facility (Entropy Collector) suffers from multiple
implementation bugs
► MSB set to 0 (reason for this remains unclear)
► Inappropriate modular reduction (signed/unsigned integers)
► Entropy reduced to 31 bits (worst case)
Results: Apache Harmony
![Page 6: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/6.jpg)
Results: Apache Harmony
Quality of Entropy Collector
► Worst-Case scenario
► 2 consecutive bytes
mark a single point
► 10 MiB generated Seed
![Page 7: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/7.jpg)
► PRNG
► SHA-1 based algorithm: Seed, Internal State
► Backup seeding facility (Entropy Collector)
► 8 Threads increment independent counters (c1..c8)
► Seed = c1 XOR c2 XOR …. c8
► Weaknesses
► Repeated use of same IV
► Predictable internal states: only 20 of 32 bytes unknown
► Backup seeding facility (Entropy Collector) can be influenced
► High load prevents threads execution
Results:GNU Classpath
![Page 8: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/8.jpg)
Results: GNU Classpath
Quality of Entropy Collector
► Worst-Case scenario
► 2 consecutive bytes
mark a single point
► 10 MiB generated Seed
![Page 9: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/9.jpg)
► PRNG
► SHA-1 based algorithm: Seed, Internal State, Fixed-state
protection
► Backup seeding facility (Entropy Collector)
► Counter incrementation, System.properties
► Noise threads (keep the scheduler busy)
► S-Boxing counter
► Enforcing mandatory runtime and counter incrementation
► Slow….
► Weaknesses ► No obvious weakness
Results: OpenJDK
![Page 10: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/10.jpg)
► PRNG
► Multiple SecureRandom replacements
► SHA-1 based algorithm: Seed, Internal State, Counter
► VMPC based algorithm
► Backup seeding facility (Entropy Collector)
► Counter incrementation
► Producer and Consumer Threads
► Slow and fast operation mode
► Weaknesses
► VMPC known to be vulnerable to distinguishing attacks
Results: Bouncy Castle
![Page 11: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/11.jpg)
► PRNGs are only as good as the seed’s entropy
► Software Entropy Collectors are mostly not suitable for
cryptographic purposes
► Broad range of code/algorithm quality
► Fixed & limited size of internal states
► Some implementations are only susceptible to the
outlined vulnerabilities if no OS entropy is available
Conclusion
In critical environments ► Prevent usage of PRNGs
► Exclusively rely on hardware ECs/RNGs
![Page 12: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/12.jpg)
Random Questions?
Chris Meyer
http://armoredbarista.blogspot.com
http://www.nds.rub.de/chair/people/cmeyer
Source: http://www.troll.me
![Page 13: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/13.jpg)
Efficient Vector Implementations of AES-basedDesigns: A Case Study and New Implementations
for Grøstl
Severin Holzer-Graf, Thomas Krinninger, Martin Pernull,Martin Schlaffer1, Peter Schwabe2, David Seywald,
Wolfgang Wieser
IAIK, Graz University of Technology, [email protected]
Digital Security Group, Radboud University Nijmegen, NetherlandsResearch Center for Information Technology Innovation, Academia Sinica, Taiwan
CT-RSA 2013
![Page 14: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/14.jpg)
Contents
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 2 / 27
![Page 15: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/15.jpg)
Motivation
Outline
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 3 / 27
![Page 16: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/16.jpg)
Motivation
Motivation
Many AES-based hash functions have been submitted to theNIST SHA-3 competition [Nat07]
Fast using Intel AES-NI instructions, slow otherwise!?
More difficult to implement (or to improve performance)?
Learn lessons to improve new AES-based designs
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 4 / 27
![Page 17: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/17.jpg)
Motivation
AES-Based Hash Functions
AddRoundKeyk00
k10
k20
k30
k01
k11
k21
k31
k02
k12
k22
k32
k03
k13
k23
k33
SubBytes
S(x)
ShiftRows MixColumns
AES [Nat01]: 4x4 state of bytes (128-bit)
Advantages:very well analyzed building blocksproofs against linear and differential attackssimple analysis (wide trails)fast using AES instruction
Disadvantages:much larger state needed for 256-, 512-bit hash functionAES not so well analyzed in hash function settingwrong usage of AES (just 1 round, ...)slow without AES instruction, large tables, cache timing attacks
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 5 / 27
![Page 18: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/18.jpg)
Short Description of Grøstl
Outline
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 6 / 27
![Page 19: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/19.jpg)
Short Description of Grøstl
The SHA-3 Finalist Grøstl [GKM+11]
Hi−1 HiP
QMi`
` `
Permutation based designDouble-pipe compression function (` ≥ 2n)AES-based hash functionDesigned by DTU (Denmark) and TU Graz (Austria)Praveen Gauravaram, Lars R. Knudsen, Krystian Matusiewicz, FlorianMendel, Christian Rechberger, Martin Schlaffer, Søren S. Thomsen
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 7 / 27
![Page 20: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/20.jpg)
Short Description of Grøstl
Permutations P and Q of Grøstl
Q:
P :
0i 1i 2i 3i 4i 5i 6i 7i
AddRoundConstant (AC)
fi ei di ci bi ai 9i 8i
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
ff
S
SubBytes (SB)
S
ShiftBytes (SH) MixBytes (MB)
AES like round transformations8× 8 state and 10 rounds for Grøstl-2568× 16 state and 14 rounds for Grøstl-512
Differences between P/Q and Grøstl-256/Grøstl-512heavier round transformations are the same (SB,MB)lightweight round transformations differ (AC,SH)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 8 / 27
![Page 21: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/21.jpg)
Storing the Grøstl State
Outline
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 9 / 27
![Page 22: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/22.jpg)
Storing the Grøstl State
Ordering of the Grøstl State
P Q
Byte ordering (to avoid byte extractions, 8-bit implementation)
Column ordering (T-table approach, [DR99, Sect. 5.2])Row ordering (byteslice implementation, [ARSS11])Bitslicing (to avoid table lookups, [Bih97])
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 10 / 27
![Page 23: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/23.jpg)
Storing the Grøstl State
Ordering of the Grøstl State
P Q
mmx0
mmx0
mmx1
mmx1
mmx2
mmx2
mmx3
mmx3
mmx4
mmx4
mmx5
mmx5
mmx6
mmx6
mmx7
mmx7
Byte ordering (to avoid byte extractions, 8-bit implementation)Column ordering (T-table approach, [DR99, Sect. 5.2])
Row ordering (byteslice implementation, [ARSS11])Bitslicing (to avoid table lookups, [Bih97])
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 10 / 27
![Page 24: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/24.jpg)
Storing the Grøstl State
Ordering of the Grøstl State
P Q
xmm0
xmm1
xmm2
xmm3
xmm4
xmm5
xmm6
xmm7
Byte ordering (to avoid byte extractions, 8-bit implementation)Column ordering (T-table approach, [DR99, Sect. 5.2])Row ordering (byteslice implementation, [ARSS11])
Bitslicing (to avoid table lookups, [Bih97])
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 10 / 27
![Page 25: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/25.jpg)
Storing the Grøstl State
Ordering of the Grøstl State
P Q
xmm0
Byte ordering (to avoid byte extractions, 8-bit implementation)Column ordering (T-table approach, [DR99, Sect. 5.2])Row ordering (byteslice implementation, [ARSS11])Bitslicing (to avoid table lookups, [Bih97])
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 10 / 27
![Page 26: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/26.jpg)
Storing the Grøstl State
AddRoundConstant
P Q0i 1i 2i 3i 4i 5i 6i 7i
fi ei di ci bi ai 9i 8iffffffffffffff
ffffffffffffff
ffffffffffffff
ffffffffffffff
ffffffffffffff
ffffffffffffff
ffffffffffffff
ffffffffffffff
Description:round dependent row in P and Qfull constant 0xff in Q
Implementation:load and XOR constant to the stateimplement 0xff using inversion or negative S-box indexing
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 11 / 27
![Page 27: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/27.jpg)
Storing the Grøstl State
SubBytes
S
Description:substitute each byte using AES S-boxbased on inversion in finite field GF (28)
S(x) = A · x−1 + bImplementation:
8-bit table lookups (or T-tables)Intel AES new instructions (AESENCLAST), [GI10]using byte shufflings (vperm), [Ham09]compute using optimized formulas (bitslicing), [Can05]
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 12 / 27
![Page 28: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/28.jpg)
Storing the Grøstl State
ShiftBytes
P Q
Description:rotate (shuffle) the bytes of each rowdifferent values for P and Q
Implementation:byte addressing (byte ordering)byte extractions (column ordering)byte shufflings/rotations (row ordering)bit shufflings/rotations (bitslice)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 13 / 27
![Page 29: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/29.jpg)
Storing the Grøstl State
MixBytes
MixBytes (MB)
Definition:applied to 8-byte columns (input: ai , output: bi )
multiplication with constant MDS matrix in GF (28)
T-tables: 8 byte extractions, 8 lookups, 7 XORs (S-box included)compute using optimized formulas (8-bit, byteslice, bitslice)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 14 / 27
![Page 30: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/30.jpg)
Storing the Grøstl State
MixBytes
b0
b1
b2
b3
b4
b5
b6
b7
=
2 2 3 4 5 3 5 77 2 2 3 4 5 3 55 7 2 2 3 4 5 33 5 7 2 2 3 4 55 3 5 7 2 2 3 44 5 3 5 7 2 2 33 4 5 3 5 7 2 22 3 4 5 3 5 7 2
·
a0
a1
a2
a3
a4
a5
a6
a7
Definition:
applied to 8-byte columns (input: ai , output: bi )multiplication with constant MDS matrix in GF (28)
T-tables: 8 byte extractions, 8 lookups, 7 XORs (S-box included)compute using optimized formulas (8-bit, byteslice, bitslice)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 14 / 27
![Page 31: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/31.jpg)
Storing the Grøstl State
MixBytes
b0 = 2a0a0 ⊕ 2a1 ⊕ 3a2 ⊕ 4a3 ⊕ 5a4 ⊕ 3a5 ⊕ 5a6 ⊕ 7a7
b1 = 7a0a0 ⊕ 2a1 ⊕ 2a2 ⊕ 3a3 ⊕ 4a4 ⊕ 5a5 ⊕ 3a6 ⊕ 5a7
b2 = 5a0a0 ⊕ 7a1 ⊕ 2a2 ⊕ 2a3 ⊕ 3a4 ⊕ 4a5 ⊕ 5a6 ⊕ 3a7
b3 = 3a0a0 ⊕ 5a1 ⊕ 7a2 ⊕ 2a3 ⊕ 2a4 ⊕ 3a5 ⊕ 4a6 ⊕ 5a7
b4 = 5a0a0 ⊕ 3a1 ⊕ 5a2 ⊕ 7a3 ⊕ 2a4 ⊕ 2a5 ⊕ 3a6 ⊕ 4a7
b5 = 4a0a0 ⊕ 5a1 ⊕ 3a2 ⊕ 5a3 ⊕ 7a4 ⊕ 2a5 ⊕ 2a6 ⊕ 3a7
b6 = 3a0a0 ⊕ 4a1 ⊕ 5a2 ⊕ 3a3 ⊕ 5a4 ⊕ 7a5 ⊕ 2a6 ⊕ 2a7
b7 = 2a0a0 ⊕ 3a1 ⊕ 4a2 ⊕ 5a3 ⊕ 3a4 ⊕ 5a5 ⊕ 7a6 ⊕ 2a7
Definition:applied to 8-byte columns (input: ai , output: bi )multiplication with constant MDS matrix in GF (28)
Implementation:T-tables: 8 byte extractions, 8 lookups, 7 XORs (S-box included)
compute using optimized formulas (8-bit, byteslice, bitslice)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 14 / 27
![Page 32: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/32.jpg)
Storing the Grøstl State
MixBytes
bi = ai ⊕ ai+1,
ai = bi ⊕ ai+6,
ai = ai ⊕ bi+2,
bi = bi ⊕ bi+3,
bi = 02 · bi ,
bi = bi ⊕ ai+4,
bi = 02 · bi ,
ai = bi+3 ⊕ ai+4.
Definition:applied to 8-byte columns (input: ai , output: bi )multiplication with constant MDS matrix in GF (28)
Implementation:T-tables: 8 byte extractions, 8 lookups, 7 XORs (S-box included)compute using optimized formulas (8-bit, byteslice, bitslice)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 14 / 27
![Page 33: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/33.jpg)
New Grøstl Implementations
Outline
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 15 / 27
![Page 34: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/34.jpg)
New Grøstl Implementations
AVX2: Byteslicing Grøstl-512
P Q
ymm0
ymm1
ymm2
ymm3
ymm4
ymm5
ymm6
ymm7
Computing the whole Grøstl-512 state in parallel32 columns in parallel using 256-bit AVX2 instructionsOnly need to extract for 128-bit AESENCLAST instruction40% less instructions compared to AES-NI or AVX
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 16 / 27
![Page 35: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/35.jpg)
New Grøstl Implementations
AVX2: Parallel T-Table Lookups for Grøstl-256
P Q
ymm0
ymm1
ymm2
ymm3
Store 4 columns in one 256-bit AVX2 registerPerform 4 T-table lookups in parallel using VPGATHERQQShiftBytes more expensive: VPSHUFB, VPERMQ, VPBLENDD15% less instructions compared to 64-bit implementation
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 17 / 27
![Page 36: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/36.jpg)
New Grøstl Implementations
NEON: Alternating T-Table Lookups for P and Q
P Q
q0
q1
q2
q3
q4
q5
q6
q7
q8
q9
q10
q11
q12
q13
q14
q15
Make use of 64-bit NEON loads (VLD1.64)Still need single byte ARM loads (LDRB) and address computation20 cycle penalty when moving data between ARM and NEONAvoid by interleaving computation of P and Q45.8 cycles/byte (previously: 76.9)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 18 / 27
![Page 37: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/37.jpg)
New Grøstl Implementations
NEON: Alternating T-Table Lookups for P and Q
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 19 / 27
![Page 38: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/38.jpg)
New Grøstl Implementations
NEON: Bitslice Implementation using VSHL and VEXT
Store the 8 bits of each byte in 8 separate 128-bit registers (P‖Q)Efficiency strongly depends on arrangement of bits
combine columns in bytes
combine rows in bytes (more efficient using NEON)ShiftBytes: variable rotation of bits within bytes (VSHL)MixBytes: to XOR rows, we first rotate bytes of registers (VEXT)48.5 cycles/byte (similar to table based)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 20 / 27
![Page 39: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/39.jpg)
New Grøstl Implementations
NEON: Bitslice Implementation using VSHL and VEXT
Store the 8 bits of each byte in 8 separate 128-bit registers (P‖Q)Efficiency strongly depends on arrangement of bits
combine columns in bytescombine rows in bytes (more efficient using NEON)
ShiftBytes: variable rotation of bits within bytes (VSHL)MixBytes: to XOR rows, we first rotate bytes of registers (VEXT)48.5 cycles/byte (similar to table based)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 20 / 27
![Page 40: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/40.jpg)
New Grøstl Implementations
NEON: Bitslice Implementation using VSHL and VEXT
Store the 8 bits of each byte in 8 separate 128-bit registers (P‖Q)Efficiency strongly depends on arrangement of bits
combine columns in bytescombine rows in bytes (more efficient using NEON)
ShiftBytes: variable rotation of bits within bytes (VSHL)MixBytes: to XOR rows, we first rotate bytes of registers (VEXT)
48.5 cycles/byte (similar to table based)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 20 / 27
![Page 41: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/41.jpg)
New Grøstl Implementations
NEON: Bitslice Implementation using VSHL and VEXT
Store the 8 bits of each byte in 8 separate 128-bit registers (P‖Q)Efficiency strongly depends on arrangement of bits
combine columns in bytescombine rows in bytes (more efficient using NEON)
ShiftBytes: variable rotation of bits within bytes (VSHL)MixBytes: to XOR rows, we first rotate bytes of registers (VEXT)48.5 cycles/byte (similar to table based)
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 20 / 27
![Page 42: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/42.jpg)
New Grøstl Implementations
NEON: Bitslice Implementation using VSHL and VEXT
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 21 / 27
![Page 43: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/43.jpg)
New Grøstl Implementations
NEON: Vperm Implementation
P Q
xmm0
xmm1
xmm2
xmm3
xmm4
xmm5
xmm6
xmm7
Byteslice implementation using optimized MixBytes formulasComputing S-box using vperm approach
relatively expensive using NEONimprovements possible by optimizing dependency chainsbase point for AES instruction implementation (ARM8)
92.0 cycles/byte
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 22 / 27
![Page 44: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/44.jpg)
New Grøstl Implementations
Cortex-M0: Low Memory Vector Implementation
speed RAM ROM 4 · RAM + ROM[cycles/byte] [Bytes] [Bytes] [Bytes]
bytesliced (fast) 469 344 1948 3324bytesliced (small) 801 304 1464 2680T-table (2kB) 406 704 6952 9768T-table (8kB) 383 508 12630 14662(sphlib) 856 792 15184 18352(8bit-c) 1443 632 2796 5324(armcryptolib) 17496 400 1260 2860
Many different improved implementations (T-table, byteslice)Best results using 32-bit byteslicing
only S-box tables needed (no large T-tables)almost the speed of T-table implementation
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 23 / 27
![Page 45: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/45.jpg)
Conclusion
Outline
1 Motivation
2 Short Description of Grøstl
3 Storing the Grøstl State
4 New Grøstl Implementations
5 Conclusion
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 24 / 27
![Page 46: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/46.jpg)
Conclusion
Conclusion
Many new implementations of AES-based hash function Grøstl
2 AVX2 implementations3 NEON implementations4 low-mem Cortex-M0 implementations
All implementations with significant improvements
Ideas are applicable to any AES-based design
Use results to avoid bottlenecks in new AES-based designs
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 25 / 27
![Page 47: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/47.jpg)
References I
Kazumaro Aoki, Gunther Roland, Yu Sasaki, and Martin Schlaffer.Byte Slicing Grøstl – Optimized Intel AES-NI and 8-bit Implementations of the SHA-3Finalist Grøstl.In Javier Lopez and Pierangela Samarati, editors, SECRYPT 2011, Proceedings, pages124–133. SciTePress, 2011.
Eli Biham.A fast new DES implementation in software.In Eli Biham, editor, Fast Software Encryption, volume 1267 of LNCS, pages 260–272.Springer, 1997.http:
//www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/1997/CS/CS0891.pdf.
David Canright.A Very Compact S-Box for AES.In Josyula R. Rao and Berk Sunar, editors, CHES, volume 3659 of LNCS, pages 441–455.Springer, 2005.
Joan Daemen and Vincent Rijmen.AES Proposal: Rijndael.NIST AES Algorithm Submission, September 1999.Available online:http://csrc.nist.gov/archive/aes/rijndael/Rijndael-ammended.pdf.
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 26 / 27
![Page 48: java random](https://reader034.fdocuments.in/reader034/viewer/2022052308/55cf9cad550346d033aaa90c/html5/thumbnails/48.jpg)
References II
Shay Gueron and Intel Corp.Intel® Advanced Encryption Standard (AES) Instructions Set, 2010.Retrieved December 21, 2010, from http://software.intel.com/en-us/articles/
intel-advanced-encryption-standard-aes-instructions-set/.
Praveen Gauravaram, Lars R. Knudsen, Krystian Matusiewicz, Florian Mendel, ChristianRechberger, Martin Schlaffer, and Søren S. Thomsen.Grøstl – a SHA-3 candidate.Submission to NIST (Round 3), 2011.Available: http://www.groestl.info (2011/11/25).
Mike Hamburg.Accelerating AES with Vector Permute Instructions.In Christophe Clavier and Kris Gaj, editors, CHES, volume 5747 of LNCS, pages 18–32.Springer, 2009.
National Institute of Standards and Technology.FIPS PUB 197, Advanced Encryption Standard (AES).Federal Information Processing Standards Publication 197, U.S. Department of Commerce,November 2001.
National Institute of Standards and Technology.Cryptographic Hash Project, 2007.Available online at http://www.nist.gov/hash-competition.
Martin Schlaffer (CT-RSA 2013) Vector Implementations of AES Designs 27 / 27