© 2011 Center for Embedded Systems for Critical Applications
1 / 18 http://rijndael.ece.vt.edu/sha3/ Datasheet (V1.1) Nov. 29, 2011
NIST SHA-3 ASIC Datasheet
-- NIST SHA-3 Competition Five Finalists on a Chip
(Version 1.1)
Project Sponsor: National Institute of Standards and Technology (NIST)
Contact: [email protected]
Technology: IBM MOSIS 0.13µm CMR8SF-RVT
Standard-Cell Library: ARM’s Artisan SAGE-X V2.0
Area: 5mm² (Core: 1.656mm x 1.656mm)
Package: 160-pin 28mmx28mm QFP Open Cavity
Voltage: 3.3V I/O and 1.2V Core
Hash Modules: BLAKE-256, Grøstl-256, JH-256,
Keccak-256, Skein-256, and SHA256 (reference)
Compatible Test Platform: SASEBO-R Board
I Overview
This chip was developed as part of the NIST-sponsored project ‘Environment for Fair and
Comprehensive Performance Evaluation of Cryptographic Hardware and Software’. The SHA-3
ASIC is manufactured in the IBM MOSIS 0.13µm CMR8SF-RVT process with ARM’s Artisan SAGE-X
V2.0 standard-cell library. It contains all five SHA-3 finalists (BLAKE-256, Grøstl-256, JH-256,
Keccak-256, and Skein-256, using the latest Round 3 tweaks updated in January 2011) and a
reference SHA256. The SHA-3 ASIC is packaged in a 160-pin QFP and is designed to be compatible
with the SASEBO-R platform.
Fig. 1 The SHA-3 ASIC mounted on the 160-pin QFP Socket on SASEBO-R platform
II I/O Assignments
Table 1 List of I/O signals
Type             Signal Name   # of Pins  Active H/L  Direction  Description
System (3)       fClk          1          --          IN         Clock input for the intra-chip core modules. Must be at least 2 times the chip interface sClk frequency.
                 sClk          1          --          IN         Clock input for chip interface logic.
                 rst_n         1          Low         IN         Reset signal generated by the on-board reset circuit. Asynchronous reset input.
Bus Control (9)  init          1          --          IN         Initialize the internal states and parameters for hash operations.
                 load          1          --          IN         Load input data enable signal.
                 fetch         1          --          IN         Fetch output data enable signal.
                 algsel        4          --          IN         Select the sub-design under test.
                 mode          1          --          IN         Select the mode of testing.
                 ack           1          --          OUT        Load and fetch acknowledge signal.
Bus Data (32)    idata         16         --          IN         Input data.
                 odata         16         --          OUT        Output data.
Test Pins (6)    blake_busy    1          HIGH        OUT        Indicates the BLAKE core hashing period.
                 groestl_busy  1          HIGH        OUT        Indicates the Grøstl core hashing period.
                 jh_busy       1          HIGH        OUT        Indicates the JH core hashing period.
                 keccak_busy   1          HIGH        OUT        Indicates the Keccak core hashing period.
                 skein_busy    1          HIGH        OUT        Indicates the Skein core hashing period.
                 sha256_busy   1          HIGH        OUT        Indicates the SHA256 core hashing period.
Clock Pins (8)   osc_sel       2          --          IN         Select which ring oscillator to use.
                 divider_sel   4          --          IN         Select the divisor of the clock generator.
                 pbias         1          Analog      IN         Bias the current mirror that controls the rate of oscillation of the clock generator (range from 0.0V to 0.8V).
                 clk_out       1          --          OUT        Gives a representation of the clock that is used in the circuit.
Total                          58
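The group totals in Table 1 can be cross-checked with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the function name group_totals are assumptions made for this example and are not part of the chip documentation.

```python
# Pin counts per signal, transcribed from Table 1.
PIN_GROUPS = {
    "System": {"fClk": 1, "sClk": 1, "rst_n": 1},
    "Bus Control": {"init": 1, "load": 1, "fetch": 1,
                    "algsel": 4, "mode": 1, "ack": 1},
    "Bus Data": {"idata": 16, "odata": 16},
    "Test Pins": {"blake_busy": 1, "groestl_busy": 1, "jh_busy": 1,
                  "keccak_busy": 1, "skein_busy": 1, "sha256_busy": 1},
    "Clock Pins": {"osc_sel": 2, "divider_sel": 4, "pbias": 1, "clk_out": 1},
}


def group_totals(groups):
    """Return the number of pins in each signal group."""
    return {name: sum(signals.values()) for name, signals in groups.items()}


if __name__ == "__main__":
    totals = group_totals(PIN_GROUPS)
    for name, count in totals.items():
        print(f"{name}: {count}")
    print("Total:", sum(totals.values()))
```

The per-group sums should match the counts given in parentheses in the Type column, and the grand total should match the last row of the table.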
Fig. 2 The SHA-3 ASIC chip bonding diagram for the 160-pin QFP package. (Note: The top left is pin
number #1, with two extra unconnected (NC) pins, and pins are counted counter-clockwise.)
Table 2 I/O Pin Assignment of the 160-Pin QFP package
Package Pin No.  Die Pad No.  Signal Name   I/O Direction  Function
1                1            PVSS1DGZ      --             core GND
2                --           NC
3                --           NC
4                2            SWIN[3]       IN             Divider select
5                3            SWIN[2]       IN             Divider select
6                4            SWIN[1]       IN             Divider select
7                5            SWIN[0]       IN             Divider select
8                --           NC
9                6            PHIN[1]       IN             Oscillator select
10               7            PHIN[0]       IN             Oscillator select
11               --           NC
12               --           NC
13               --           NC
14               --           NC
15               --           NC
16               --           NC
17               --           NC
18               --           NC
19               --           NC
20               8            PVDD1DGZ      --             core 1.2V
21               9            PVSS1DGZ      --             core GND
22               --           NC
23               --           NC
24               --           NC
25               --           NC
26               10           keccak_busy   OUT            Keccak core hash busy.
27               11           skein_busy    OUT            Skein core hash busy.
28               --           NC
29               12           init          IN             Initialize hash module.
30               13           load          IN             Input data load signal.
31               14           fetch         IN             Output data fetch signal.
32               15           mode          IN             Select the hash mode.
33               --           NC
34               16           ack           OUT            Load and fetch acknowledge signal.
35               17           blake_busy    OUT            BLAKE core hashing busy.
36               18           groestl_busy  OUT            Grøstl core hashing busy.
37               19           jh_busy       OUT            JH core hashing busy.
38               --           NC
39               20           PVSS1DGZ      --             core GND
40               21           PVSS1DGZ      --             core GND
41               --           NC
42               --           NC
43               --           NC
44               --           NC
45               22           sha256_busy   OUT            SHA256 core hash busy.
46               --           NC
47               --           NC
48               23           PVDD1DGZ      --             core 1.2V
49               24           Clk_Switch    IN             Select the clock source (external or internal).
50               25           algsel[2]     IN             Select the design under test.
51               26           algsel[1]     IN             Select the design under test.
52               27           algsel[0]     IN             Select the design under test.
53               --           NC
54               --           NC
55               28           PVSS2DGZ      --             I/O GND
56               29           sClk          IN             Slow interface clock.
57               30           PVSS2DGZ      --             I/O GND
58               31           fClk          IN             Fast clock for chip core.
59               32           PVSS2DGZ      --             I/O GND
60               33           PVDD1DGZ      --             core 1.2V
61               --           NC
62               --           NC
63               34           rst_n         IN             Reset chip.
64               --           NC
65               --           NC
66               --           NC
67               --           NC
68               --           NC
69               35           odata[15]     OUT            Output data.
70               36           odata[14]     OUT            Output data.
71               37           odata[13]     OUT            Output data.
72               38           odata[12]     OUT            Output data.
73               --           NC
74               --           NC
75               39           odata[11]     OUT            Output data.
76               40           odata[10]     OUT            Output data.
77               41           odata[9]      OUT            Output data.
78               42           odata[8]      OUT            Output data.
79               --           NC
80               --           NC
81               --           NC
82               --           NC
83               --           NC
84               43           odata[7]      OUT            Output data.
85               44           odata[6]      OUT            Output data.
86               45           odata[5]      OUT            Output data.
87               46           odata[4]      OUT            Output data.
88               47           PVDD2DGZ      --             I/O 3.3V
89               48           odata[3]      OUT            Output data.
90               49           odata[2]      OUT            Output data.
91               50           odata[1]      OUT            Output data.
92               51           odata[0]      OUT            Output data.
93               --           NC
94               --           NC
95               --           NC
96               --           NC
97               --           NC
98               --           NC
99               52           PVDD2DGZ      --             I/O 3.3V
100              53           PVDD1DGZ      --             core 1.2V
101              54           PVSS1DGZ      --             core GND
102              --           NC
103              --           NC
104              --           NC
105              --           NC
106              --           NC
107              --           NC
108              --           NC
109              55           idata[0]      IN             Input data.
110              56           idata[1]      IN             Input data.
111              57           idata[2]      IN             Input data.
112              58           idata[3]      IN             Input data.
113              --           NC
114              59           idata[4]      IN             Input data.
115              60           idata[5]      IN             Input data.
116              61           idata[6]      IN             Input data.
117              62           idata[7]      IN             Input data.
118              --           NC
119              --           NC
120              63           PVSS1DGZ      --             core GND
121              64           PVDD1DGZ      --             core 1.2V
122              65           PVDD2DGZ      --             I/O 3.3V
123              66           idata[8]      IN             Input data.
124              67           idata[9]      IN             Input data.
125              68           idata[10]     IN             Input data.
126              69           idata[11]     IN             Input data.
127              --           NC
128              70           PVDD1DGZ      --             core 1.2V
129              71           idata[12]     IN             Input data.
130              72           idata[13]     IN             Input data.
131              73           idata[14]     IN             Input data.
132              74           idata[15]     IN             Input data.
133              75           PVDD2DGZ      --             I/O 3.3V
134              76           PVSS1DGZ      --             core GND
135              --           NC
136              --           NC
137              --           NC
138              --           NC
139              --           NC
140              77           PVDD1DGZ      --             core 1.2V
141              78           PVSS1DGZ      --             core GND
142              79           PVSS2DGZ      --             I/O GND
143              --           NC
144              --           NC
145              --           NC
146              --           NC
147              --           NC
148              80           PVSS1DGZ      --             core GND
149              --           NC
150              81           Clk_out       OUT            Monitor the clock.
151              82           Pbias         IN             Rate of oscillation control.
152              --           NC
153              --           NC
154              83           PVDD1DGZ      --             core 1.2V
155              --           NC
156              --           NC
157              --           NC
158              --           NC
159              --           NC
160              84           PVDD1DGZ      --             core 1.2V
III Design Notes
Fig. 3 The block diagram of SHA-3 ASIC
(Diagram blocks: slow interface signals sCLK, RST, sLOAD, sFETCH, sDIN, sINIT, sMode, sACK, and sDOUT with 16-bit data buses; slow-to-fast and fast-to-slow synchronizers; clock management with ring oscillators, VCO, CLK divider, Clk_Switch, VCO_CLK_Sel, CLK_DIV_Sel, EXT_fClk, and CLK_OUT; AlgSel-controlled clock gating with gated inputs and outputs; six hash cores: JH ~250MHz, Keccak ~250MHz, Grøstl ~200MHz, SHA256 ~200MHz, BLAKE ~125MHz, Skein ~125MHz.)
A. Clock Management
Because of signal integrity issues associated with on-board wire transfers, the ASIC interface clock,
which synchronizes all data and control signals to and from the control FPGA, should run only at a
relatively low frequency. The SHA-3 ASIC chip can operate at 250 MHz for some candidates, so
an additional stable fast clock is required; it is gated and shared by all the hash modules.
Clock Generation. For high-frequency testing, an on-chip clock generation module is
integrated. We used a custom-cell design approach to integrate a ring oscillator (RO) based
voltage-controlled oscillator (VCO) into the chip. The VCO has four input ports (PBIAS and
VCO-EN-7, 9, 11) and three output clocks (VCO-RO-7, 9, 11). The PBIAS voltage can be varied
from 0V to 0.8V to produce a range of clock frequencies. The ‘EN’ inputs turn the clock outputs
of the VCO on and off. VCO-RO-7, 9, 11 are the three clocks produced by the block for any
particular PBIAS voltage; VCO-RO-7, for example, is the clock from the 7-stage RO.
The clock generation module also includes standard-cell based ring oscillators. Together
with the VCO clocks and a standard-cell based clock divider, they support a wide range of
clock frequencies for performance testing. Fig. 5 shows averaged measurements
of on-chip clock speed for a batch of 10 fabricated chips.
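As a rough illustration of why rings with different stage counts give different frequencies, the sketch below uses the textbook first-order model f = 1 / (2 N t_d) for a ring of N identical inverting stages with per-stage delay t_d. The model, the function name, and the 100 ps stage delay are assumptions for illustration only; they are not measured values for this chip, and the model ignores how PBIAS changes the stage delay.

```python
def ring_oscillator_freq_hz(stages: int, stage_delay_s: float) -> float:
    """First-order ring-oscillator frequency: f = 1 / (2 * N * t_d)."""
    if stages < 3 or stages % 2 == 0:
        raise ValueError("a simple inverter ring needs an odd stage count >= 3")
    if stage_delay_s <= 0:
        raise ValueError("stage delay must be positive")
    return 1.0 / (2 * stages * stage_delay_s)


if __name__ == "__main__":
    # Hypothetical per-stage delay; NOT a measured value for the SHA-3 ASIC.
    ASSUMED_STAGE_DELAY_S = 100e-12
    # Stage counts of the three custom-cell VCO rings named in the text.
    for n in (7, 9, 11):
        f_mhz = ring_oscillator_freq_hz(n, ASSUMED_STAGE_DELAY_S) / 1e6
        print(f"{n}-stage ring: {f_mhz:.1f} MHz (under the assumed delay)")
```

Under this model the frequency ratio between two rings depends only on their stage counts, so the 7-stage ring would run 11/7 times faster than the 11-stage ring at the same bias.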
Fig. 4 The diagram of a 7-stage VCO-RO clock design
(Diagram blocks: current mirror (CM) with PBIAS and NBIAS, a chain of current-starved inverters (CSI), enable input VCO-EN-7, and clock output VCO-RO-7.)
Fig. 5 The range of on-chip generated clock frequencies
Clock Configurations. The on-chip generated clocks are multiplexed with the external fast clock
input, and the selection is configured through dedicated ports. The external fast clock can be fed
through an SMA connector from a signal generator, or it can be provided by the control FPGA. Clock
gating is implemented to guarantee that only one hash module is enabled at a time.
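The gating behavior described above can be modeled in a few lines. This is a behavioral sketch, not the chip's RTL: the function names are hypothetical, and the AlgSel ordering and Clk_Switch polarity are taken from the configuration tables in Section D.

```python
from typing import Dict

# Core order follows the AlgSel codes 000..101 listed in Section D.
CORES_BY_ALGSEL = ("BLAKE-256", "Groestl-256", "JH-256",
                   "Keccak-256", "Skein-256", "SHA256")


def fast_clock_source(clk_switch: int) -> str:
    """Clk_Switch = 0 selects the external clock, 1 the internal clock."""
    return "internal" if clk_switch else "external"


def gated_clock_enables(algsel: int) -> Dict[str, bool]:
    """Return a one-hot clock-enable map: only the selected core is clocked."""
    if not 0 <= algsel < len(CORES_BY_ALGSEL):
        raise ValueError("AlgSel codes 110 and 111 are not listed in Section D")
    return {core: index == algsel for index, core in enumerate(CORES_BY_ALGSEL)}


if __name__ == "__main__":
    enables = gated_clock_enables(0b011)
    print(fast_clock_source(0), [core for core, on in enables.items() if on])
```

Exactly one entry of the returned map is true for any valid AlgSel value, which mirrors the guarantee that only one hash module is enabled at a time.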
B. Chip Interface
Standard Hash Interface. As shown in Fig. 3, the chip interface adopts the standard hash
interface proposed by Chen et al. [1] and extends it with mode selection and dual-clock
support [5,6].
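Because idata and odata are 16 bits wide (Table 1), a message block has to cross the interface as a sequence of 16-bit words. The sketch below shows one way a host-side test program might split a block. The big-endian word order and the function name block_to_words are assumptions made for this example; the actual ordering is defined by the interface specification in [1] and the implementation in [5].

```python
from typing import List


def block_to_words(block: bytes, word_bytes: int = 2) -> List[int]:
    """Split a message block into fixed-width words (big-endian, assumed)."""
    if word_bytes <= 0 or len(block) % word_bytes != 0:
        raise ValueError("block length must be a multiple of the word size")
    return [int.from_bytes(block[i:i + word_bytes], "big")
            for i in range(0, len(block), word_bytes)]


if __name__ == "__main__":
    block_512 = bytes(range(64))            # one 512-bit block
    words = block_to_words(block_512)
    print(len(words), hex(words[0]), hex(words[-1]))
```

Under this assumption a 512-bit block takes 32 load transfers and a 1024-bit block takes 64.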
Clock Domain Crossing (CDC) Synchronizer. There are two clock domains in the chip: the slow
one is for the interfacing logic and the fast one is for hash modules. The clock synchronizer
design requires that the internal hash clock be at least two times faster than the interface clock.
The area numbers reported for each hash core include the overhead of the synchronizer module.
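The clock-ratio requirement of the synchronizer can be checked before a test run. The helper below is a hypothetical convenience function, not part of any chip software; the example frequencies are the 1.5MHz interface clock and 50MHz core clock used for the power measurements reported with Table 4.

```python
def clocks_satisfy_cdc(f_clk_hz: float, s_clk_hz: float,
                       min_ratio: float = 2.0) -> bool:
    """True if the fast core clock is at least min_ratio times the slow clock."""
    if f_clk_hz <= 0 or s_clk_hz <= 0:
        raise ValueError("clock frequencies must be positive")
    return f_clk_hz >= min_ratio * s_clk_hz


if __name__ == "__main__":
    # Measurement setup from the Table 4 footnotes: sClk 1.5 MHz, fClk 50 MHz.
    print(clocks_satisfy_cdc(50e6, 1.5e6))
    # A core clock below twice the interface clock violates the requirement.
    print(clocks_satisfy_cdc(2e6, 1.5e6))
```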
C. SHA-3 Finalists Implementations
Table 3 summarizes the major implementation aspects of each SHA-3 finalist. The design
decisions were made with the Throughput-to-Area ratio as the primary optimization target. For details
on the SHA-3 candidates, please refer to the related specification documents on the NIST SHA-3 web
site. For hardware architectures, we examined several publicly available reference
implementations [2]–[4] and optimized them for our system architecture. Implementation
details for each hash module can be found in [5].
Table 3 The summary of design specifications of SHA-3 finalists
Algorithm     Implementation Descriptions
BLAKE-256     4 parallel G functions; 1-stage pipeline in permutation
Grøstl-256    Parallel P and Q with 128 GF-based AES SBoxes
JH-256        SBoxes S0 and S1 are implemented in LUT
Keccak-256    One clock cycle per round
Skein512-256  Unrolled 4 Threefish rounds
D. SHA-3 Configurations
(1) Select Design Under Test (DUT):
AlgSel[2:0]  ASIC Pins   Functions
000          [50,51,52]  Enable BLAKE-256
001          [50,51,52]  Enable Grøstl-256
010          [50,51,52]  Enable JH-256
011          [50,51,52]  Enable Keccak-256
100          [50,51,52]  Enable Skein-256
101          [50,51,52]  Enable SHA256
Notes: Only one hash module is enabled at a time. When one hash module is enabled, the inputs/outputs and clock are gated for all the other candidates.
(2) Select clock sources:
Clk_Switch  ASIC Pins  Functions       Notes
0           49         External clock  This can be the clock from the FPGA DCM or an external clock from the SMA (J6) socket.
1           49         Internal clock  This can be the internal standard-cell RO based clock or the VCO based clock.
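The two selections above map directly onto four package pins. The sketch below computes the logic level for each pin, using the bit-to-pin assignment from Table 2 (pin 50 is algsel[2], pin 51 is algsel[1], pin 52 is algsel[0], pin 49 is Clk_Switch). The function and constant names are hypothetical and exist only for this example.

```python
from typing import Dict

# AlgSel codes from the table in (1).
ALGSEL_CODES = {
    "BLAKE-256": 0b000, "Groestl-256": 0b001, "JH-256": 0b010,
    "Keccak-256": 0b011, "Skein-256": 0b100, "SHA256": 0b101,
}
ALGSEL_PINS = (50, 51, 52)   # algsel[2], algsel[1], algsel[0] per Table 2
CLK_SWITCH_PIN = 49          # 0 = external clock, 1 = internal clock


def config_pin_levels(core: str, internal_clock: bool) -> Dict[int, int]:
    """Return {package pin: logic level} for the DUT and clock-source selection."""
    code = ALGSEL_CODES[core]
    levels = {pin: (code >> shift) & 1
              for pin, shift in zip(ALGSEL_PINS, (2, 1, 0))}
    levels[CLK_SWITCH_PIN] = 1 if internal_clock else 0
    return levels


if __name__ == "__main__":
    print(config_pin_levels("Keccak-256", internal_clock=False))
```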
(3) Select the SHA-3 ASIC internal clock resources:
VCO_CLK_Sel[1:0]  ASIC Pins  Functions                      Notes
00, 01, 10        [6,7]      Select VCO clock               Select from 3 custom-cell ROs with different stages (7, 9, and 11).
11                [6,7]      Select standard-cell RO clock  Select from 3 standard-cell ROs with different stages (25, 33, and 43).
Note: [6,7] matches the die pad numbers of PHIN[1:0] in Table 2, which are bonded to package pins 9 and 10.
(4) Select the clock division ratios:
Custom-cell VCO Clock
DIV Select   ASIC Pins  Notes
‘0000~1111’  [4,5,6,7]  C0 VCO CLK: (CLK_SEL=00)
                        C1 VCO CLK: (CLK_SEL=01)
                        C2 VCO CLK: (CLK_SEL=10)
                        Divisions: 1,2,3,4,6,8,12,16,24,32,48,64,96,128,192,384
Standard-cell RO Clock
DIV Select                       ASIC Pins  Notes
1000 1001 1010 1011 + 0000       [4,5,6,7]  S0 SC-RO CLK; Divisions: 1,2,4,8,128
0100 0101 0110 0111 + 0001       [4,5,6,7]  S1 SC-RO CLK; Divisions: 1,2,4,8,128
1100 1101 1110 1111 + 0010 0011  [4,5,6,7]  S2 SC-RO CLK; Divisions: 1,2,4,8,16,128
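When planning a test, it helps to see which division ratio brings an on-chip oscillator closest to a target core frequency. The sketch below works only with the division ratios listed above; it deliberately does not map DIV Select codes to ratios, since that mapping is not spelled out here. The base oscillator frequencies in the example are hypothetical placeholders, not measured values.

```python
# Division ratios transcribed from the tables in (4).
VCO_DIVISIONS = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, 64, 96, 128, 192, 384]
SC_RO_DIVISIONS = {
    "S0": [1, 2, 4, 8, 128],
    "S1": [1, 2, 4, 8, 128],
    "S2": [1, 2, 4, 8, 16, 128],
}


def closest_division(base_hz: float, target_hz: float, divisions) -> int:
    """Pick the ratio whose divided clock is nearest to the target frequency."""
    if base_hz <= 0 or target_hz <= 0:
        raise ValueError("frequencies must be positive")
    return min(divisions, key=lambda d: abs(base_hz / d - target_hz))


if __name__ == "__main__":
    # Hypothetical oscillator frequencies, chosen only to exercise the function.
    assumed_vco_hz = 600e6
    assumed_sc_ro_hz = 400e6
    print(closest_division(assumed_vco_hz, 50e6, VCO_DIVISIONS))
    print(closest_division(assumed_sc_ro_hz, 50e6, SC_RO_DIVISIONS["S0"]))
```

On a real chip the base frequency would come from a measurement on the clk_out pin rather than from an assumed constant.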
E. Shmoo Plot of SHA-3 ASIC
[Figure: shmoo plots of the six hash cores, marking passed and failed test points and the target frequency. Horizontal axis: Core Supply Voltage [V] (0.8 to 1.6, 0.2V/Div., Standard: 1.2 V). Vertical axis: Frequency [MHz] (10 MHz/Div.). Panels: BLAKE-256 (Target @125MHz), Grøstl-256 (Target @200MHz), JH-256 (Target @250MHz), Keccak-256 (Target @250MHz), Skein-256 (Target @125MHz), SHA256 (Target @125MHz).]
F. SHA-3 Finalists Testing Results
Table 4 ASIC characterization of the SHA-3 ASIC chip
Algorithm     Block   Core Lat.  Area (a)  Max Freq.  Tp
              [bits]  [cycles]   [KGEs]    [MHz]      [Gbps]
BLAKE-256     512     30         34.15     125        2.13
Grøstl-256    512     11         124.34    200        9.31
JH-256        512     42         49.29     250        3.05
Keccak-256    1024    24         42.49     250        10.67
Skein512-256  512     21         66.36     125        3.05
SHA256        512     68         21.67     200        1.51

Algorithm     Tp/Area    Power (b)  Energy (b)  Power (c)  Energy (c)
              [kbps/GE]  [mW]       [mJ/Gbits]  [mW]       [mJ/Gbits]
BLAKE-256     62.47      21.33      25.00       19.77      23.17
Grøstl-256    74.87      78.42      33.70       139.29     59.85
JH-256        61.83      12.57      20.63       13.01      21.35
Keccak-256    251.05     19.12      8.96        19.78      9.27
Skein512-256  45.93      31.74      26.04       51.09      41.91
SHA256        69.54      5.18       13.76       5.05       13.42

a: The Gate Equivalent count is calculated by dividing the post-layout die area by the area of a NAND2XLTF (5.76 µm²).
b: Numbers are based on chip measurements of the SHA-3 ASIC with the slow chip interface clock at 1.5MHz and the fast hash core clock at 50MHz.
c: Numbers are based on post-layout simulation of the SHA-3 ASIC with the slow chip interface clock at 1.5MHz and the fast hash core clock at 50MHz.
Notes:
1: All five SHA-3 candidates are implemented according to the NIST SHA-3 Round 3 specifications as of January 2011.
2: Each design’s static power is estimated by multiplying the whole-chip static power, 1.92mW, by the area ratio of each design.
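The derived columns of Table 4 can be approximately reproduced from the block size, latency, frequency, area, and power columns. The sketch below assumes Tp = block bits x frequency / latency cycles, and assumes the energy figures divide the power at the 50MHz measurement clock by the throughput at that same clock. These formulas are inferred from the numbers, not stated in the table, and they match the printed values only to within rounding.

```python
# (block bits, latency cycles, area in kGE, max frequency in MHz,
#  measured power in mW at a 50 MHz core clock), transcribed from Table 4.
TABLE4 = {
    "BLAKE-256":    (512, 30, 34.15, 125, 21.33),
    "Groestl-256":  (512, 11, 124.34, 200, 78.42),
    "JH-256":       (512, 42, 49.29, 250, 12.57),
    "Keccak-256":   (1024, 24, 42.49, 250, 19.12),
    "Skein512-256": (512, 21, 66.36, 125, 31.74),
    "SHA256":       (512, 68, 21.67, 200, 5.18),
}
POWER_MEASUREMENT_CLOCK_MHZ = 50.0  # from footnote b


def throughput_gbps(block_bits, latency_cycles, freq_mhz):
    """Assumed model: one block of block_bits every latency_cycles cycles."""
    return block_bits * freq_mhz / latency_cycles / 1000.0


def throughput_per_area(tp_gbps, area_kge):
    """Throughput-to-area ratio in kbps per gate equivalent."""
    return (tp_gbps * 1e6) / (area_kge * 1e3)


def energy_mj_per_gbit(power_mw, tp_gbps):
    """mW divided by Gbit/s gives mJ per Gbit."""
    return power_mw / tp_gbps


if __name__ == "__main__":
    for name, (bits, lat, area, fmax, power) in TABLE4.items():
        tp_max = throughput_gbps(bits, lat, fmax)
        tp_50 = throughput_gbps(bits, lat, POWER_MEASUREMENT_CLOCK_MHZ)
        print(f"{name}: Tp={tp_max:.2f} Gbps, "
              f"Tp/Area={throughput_per_area(tp_max, area):.2f} kbps/GE, "
              f"Energy={energy_mj_per_gbit(power, tp_50):.2f} mJ/Gbit")
```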
IV Further Information
Further inquiries can be sent to
Dr. Patrick Schaumont or Dr. Leyla Nazhandali
Center for Embedded Systems for Critical Applications (CESCA)
Bradley Department of Electrical and Computer Engineering, Virginia Tech
Blacksburg, VA 24060, USA
Email: [email protected]
Related publications can be consulted online at
http://rijndael.ece.vt.edu/sha3
This chip should be referred to in publications as
X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali, and
P. Schaumont, "ASIC Implementations of Five SHA-3 Finalists," Design, Automation and
Test in Europe (DATE2012), March 2012.
X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali, and
P. Schaumont, "Pre-silicon Characterization of NIST SHA-3 Final Round Candidates,"
14th Euromicro Conference on Digital System Design Architectures, Methods and Tools
(DSD 2011), August 2011.
V References
[1] Z. Chen, S. Morozov, and P. Schaumont, “A Hardware Interface for Hashing Algorithms,”
Cryptology ePrint Archive, Report 2008/529, 2008, http://eprint.iacr.org/2008/529
[2] AIST-RCIS, “SHA-3 hardware project,” May 2011,
http://www.rcis.aist.go.jp/special/SASEBO/SHA3-en.html
[3] X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali,
and P. Schaumont, “Performance Evaluation of Cryptographic Hardware and Software –
Performance Evaluation of SHA-3 Candidates in ASIC and FPGA,” May 2011,
http://rijndael.ece.vt.edu/sha3/
[4] G. Bertoni, J. Daemen, M. Peeters, and G. V. Assche, “The Keccak sponge function family –
Updated VHDL package,” May 2011, http://keccak.noekeon.org/VHDL 3.0.html
[5] X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali, and P. Schaumont,
“Silicon Implementation of SHA-3 Finalists: BLAKE, Grøstl, JH, Keccak and Skein,” in ECRYPT II
Hash Workshop 2011, May 2011.
[6] X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali, and P. Schaumont,
“Pre-silicon Characterization of NIST SHA-3 Final Round Candidates,” 14th Euromicro Conference
on Digital System Design Architectures, Methods and Tools (DSD 2011), August 2011.
[7] X. Guo, M. Srivastav, S. Huang, D. Ganta, M. B. Henry, L. Nazhandali, and P. Schaumont,
“ASIC Implementations of Five SHA-3 Finalists,” Design, Automation and Test in Europe
(DATE2012), March 2012.
Revision History
The following table shows the revision history for this document.
Date        Author  Version  Revision
08/14/2011  Xu Guo  1.0      Initial release.
11/29/2011  Xu Guo  1.1      Updated with new related publications.