PBKDF2 Accelerator Design Review
Akshay Sahni, William Ehlhardt, Yicheng Guo
Overview
This project implements a particular password-based key derivation function (PBKDF) in hardware, as an ASIC, to significantly reduce the computational cost of each derivation.
We implement PBKDF2 using the HMAC-SHA1 pseudorandom function.
PBKDF2 is a key derivation function that is part of RSA Laboratories' Public-Key Cryptography Standards (PKCS) series.
PBKDF2 applies a pseudorandom function, such as HMAC-SHA1, to the input password along with a salt value and repeats the process many times to produce a derived key.
The derived key can then be used as a cryptographic key in subsequent operations.
The added computational work makes password cracking much more difficult; this technique is known as key stretching.
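As a software reference for the algorithm described above, the following is a minimal Python sketch of PBKDF2 with HMAC-SHA1 as the PRF. It is not the ASIC design; the function name pbkdf2_hmac_sha1 and its argument names are chosen here for illustration only.

```python
import hashlib
import hmac
import struct


def pbkdf2_hmac_sha1(password: bytes, salt: bytes, c: int, dk_len: int) -> bytes:
    """Software reference model of PBKDF2 with HMAC-SHA1 (illustrative only)."""
    h_len = hashlib.sha1().digest_size  # 20 bytes for SHA-1
    n_blocks = -(-dk_len // h_len)      # ceil(dk_len / h_len)
    derived = b""
    for i in range(1, n_blocks + 1):
        # U_1 = PRF(P, S || INT(i)), INT(i) being the 32-bit big-endian block index
        u = hmac.new(password, salt + struct.pack(">I", i), hashlib.sha1).digest()
        t = int.from_bytes(u, "big")
        for _ in range(c - 1):
            # U_j = PRF(P, U_{j-1}); the block result is the XOR of all U_j
            u = hmac.new(password, u, hashlib.sha1).digest()
            t ^= int.from_bytes(u, "big")
        derived += t.to_bytes(h_len, "big")
    return derived[:dk_len]
```

The result can be compared against Python's built-in hashlib.pbkdf2_hmac("sha1", password, salt, c, dk_len) as a sanity check. The iteration count c is what makes each password guess expensive.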
PBKDF2-HMAC-SHA1 on an ASIC
PBKDF2(P, S, c, dkLen) is built from F(K0, U1, c), which uses PRF(K0, data) = HMAC(K0, data), which in turn is built on SHA-1.
The host passes K0, U1, and c to the ASIC.
System Interface I
System Interface II
Flow Chart I
Flow Chart II
Architecture Diagram
Falling Edge Detect Block
The Falling Edge Detect Block detects any falling edge on the input IE signal.
Once a falling edge on IE is detected, it asserts the GO signal high for one clock cycle.
[Block diagram: inputs clk, RST_N, IE; two registers (PrevBit, RN_PrevBit); output logic; output GO]
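The behavior of the falling edge detector can be sketched as a small cycle-by-cycle Python model. This is a simplification under stated assumptions: it uses a single previous-value register, assumes that register resets to 0, and the class and method names are hypothetical.

```python
class FallingEdgeDetect:
    """Behavioral model: GO is 1 for one clock when IE goes from 1 to 0."""

    def __init__(self):
        # Register holding IE as sampled on the previous clock.
        # A reset value of 0 is an assumption of this sketch.
        self.prev_ie = 0

    def clock(self, ie: int) -> int:
        """Advance one clock cycle with the given IE value and return GO."""
        go = 1 if (self.prev_ie == 1 and ie == 0) else 0
        self.prev_ie = ie
        return go
```

Feeding the IE sequence 0, 1, 1, 0, 0, 1, 0 into clock() yields GO = 1 only on the two cycles where IE drops from 1 to 0.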
Input Shift Register Block
This block functions as a buffer that receives all 88 bytes of input data (K0, U1, and C) from the 32-bit bidirectional data bus.
While the IE signal is high, it shifts in 4 bytes from the data bus at a time and makes the buffered data available to the other functional blocks in the chip.
[Block diagram: 88-byte shift register. Inputs: CLK, RST_N, IE, BUS_IN (32 bits). Outputs: K0 (512 bits), U1 (160 bits), C (32 bits).]
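A rough Python model of the input shift register is given below. The field widths (512 + 160 + 32 = 704 bits = 88 bytes) come from the block description; the order in which words arrive (K0 first, then U1, then C, most significant word first) is an assumption of this sketch, as are the class and property names.

```python
MASK_704 = (1 << 704) - 1  # 88 bytes of buffered input
MASK_160 = (1 << 160) - 1
MASK_32 = 0xFFFFFFFF


class InputShiftRegister:
    """Behavioral model of the 88-byte input buffer (word order is assumed)."""

    def __init__(self):
        self.reg = 0

    def clock(self, ie: int, bus_in: int) -> None:
        """While IE is high, shift one 32-bit bus word into the register."""
        if ie:
            self.reg = ((self.reg << 32) | (bus_in & MASK_32)) & MASK_704

    @property
    def k0(self) -> int:
        return self.reg >> 192            # top 512 bits

    @property
    def u1(self) -> int:
        return (self.reg >> 32) & MASK_160  # middle 160 bits

    @property
    def c(self) -> int:
        return self.reg & MASK_32           # bottom 32 bits
```

Loading the full input takes 22 bus transfers (16 words for K0, 5 for U1, 1 for C).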
Output Shift Register Block
This block functions as an output buffer.
When it is enabled by the Output Enable (OE) signal, it shifts the data onto the 32-bit data bus.
[Block diagram: 20-byte shift register. Inputs: CLK, RST_N, SR_LOAD, ACC (160 bits), OE. Output: BUS_OUT (32 bits).]
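The output path can be modeled in the same style. In this sketch SR_LOAD captures the 160-bit ACC value and each OE cycle drives one 32-bit word; sending the most significant word first and the return-value convention are assumptions, not details taken from the design.

```python
MASK_160 = (1 << 160) - 1


class OutputShiftRegister:
    """Behavioral model of the 20-byte output buffer (word order is assumed)."""

    def __init__(self):
        self.reg = 0  # 160-bit register contents

    def clock(self, sr_load: int, acc: int, oe: int):
        """Advance one clock; return the word driven on BUS_OUT, or None."""
        if sr_load:
            self.reg = acc & MASK_160   # capture the accumulated result
            return None
        if oe:
            word = self.reg >> 128      # top 32 bits go onto the bus
            self.reg = (self.reg << 32) & MASK_160
            return word
        return None
```

Reading the full 160-bit result therefore takes five OE cycles after one SR_LOAD cycle.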
HashPrep Block
Generates the next 672-bit vector (HDATA) to be hashed.
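[Block diagram: K0 (512 bits) is XORed with IPAD and with OPAD; a multiplexer controlled by HPSTEP (0/1) selects between the UI path and the HRES path; output HDATA (672 bits).]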
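The 672-bit vector corresponds to the two hash inputs of HMAC: a 512-bit padded key block followed by a 160-bit message. The sketch below models that in Python. IPAD (0x36 repeated) and OPAD (0x5C repeated) are the standard HMAC constants; which HPSTEP value selects the inner versus the outer step is an assumption, and hash_prep is a hypothetical name.

```python
import hashlib
import hmac

IPAD = bytes([0x36]) * 64  # standard HMAC inner pad
OPAD = bytes([0x5C]) * 64  # standard HMAC outer pad


def hash_prep(k0: bytes, ui: bytes, hres: bytes, hpstep: int) -> bytes:
    """Build the 84-byte (672-bit) HDATA vector for the next hash.

    hpstep == 0 (assumed): inner step, (K0 xor IPAD) || UI
    hpstep == 1 (assumed): outer step, (K0 xor OPAD) || HRES
    """
    assert len(k0) == 64 and len(ui) == 20 and len(hres) == 20
    if hpstep == 0:
        pad, message = IPAD, ui
    else:
        pad, message = OPAD, hres
    key_block = bytes(a ^ b for a, b in zip(k0, pad))
    return key_block + message
```

Hashing the inner vector with SHA-1 and then hashing the outer vector built from that result reproduces HMAC-SHA1 of UI under the key K0, which is one PRF iteration.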
Hash Block
[Block diagram: ChunkID (0/1) selects chunk 0 = HDATA<671:160> or chunk 1 = HDATA<159:0> plus Padding; sub-blocks WordExt, F/K Computation, Stir, RoundCTR; output HRES.]
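Since HDATA is always 672 bits, SHA-1 processes it as exactly two 512-bit chunks: the first 512 bits, then the last 160 bits plus padding. The following Python sketch of standard SHA-1 specialized to that input size illustrates the word extension, F/K selection, and 80-round mixing. Function names are hypothetical, and the mapping of the mixing step to the "Stir" sub-block is an assumption.

```python
import hashlib
import struct

MASK_32 = 0xFFFFFFFF


def rotl(x: int, n: int) -> int:
    """32-bit left rotate."""
    return ((x << n) | (x >> (32 - n))) & MASK_32


def f_and_k(t: int, b: int, c: int, d: int):
    """Round function output and round constant for round t (0..79)."""
    if t < 20:
        return ((b & c) | (~b & d)) & MASK_32, 0x5A827999
    if t < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if t < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6


def sha1_hdata(hdata: bytes) -> bytes:
    """SHA-1 of an 84-byte (672-bit) HDATA vector, as two 64-byte chunks."""
    assert len(hdata) == 84
    chunk0 = hdata[:64]
    # 20 message bytes + 0x80 + 35 zero bytes + 64-bit length (672 bits)
    chunk1 = hdata[64:] + b"\x80" + bytes(35) + struct.pack(">Q", 672)
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    for chunk in (chunk0, chunk1):
        w = list(struct.unpack(">16I", chunk))
        for t in range(16, 80):  # word extension
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h
        for t in range(80):      # round counter
            f, k = f_and_k(t, b, c, d)
            temp = (rotl(a, 5) + f + e + k + w[t]) & MASK_32
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        h = [(x + y) & MASK_32 for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h)
```

The output plays the role of the 160-bit HRES value and should match hashlib.sha1(hdata).digest() for any 84-byte input.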
Accumulator Block
It keeps track of the following:
Each UI vector
The accumulated XOR of the previous ACC signal and the new HRES signal
[Block diagram: registers for UI and ACC (160 bits each) and a 2-bit STATE; UI next logic and ACC next logic produce UI_NXT and ACC_NXT (160 bits each). Signals shown: GO, ACC_STROBE, NEXT_STATE, U1 (160 bits), HRES (160 bits).]
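The accumulator's role in computing F can be sketched in Python as follows. This sketch assumes that U1 is the first PRF output, already computed by the host, so that GO loads both UI and ACC with U1 and each ACC_STROBE folds one new HRES into ACC. It uses Python's hmac module as a stand-in for the HashPrep and Hash blocks. The names Accumulator and run_f are hypothetical.

```python
import hashlib
import hmac


class Accumulator:
    """Behavioral model: holds the current UI and the running XOR (ACC)."""

    def __init__(self):
        self.ui = bytes(20)
        self.acc = bytes(20)

    def clock(self, go: int, acc_strobe: int, u1: bytes, hres: bytes) -> None:
        if go:
            # Assumed start behavior: load U1 into both registers.
            self.ui = u1
            self.acc = u1
        elif acc_strobe:
            # The new hash result becomes the next UI and is XORed into ACC.
            self.ui = hres
            self.acc = bytes(a ^ b for a, b in zip(self.acc, hres))


def run_f(password: bytes, u1: bytes, c: int) -> bytes:
    """Drive the accumulator for c - 1 further PRF iterations after U1."""
    acc = Accumulator()
    acc.clock(1, 0, u1, bytes(20))
    for _ in range(c - 1):
        # Stand-in for HashPrep + Hash: one HMAC-SHA1 of the current UI.
        hres = hmac.new(password, acc.ui, hashlib.sha1).digest()
        acc.clock(0, 1, u1, hres)
    return acc.acc
```

Under these assumptions, with U1 = HMAC-SHA1(password, salt || 0x00000001) computed on the host, run_f returns the same 20 bytes as the first output block of PBKDF2.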
Counter Block
Counts the number of inner PRF iterations performed.
When the iteration count equals the count C required by the input, it asserts CNTDONE.
CNTDONE stops the PBKDF2 algorithm.
[Block diagram: next count logic and a 32-bit COUNT register; a 32-bit comparator compares COUNT with C and produces CNTDONE_NXT, which is registered as CNTDONE. Inputs: GO, INCR, C (32 bits).]
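A minimal Python model of the counter is sketched below. It assumes that GO clears COUNT, that INCR adds one, and that CNTDONE is registered from a comparison of the next count value with C; the exact reset and comparison timing in the real design may differ. The class name is hypothetical.

```python
MASK_32 = 0xFFFFFFFF


class Counter:
    """Behavioral model of the 32-bit iteration counter with CNTDONE flag."""

    def __init__(self):
        self.count = 0
        self.cntdone = 0

    def clock(self, go: int, incr: int, c: int) -> int:
        """Advance one clock cycle and return the registered CNTDONE."""
        if go:
            nxt = 0                              # assumed: GO restarts the count
        elif incr:
            nxt = (self.count + 1) & MASK_32
        else:
            nxt = self.count
        self.cntdone = 1 if nxt == (c & MASK_32) else 0  # CNTDONE_NXT
        self.count = nxt
        return self.cntdone
```

With C = 3, CNTDONE first goes high on the clock where the third INCR is applied.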
Control Unit Block
This block functions as a state machine that controls the operation of the other blocks on the chip.
[Block diagram: next state logic, 4-bit STATE register (CLK, RST_N; NEXTSTATE is 4 bits), output logic. Signals shown: INCR, SR_LOAD, HPSTEP, ACC_STROBE, RESULT_READY, IDLE, HRUN, GO, HRDY, OE, CNTDONE.]
Area Budgeting Table
Functional Block        # of FF w/ reset   # of FF w/o reset   # of gates   Area in mm² (Estimated)   Area in mm² (Synthesized, if any)
Input Shift Register    0                  704                 2112         2.550                     N/A
Falling Edge Detector   2                  0                   1            0.056                     N/A
Output Shift Register   0                  160                 480          0.576                     N/A
Counter                 33                 0                   258          0.270                     0.115
Accumulator             322                0                   1452         1.862                     0.860
Control_Unit            4                  0                   96           0.082                     0.026
HashPrep                0                  0                   1856         1.392                     0.525
Hash                    0                  322                 2752         2.064                     N/A
Total                   361                1186                9007         8.852                     1.53 + un-synthesized blocks
Timing Budgeting Table
Start Component         Tp (ns)   Combinational Logic               Tp (ns)   End Component           Tsetup/Tp (ns)   Total Delay (ns)   Clock Period (ns)
Counter Register        0.1       Next Count Logic                  6.4       Counter Register        0.2              6.7                N/A
Control Register        0.1       Control Output/Nextstate Logic    6.8       Counter Register        0.2              7.1                N/A
Accumulator Register    0.1       Hash Prep                         0.6       Hash Block              0.2              0.9                N/A
Falling Edge Detector   0.1       Falling edge/Control Unit         0.8       Control Unit Register   0.2              1.3                N/A
Hash Prep               0.1       Hash Prep/Hash Block              420418    Hash Block              0.2              420418             N/A
Questions
(and Answers!)