Cryptographic Hash functions Hash Functions RIPEMD-160 and
Transcript of Cryptographic Hash functions Hash Functions RIPEMD-160 and
1
Cryptographic Hash Functions and the SHA-3
Competition
Bart Preneel
bart.preneel(AT)esat.kuleuven.be
COSIC/Kath. Univ. Leuven (Belgium)
12
Hash functions
X.509 Annex DMDC-2MD2, MD4, MD5SHA-1
This is an input to a crypto-graphic hash function. The input is a very long string, that is reduced by the hash function to a string of fixed length. There are additional security conditions: it should be very hard to find an input hashing to a given value (a preimage) or to find two colliding inputs (a collision).
1A3FD4128A198FB3CA345932h
RIPEMD-160SHA-256SHA-512
SHA-3
3
Applications
⢠short unique identifier to a stringâ digital signaturesâ data authentication
⢠one-way function of a stringâ protection of passwordsâ micro-payments
⢠confirmation of knowledge/commitment
⢠pseudo-random string generation/key derivation⢠entropy extraction⢠construction of MAC algorithms, stream ciphers, block
ciphers,âŚ
2005: 800 uses of MD5 in Microsoft Windows 4
Agenda
⢠Definitions⢠Iterations (modes)⢠Compression functions⢠MD5 + SHA-{0,1,2}⢠SHA-3 bits and bytes
5
Hash function flavors
cryptographic hash function
MDCMAC
OWHF CRHFUOWHF
(TCR)
this talk
6
Informal definitions (1)
⢠no secret parameters⢠input string x of arbitrary length â output h(x) of
fixed bitlength n⢠computation âeasyâ
⢠One Way Hash Function (OWHF)â preimage resistanceâ 2nd preimage resistance
⢠Collision Resistant Hash Function (CRHF): OWHF +â collision resistant
7
Security requirements (n-bit result)
h
?
h(x)
h
x
h(x)
h
?
h(xâ)
h
?
h
?
=
â
=
preimage 2nd preimage collision
2n 2n 2n/2
â
h(xâ)h(x)
8
Preimage resistance
h
?
h(x)
preimage
2n
⢠in a password file, one does not storeâ (username, password)
⢠butâ (username,hash(password))
⢠this is sufficient to verify a password⢠an attacker with access to the
password file has to find a preimage
9
Second preimage resistance
h
x
h(x)
h
?
h(xâ)=
2nd preimage
2n
â
⢠an attacker can modify x but not h(x)⢠he can only fool the recipient if he
finds a second preimage of x
h(x)
Channel 2: low capacity but secure (= authenticated â cannot be modified)
x
Channel 1: high capacity and insecure
10
Collision resistance (1/2)
hh
x
=
â collision
2n/2
h(xâ)h(x)
⢠hacker Alice prepares two versions of a software driver for the O/S company Bobâ x is correct codeâ xâ contains a backdoor that gives Alice
access to the machine
⢠Alice submits x for inspection to Bob
xâ
⢠if Bob is satisfied, he digitally signs h(x) with his private key
⢠Alice now distributes xâ to users of the O/S; these users verify the signature with Bobâs public key
⢠this signature works for x and for xâ, since h(x) = h(xâ)!
11
Collision resistance (2/2)
hh
x
=
â collision
2n/2
h(xâ)h(x)
⢠in many cryptographic protocols, Alice wants to commit to a value x without revealing it
⢠Alice picks a secret random string r and sends y = h(x || r) to Bob
xâ
⢠in a later phase of the protocol, Alice reveals x and r to Bob and he checks that y is correct
⢠if Alice can find a collision, that is (x,r) and (xâ,râ) with xâ â x she can cheat
⢠if Bob can find a preimage, he can learn x and cheat
12
Brute force (2nd) preimage
⢠multiple target second preimage (1 out of many): â if one can attack 2t simultaneous targets, the effort to find a single
preimage is 2n-t
⢠multiple target second preimage (many out of many): â time-memory trade-off with Î(2n) precomputation and
storage Î(22n/3) time per (2nd) preimage: Î(22n/3) [Hellmanâ80]
⢠answer: randomize hash function with a parameter S (salt, key, spice,âŚ)
13
How to find collisions?
I = space of pairs of messages; size â (2264) 2
C = space of all input messages that collide under h
|C| â 2-n | I |
I
C
Collision search algorithm 1
Pick 2n random message pairs (x,xâ)
For each pair, Prob(h(x)=h(xâ)=2-n)
You expect to find a collision, that is, a non-empty intersection with C
T
14
How to find collisions?
I
C
Collision search algorithm 2
Pick a set R of 2n/2 random messages
Find a collision
You expect to find a collision, that is, a non-empty intersection with C as there are about 2n/2 distinct pairs in R
R
I = space of pairs of messages; size â (2264) 2
C = space of all input messages that collide under h
|C| â 2-n | I |
15
The birthday paradox
⢠given a set with S elements⢠choose r elements at random (with replacements)
with r  S⢠the probability p that there are at least 2 equal
elements (a collision) â 1 - exp (- r(r-1)/2S)⢠more precisely, it can be shown that
â p ⼠1 - exp (- r(r-1)/2S)â if r < â2S then p ⼠0.6 r (r-1)/2S
16
Brute force collision search
⢠Consider the functional graph of hh(x)x h
collision
h(x) h2(x)
xh(x)
h2(x)
17
Functional graph of f(x) = x2 + 7 mod 11
⢠Exercise: why is the indegree of 5 nodes equal to 0 resp. 2?
9 2
74
1
8
510
36
0
18
Brute force collision search
⢠low memory and parallel implementation of the birthday attack [Pollardâ78][Quisquaterâ89][Wiener-van Oorschotâ94]
⢠distinguished point (d bits) â Î(e2n/2 + e 2d+1) steps with e the cost of one
function evaluationâ Î(n2n/2-d) memory l
c
l = c = (Ď/8) 2n/2
h(x)x h
19
Brute force attacks in practice
⢠(2nd) preimage searchâ n = 128: 23 B$ for 1 year if one can attack 240 targets in
parallel
⢠parallel collision search (memoryless!)â n = 128: 1 M$ for 8 hours (or 1 year on 100K PCs)â n = 160: 90 M$ for 1 yearâ need 256-bit result for long term security (30 years or more)
20
Quantum computers
⢠in principle exponential parallelism⢠inverting a one-way function: 2n reduced to 2n/2 [Groverâ96]
⢠collision search:â 2n/3 computation + hardware [Brassard-Hoyer-Tappâ98]â [Bernsteinâ09] classical collision search requires 2n/4 computation
and hardware (= standard cost of 2n/2 )
21
Collision resistance
⢠hard to achieve in practiceâ many attacksâ requires double output length 2n/2 versus 2n
⢠hard to achieve in theoryâ [Simonâ98] one cannot derive collision resistance from âgeneralâ
preimage resistance (there exists no black box reduction)
⢠hard to formalize: requires â family of functions: key, parameter, salt, spice,âŚâ âhuman ignoranceâ trick [Stinsonâ06], [Rogawayâ06]
2122
Relation between properties
[Rogaway-Shrimptonâ04]
[Stinsonâ06]
[Reyhanitabar-Susilo-Muâ10]
23
Properties in practice
⢠collision resistance is not always necessary⢠other properties are needed:
â pseudo-randomness if keyed (with secret key)â near-collision resistanceâ partial preimage resistanceâ multiplication freeness â pseudo-random oracle property
⢠how to formalize these requirements and the relation between them?
24
Agenda
⢠Definitions⢠Iterations (modes)⢠Compression functions⢠MD5 + SHA-{0,1,2}⢠SHA-3 bits and bytes
25
How not to construct a hash function
⢠Divide the message into t blocks xi of n bits each
Message block 1: x1
âMessage block 1: x2
â
Message block 1: xt
=
â
Hash value h(x)
âŚ
26
Hash function: iterated structure
Split messages into blocks of fixed length and hash them block by block with a compression function f
Efficient and elegantBut âŚ
f
x1
IVf
x2
H1f
x3
H2f
x4
H3g
27
Security relation between f and h
⢠iterating f can degrade its securityâ trivial example: 2nd preimage
fx1
IVf
x2
H1f
x3
H2f
x4
H3 g
fx2
IV = H1f
x3
H2f
x4
H3 g
28
Security relation between f and h (2)
⢠solution: Merkle-DamgĂĽrd (MD) strengthening â fix IV, use unambiguous padding and insert length at the end
⢠f is collision resistant â h is collision resistant[Merkleâ89-DamgĂĽrdâ89]
⢠f is ideally 2nd preimage resistant â h is ideally 2nd
preimage resistant [Lai-Masseyâ92] ?
⢠few hash functions have a strong compression function
⢠very few hash functions treat xi and Hi-1 in the same way
29
Security relation between f and h (3)
length extension: if one knows h(x), easy to compute h(x || y) without knowing x
f
x1
IVf
x2
H1f
x3
H2f
x4
H3g
solution: output transformation
fx1
IVf
x2
H1
fx3
H2 H3= h(x)
fx1
IVf
x2
H1
fx3
H2f
y
H3 H4= h(x || y)
30
Selected attacks on MD: 1999-2005
⢠multi-collision attack and impact on concatenation[Jouxâ04]
⢠long message 2nd preimage attack: faster for messages with more than 2n/2 blocks [Dean-Felten-Hu'99], [Kelsey-Schneierâ05]
⢠[âŚ]
31
How (NOT) to strengthen a hash function?[Jouxâ04]
⢠answer: concatenation⢠h1 (n1-bit result) and h2 (n2-bit result)
h2h1
g(x) = h1(x) || h2(x)
⢠intuition: the strength of g against collision/(2nd) preimage attacks is the product of the strength of h1 and h2
â if both are âindependentâ
⢠BUT: collision attack requires at most n1 . 2n2/2 + 2n1/2 << 2(n1 + n2)/2
= as most as strong as the strongest of the 2
32
Summary
33
Improving MD iteration
salt + output transformation + counter + wide pipe
f
x1
IVf
x2
H1
f
x3
H2
f
x4
H3 g
1
salt salt salt salt salt
|x|
security reductions better understoodmany more results on property preservation
2 3 4
2n2n 2n 2n 2n n
34
Improving MD iteration
⢠degradation with use: salting (family of functions, randomization)
⢠extension attack + PRO preservation: strong output transformation g (which includes total length and salt)
⢠long message 2nd preimage: preclude fix pointsâ counter f â fi [Biham-Dunkelman]
⢠multi-collisions, herding: avoid breakdown at 2n/2 with larger internal memory: known as wide pipeâ e.g., extended MD4, RIPEMD, [Lucksâ05]
35
Agenda
⢠Definitions⢠Iterations (modes)⢠Compression functions⢠SHA-{0,1,2}⢠SHA-3 bits and bytes
36
Block cipher (EK) based
Davies-Meyer
xi E
Hi-1
Hi
Miyaguchi-Preneel
xi E
Hi-1
Hi
⢠output length = block length
⢠12 secure compression functions in ideal cipher model
⢠requires 1 key schedule per encryption
37
Permutation (Ď) based: sponge type
Examples: Panama, RadioGatun, Grindahl, Keccak (no buffer = real sponge)
x1
Ď
H10
H20
x2
Ď
x3
Ď
x4
Ď Ď Ď Ď
h1
Ď
h2
absorb buffer squeeze
38
Permutation (Ď) based
small permutation
JHxi
ĎH1i-1 H1i
H2iH2i-1Hi
Grøstl
xi
Ď2Hi-1
Ď1
39
Iteration modes
⢠security of simple modes well understood⢠powerful tools available
⢠analysis of slightly more complex schemes very difficult
⢠MD versus sponge is still open debate
40
Agenda
⢠Definitions⢠Iterations (modes)⢠Compression functions⢠MD5 + SHA-{0,1,2}⢠SHA-3 bits and bytes
41
Hash function history 101
1980
1990
2000
2010
HAR
DW
ARE
SO
FTW
ARE
DES
AES
single block length
double block length
permu-tations
RSA
ad hoc schemes
security reduction for factoring, DLOG, lattices
MD2 MD4 MD5
SHA-1
RIPEMD-160
SHA-2
Whirlpool
SHA-3
SNEFRU
Dedicated
42
Performance of hash functions [Bernstein](cycles/byte) AMD Intel Pentium D 2992 MHz (f64)
0
5
10
15
20
25
30
35
40
45
MD4 SHA-1 DES SHA-512
AESMD5 RMD-160
SHA-256
Whirl-pool
AES- hash(estimated)
2001
43
MDx-type hash function history
MD5
SHA
SHA-1
SHA-256SHA-512
HAVAL
Ext. MD4
RIPEMD
RIPEMD-160
MD4 90
91
92
93
9495
0244
The complexity of collision attacks
0102030405060708090
1992
1992
1994
1996
1998
2000
2002
2004
2006
2008
2010
MD4MD5SHA-0SHA-1Brute force
brute force: 1 million PCs (1 year) or US$ 100,000 hardware (4 days)
45
MD5 [Rivestâ91]4 rounds of 16 steps
A0 B0 C0 D0
A1 B1 C1 D1
A16 B16 C16 D16
x0
x15
A17 B17 C17 D17
A32 B32 C32 D32xp(15)
xp(0)
A33 B33 C33 D33
A48 B48 C48 D48xq(15)
xq(0)
A49 B49 C49 D49
A64 B64 C64 D64xr(15)
xr(0)
âŚ
âŚ
âŚ
âŚf
f
g
g
h
h
j
j
+
H i-1
H i
xi
Ki
46
MD5
⢠pseudo-collisions [denBoer-Bosselaersâ93] ⢠collisions for compression function [Dobbertinâ96]
⢠collisions for hash functionâ [Wang+â04] â 15 minutesâ âŚâ [Stevens+â09] â millisecondsâ brute force (264): 1M$ 8 hours in 2010
⢠2nd preimage in 2123 [Sasaki-Aokiâ09]
47
MD5
⢠advice (RIPE since â92, RSA since â96): stop using MD5
⢠largely ignored by industry until 2009 (click on a cert...)
48
SHA-1
0102030405060708090
2003 2004 2005 2006 2007 2008 2009 2010
SHA-1
[Wang+â04]
[Wang+â05][Mendel+â08]
[McDonald+â09]
[Manuel+â09]
Most attacks unpublished/withdrawn
[Sugita+â06]
log2 complexity
prediction: collision for SHA-1 in the next 12-18 months
49
NIST and SHA-1
50
Impact of collisions
⢠collisions for MD5, SHA-0, SHA-1â 2 messages differ in a few bits in 1 to 3 512-bit input blocksâ limited control over message bits in these blocksâ but arbitrary choice of bits before and after them
⢠what is achievable for MD5?â 2 colliding executables/postscript/gif/âŚ[Lucks-Daumâ05]â 2 colliding RSA public keys â thus with colliding X.509 certificates
[Lenstra+â04]â chosen prefix attack: different IDs, same certificate [Stevens+â07]â 2 arbitrary colliding files (no constraints) in 8 hours for 1 M$
51
Rogue CA attack[Sotirov-Stevens-Appelbaum-Lenstra-Molnar-Osvik-de Weger â08]
Self-signed root key
CA1 CA2 Rogue CA
User1 User2 User x
⢠request user cert; by special collision this results in a fake CA cert (need to predict serial number + validity period)
⢠6 CAs have issued certificates signed with MD5 in 2008:â Rapid SSL, Free SSL (free trial certificates offered by RapidSSL), TC TrustCenter
AG, RSA Data Security, Verisign.co.jp
⢠6 CAs have issued certificates signed with MD5 in 2008:â Rapid SSL, Free SSL (free trial certificates offered by RapidSSL), TC TrustCenter
AG, RSA Data Security, Verisign.co.jp
impact: rogue CAthat can issue certsthat are trusted by all browsers
impact: rogue CAthat can issue certsthat are trusted by all browsers
52
Impact of MD5 collisions
⢠digital signatures: only an issue if for non-repudiation
⢠none for signatures computed before attacks were public (1 August 2004)
⢠substantial for signatures after 1 August 2005 (cf. traffic tickets in Australia)
⢠can be problematic for certificates (if opponent has more some control over public key and over collision)
53
And (2nd) preimages?⢠security degrades with number of applications⢠for large messages even with the number of
blocks (cf. supra)⢠specific results:
â MD2: 273 [Knudsen+09]â MD5: 2123 [Sasaki-Aokiâ09]â SHA-0: 52 of 80 steps in 2156.6 [Aoki-Sasakiâ09]â SHA-1: 48 of 80 steps in 2159.3 [Aoki-Sasakiâ09]
most applications require (2nd) preimage resistance only54
HMAC
⢠HMAC keys through the IV (plaintext) â collisions for MD5 invalidate current security proof of HMAC-MD5
Rounds in f2 Rounds in f1 Data complexity
MD4 48 48 272 CP + 277 timeMD5 64 33 of 64 2126.1 CPMD5 64 64 251 CP & 2100 time (RK)SHA-0 80 80 2109 CPSHA-1 80 53 of 80 298.5 CP
f2
f1
xK1
K2
55
Upgrades
⢠RIPEMD-160 is good replacement for SHA-1
⢠upgrading algorithms is always hard
⢠TLS uses MD5 || SHA-1 to protect algorithm negotiation
⢠upgrading negotiation algorithm is even harder: need to upgrade TLS 1.1 to TLS 1.2
56
SHA-2 [NISTâ02]
⢠SHA-224, SHA-256, SHA-384, SHA-512â non-linear message expansionâ more complex operationsâ 64/80 stepsâ SHA-384 and SHA-512: 64-bit architectures
⢠SHA-256 collisions: 24/64 steps [Sanadhya-Sarkarâ08]
⢠SHA-256 preimages: 43/64 steps [Aoki+â09]
⢠implementations today faster than anticipated
⢠adoptionâ industry may migrate to SHA-2 by 2011 or may wait for SHA-3 â very slow for TLS/IPsec (no pressing need)
57
Agenda
⢠Definitions⢠Iterations (modes)⢠Compression functions⢠SHA-{0,1,2}⢠SHA-3 bits and bytes
58
NIST AHS competition (SHA-3)
⢠SHA-3 must support 224, 256, 384, and 512-bit message digests, and must support a maximum message length of at least 264 bits
6451
145 1
020406080
Q4/08 Q3/09 Q4/10 Q3/12
round 1 round 2 final
Call: 02/11/07
Deadline (64): 31/10/08
Round 1 (51): 9/12/08
Round 2 (14): 24/7/09
Standard: Q3/2012
59
The Candidates
Slide credit: Christophe De Cannière
60
Preliminary Cryptanalysis
Slide credit: Christophe De Cannière
61
End of Round 1 Candidates
a
Slide credit: Christophe De Cannière
62
Round 2 Candidates
a
Slide credit: Christophe De Cannière
63
Properties: bits and bytes[Watanabeâ10]
64
Compression function/iteration
SpongeSpongeSponge
2-permutationSponge
Sponge
Permutation MD/HAIFABlock cipher
JH-specificJH
Luffa
MD/TreeDavies-MeyerSkeinMDPGV variantSIMD
HAIFADavies-MeyerShavite-3Shabal
Keccak
HamsiMDGrøstl
FugueHAIFAECHO
CubehashMDPGV variantBMW
HAIFABlake
65
Proofs
⢠Compression functions (collisions and preimages)â 4/14 weak (by design)â 5/14 have a reduction proof
⢠Hash functionsâ Collisions: most functions have a preservation proof, but not
always tightâ Second preimage: few have a preservation proofâ PRO (Pseudo-random oracle): most have a preservation proof
66
Security Reductions[Mennink-Andreeva-Preneelâ10]
67
Security: SHA-3 Zoohttp://ehash.iaik.tugraz.at/wiki/The_SHA-3_Zoo
68
Software benchmarking[Bernsteinâ10]
69
Software performance[Bernstein10] http://bench.cr.yp.to/ebash.html
cycles/byte on 3.2 GHz, AMD Phenom II X6 1090T (100fa0)
0
10
20
30
40
50
60
Blake ECHO Hamsi Luffa Simd
512/256-bit hash
64-bit machine so 512-bit version is oftenfaster
BMWCubehash
FugeGroestl
JHKeccak
ShabalShavite-3
SkeinSHA-2
SHA-2
70
0
5
10
15
20
25
0 50 100 150
Hardware Performance[Tillich+â09] IACR ePrint 2009/510
Luffa
Grøstl
Skein
Keccak
Size (kGe)
Throughput (Gbps)
71
Issues arisen during Round 1
⢠round 1 was very short; several functions received no outside analysis
⢠security: â controversy around pseudo-collision attacks and memory
requirementsâ proofs have not helped much to survive
72
Issues arisen during Round 2
⢠security: â few real attacks but some weaknessesâ new design ideas harder to validateâ very few provable properties
⢠performance: roughly as fast or faster than SHA-2â SHA-2 gets faster every dayâ widely different results for hardware and software
⢠software: large difference between high end and embedded⢠hardware: FGPA and ASIC
⢠diversity = third criterion for the final
⢠NIST expects that SHA-2 and SHA-3 will co-exist
73
SHA-4?
⢠an open competition such as SHA-3 is bound to result in new insights between 2009-2012
⢠only few of these can be incorporated using âtweaksâ
⢠the winner selected in 2012 will reflect the state of the art in October 2008
⢠nevertheless, it is unlikely that we will have a SHA-4 competition before 2030
74
Hash functions: conclusions
⢠Fascinating research area with high impact⢠SHA-1 would have needed 128-160 steps instead of 80⢠2004-2009 attacks: cryptographic meltdown but not
dramatic for most applicationsâ clear warning: upgrade asap
⢠theory is developing for more robust iteration modes and extra features; still early for building blocks
⢠many challenges remain in cryptanalysis⢠Nirwana: efficient hash functions with security reduction