User Authentication and Cryptographic Primitives

Brad Karp, UCL Computer Science

CS GZ03 / M030, 11th November, 2008

Outline

• Authenticating users
  – Local users: hashed passwords
  – Remote users: s/key
  – Unexpected covert channel: the Tenex password-guessing attack
• Symmetric-key cryptography
• Public-key cryptography usage model
• RSA algorithm for public-key cryptography
  – Number theory background
  – Algorithm definition
  – Cryptographic strength of RSA
  – Ease of misusing RSA

Authentication of Local Users

• Goal: only file’s owner can access file
• UNIX authentication policy:
  – Each file has an owner principal: an integer user ID
  – Each file has associated owner permissions (read, write, execute, &c.)
  – Each process runs with integer user ID; can access file as owner only if it matches file’s owner user ID
  – OS assigns user ID to user’s shell process at login time, authenticated by username and password
  – Shell process creates new child processes with same user ID
• How does UNIX know the correspondence among <username, user ID, password>, for all users?

Straw Man: Plaintext Password Database

• Keep password database in a file, e.g.:
    bkarp:3715:secretpw
    mjh:4212:multicast
• Passwords stored in file in plaintext
• Make file readable only by privileged superuser (root)
• /bin/login program prompts for usernames and passwords on console; runs as root, so can read password database
• How well does this scheme meet original goal?

Cryptographic Primitive: Cryptographic Hash Function

• Don’t want someone who sees the password database to learn users’ passwords
• Cryptographic hash function, y = H(x), such that:
  – H() is preimage-resistant: given y, and with knowledge of H(), computationally infeasible to recover x
  – H() is second-preimage-resistant: given x with H(x) = y, computationally infeasible to find x’ ≠ x s.t. H(x’) = y
• Widely used cryptographic hash functions:
  – MD-5: output is 128 bits, broken
  – SHA-1: output is 160 bits; on verge of being broken
  – SHA-256: output is 256 bits, best current practice

Better Plan: Hashed Password Database

• Keep password database in a file:
    bkarp:3715:Xc8zOP0ZHJkp
    mjh:4212:p6FsAtQl4cwi
• Instead of password plaintext x, store H(x)
• Make file readable by all (!)
• One-wayness of H() means no one can recover x from H(x), right?
  – WRONG! Users choose memorable passwords…

Insight: Counting Possible Passwords

• If users pick random n-character passwords using c possible characters, how many guesses are expected to guess one password?
  – $c^n / 2$
  – e.g., 8 characters, each ~90 possibilities: $90^8 / 2 \approx 2.15 \times 10^{15}$
• Do users pick random passwords?
  – Of course not; very hard to remember
  – Common choice: word in native language
• How many words in common use in modern English?
  – 50,000-70,000 (or far fewer, if you read Metro)
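The gap between the two search spaces can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `expected_guesses` is illustrative, and the dictionary figure assumes the password is drawn uniformly from 50,000 words.

```python
def expected_guesses(c: int, n: int) -> float:
    """Expected guesses to hit one uniformly random n-char password over c symbols."""
    return c ** n / 2

random_password = expected_guesses(90, 8)   # roughly 2.15e15
dictionary_word = 50_000 / 2                # assumed: one of 50K common words

print(f"random: {random_password:.3g} guesses")
print(f"dictionary: {dictionary_word:.3g} guesses")
print(f"ratio: {random_password / dictionary_word:.3g}")
```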

Dictionary Attack on Hashed Password Databases

• Suppose hacker obtains copy of password file (until recently, world-readable on UNIX)
• Compute H(x) for 50K common words
• String compare resulting hashed words against passwords in file
• Learn all users’ passwords that are common English words after only 50K computations of H(x)!
• Same hashed dictionary works on all password files in world!

Salted Password Hashes

• Generate a random string of bytes, r
• For user password x, store [H(r,x), r] in password file
• Result: same password produces different result on every machine
  – So must see password file before can hash dictionary
  – …and single hashed dictionary won’t work for multiple hosts
• Modern UNIX: password hashes salted; hashed password database readable only by root

Dictionary attack still possible after attacker sees password file! Users should pick passwords that aren’t close to dictionary words.
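The salted scheme, and the dictionary attack that remains possible once the file is seen, can be sketched as follows. This assumes H(r, x) is SHA-256 over the salt followed by the password and a 16-byte salt; both are choices made for illustration, not what any particular UNIX uses, and the helper names are hypothetical.

```python
import hashlib
import hmac
import os

def H(r: bytes, x: str) -> bytes:
    """H(r, x): hash of salt r followed by password x (assumed construction)."""
    return hashlib.sha256(r + x.encode("utf-8")).digest()

def make_entry(password: str):
    """Return the pair [H(r, x), r] that would be stored in the password file."""
    r = os.urandom(16)
    return H(r, password), r

def check(password: str, entry) -> bool:
    digest, r = entry
    return hmac.compare_digest(H(r, password), digest)

def dictionary_attack(entry, words):
    """Attacker who has seen the entry must re-hash the dictionary with its salt."""
    digest, r = entry
    for w in words:
        if H(r, w) == digest:
            return w
    return None
```

Two entries for the same password differ because their salts differ, so a dictionary hashed once in advance is useless; the attacker has to redo the work per entry.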

Tenex Password Attack: An Information Leak

• Tenex OS stored directory passwords in plaintext
• OS supported system call:
  – pw_validate(directory, pw)
• Implementation simply compared pw to stored password in directory, char by char
• Clever attack:
  – Make pw span two VM pages, put 1st char of guess in first page, rest of guess in second page
  – See whether get a page fault: if not, try next value for 1st char, &c.; if so, first char correct!
  – Now position 2nd char of guess at end of 1st page, &c.
  – Result: guess password in time linear in length!

Lessons: Don’t store passwords in cleartext. Information leaks are real, and can be extremely difficult to find and eliminate.
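The attack can be simulated without real virtual memory. In this sketch the page fault is modelled as a boolean returned by a hypothetical `pw_validate` stand-in; the `resident` parameter (how many leading characters of the guess sit on the page already in memory) is an assumption of the model, not part of the Tenex interface.

```python
import string

def make_pw_validate(secret):
    """Hypothetical stand-in for the Tenex call: compares char by char.

    Touching any guess character at index >= resident is reported as a
    page fault, mimicking a guess that straddles two pages.
    """
    def pw_validate(guess, resident):
        faulted = False
        for i, ch in enumerate(secret):
            if i >= len(guess):
                return False, faulted
            if i >= resident:
                faulted = True          # comparison reached the second page
            if guess[i] != ch:
                return False, faulted
        return len(guess) == len(secret), faulted
    return pw_validate

def recover(pw_validate, alphabet, max_len=32):
    """Recover the password one character at a time using the fault signal."""
    known = ""
    while len(known) < max_len:
        for c in alphabet:
            ok, _ = pw_validate(known + c, len(known) + 1)
            if ok:
                return known + c        # whole password matched
            # One filler char lies on the second page; a fault means the
            # comparison got past position len(known), so c is correct.
            _, faulted = pw_validate(known + c + "?", len(known) + 1)
            if faulted:
                known += c
                break
        else:
            return None                 # no character in the alphabet fits
    return None
```

With a 26-letter alphabet and a 5-character password, the loop makes at most 2 × 26 × 5 calls, rather than the 26^5 a blind search would need.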

Remote User Authentication

• Consider the case where Alice wants to log in remotely, across LAN or WAN from server
• Suppose network links can be eavesdropped by adversary, Eve
• Want scheme immune to replay: if Eve overhears messages, shouldn’t be able to log in as Alice by repeating them to server
• Clear non-solutions:
  – Alice logs in by sending {alice, password}
  – Alice logs in by sending {alice, H(password)}

Remote User Authentication (2)

• Desirable properties:
  – Message from Alice must change unpredictably at each login
  – Message from Alice must be verifiable at server as matching secret value known only to Alice

• Can we achieve these properties using only a cryptographic hash function?

Remote User Authentication: s/key

• Denote by $H^n(x)$ n successive applications of cryptographic hash function H() to x
  – i.e., $H^3(x) = H(H(H(x)))$
• Store in server’s user database:
    alice:99:$H^{99}$(password)
• At first login, Alice sends:
    {alice, $H^{98}$(password)}
  – Server accepts if H() of the received value equals the stored $H^{99}$(password)
• Server then updates its database to contain:
    alice:98:$H^{98}$(password)
• At next login, Alice sends:
    {alice, $H^{97}$(password)}
  – and so on…
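The hash-chain login can be sketched in a few lines. This assumes H() is SHA-256; the class and method names (`SKeyServer`, `enroll`, `login`) are illustrative and not taken from the real s/key software.

```python
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def H_n(x: bytes, n: int) -> bytes:
    """n successive applications of H to x."""
    for _ in range(n):
        x = H(x)
    return x

class SKeyServer:
    def __init__(self):
        self.db = {}                      # user -> (n, H^n(password))

    def enroll(self, user, n, top):
        self.db[user] = (n, top)

    def login(self, user, token):
        n, stored = self.db[user]
        if n == 0 or H(token) != stored:
            return False                  # logins used up, or token does not hash to stored value
        self.db[user] = (n - 1, token)    # next login must present the preimage of this token
        return True
```

An eavesdropper who replays a captured token fails, because after a successful login the server expects a value one step further down the chain, and computing it would require inverting H().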

Properties of s/key

• Just as with any hashed password database, Alice must store her secret on the server securely (best if physically at server’s console)

• Alice must choose total number of logins at time of storing secret

• When logins all “used”, must store new secret on server securely again

Secrecy through Symmetric Encryption

• Two functions: E() encrypts, D() decrypts
• Parties share secret key K
• For message M:
  – $E(K, M) \rightarrow C$
  – $D(K, C) \rightarrow M$
• M is plaintext; C is ciphertext
• Goal: attacker cannot derive M from C without K

Idealized Symmetric Encryption: One-Time Pad

• Secretly share a truly random bit string P at sender and receiver
• Define $\oplus$ as bit-wise XOR
• $C = E(M) = M \oplus P$
• $M = D(C) = C \oplus P$
• Use bits of P only once; never use them again!

Stream Ciphers: Pseudorandom Pads

• Generate pseudorandom bit sequence (stream) at sender and receiver from short key
• Encrypt and decrypt by XOR’ing message with sequence, as with one-time pad
• Most widely used stream cipher: RC4
• Again, never, ever re-use bits from pseudorandom sequence!
• What’s wrong with reusing the stream?
  – Alice → Server: $c_1$ = E(s, “Visa card number”)
  – Server → Alice: $c_2$ = E(s, “Transaction confirmed”)
  – Suppose Eve hears both messages
  – Eve can compute: $m = c_1 \oplus c_2 \oplus$ “Transaction confirmed”
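The effect of reusing pad bits can be reproduced directly. In this sketch the shared stream is just random bytes from `os.urandom` standing in for a cipher's keystream (it is not RC4), and the card number is a made-up placeholder chosen to match the confirmation message's length.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    """Bit-wise XOR of two byte strings (truncates to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

stream = os.urandom(21)                   # stand-in for the keystream s

m1 = b"Visa 0000111122223333"             # Alice -> Server (placeholder number)
m2 = b"Transaction confirmed"             # Server -> Alice, guessable by Eve

c1 = xor(m1, stream)
c2 = xor(m2, stream)                      # same stream bits reused: the mistake

# Eve never sees the stream, yet it cancels out of c1 XOR c2.
recovered = xor(xor(c1, c2), b"Transaction confirmed")
print(recovered)
```

The stream cancels because c1 ⊕ c2 = m1 ⊕ m2, so knowing (or guessing) either plaintext reveals the other.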

Symmetric Encryption: Block Ciphers

• Divide plaintext into fixed-size blocks (typically 64 or 128 bits)

• Block cipher maps each plaintext block to same-length ciphertext block

• Best today to use AES (others include Blowfish, DES, …)

• Of course, message of arbitrary length; how to encrypt message of more than one block?

Using Block Ciphers: ECB Mode

• Electronic Code Book method
• Divide message M into blocks of cipher’s block size
• Simply encrypt each block individually using the cipher
• Send each encrypted block to receiver
• Presume cipher provides secrecy, so attacker cannot decrypt any block
• Does ECB mode provide secrecy?

Avoid ECB Mode!

• ECB mode does not provide robust secrecy!
• What if there are repeated blocks in the plaintext? Repeated as-is in ciphertext!
• What if sending sparse file, with long runs of zeroes? Non-zero regions obvious!
• WW II U-Boat example (Bob Morris):
  – Each day at same time, when no news, send encrypted message: “Nichts zu melden.”
  – When there’s news, send the news at that time.
  – Obvious when there’s news
  – Many, many ciphertexts of same known plaintext made available to adversary for cryptanalysis; a worry even if encryptions of same plaintext produce different ciphertexts!

Using Block Ciphers: CBC Mode

• Better plan: make encryption of each block depend on the previous ciphertext block, with the first block depending on an initialization vector known to receiver
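The difference between the two modes shows up on a message of repeated blocks. The sketch below uses a toy 4-round Feistel construction with SHA-256 as the round function as a stand-in block cipher; it is an assumption made so the example runs with the standard library only, is not AES, and must not be used for real secrecy. Messages are assumed to be a multiple of the block size (no padding).

```python
import hashlib
import os

BLOCK = 16  # bytes per block

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, i, half):
    """Toy round function: truncated SHA-256 of key, round index, half-block."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def decrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

def split(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key, msg):
    return [encrypt_block(key, b) for b in split(msg)]

def cbc_encrypt(key, iv, msg):
    out, prev = [], iv
    for b in split(msg):
        prev = encrypt_block(key, _xor(b, prev))   # chain in previous ciphertext
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(_xor(decrypt_block(key, c), prev))
        prev = c
    return b"".join(out)
```

Encrypting three all-zero blocks gives three identical ciphertext blocks under `ecb_encrypt`, while `cbc_encrypt` produces three different ones.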

Integrity with Symmetric Crypto: Message Authentication Codes

• How does receiver know if message modified en route?
• Message Authentication Code:
  – Sender and receiver share secret key K
  – On message M, v = MAC(K, M)
  – Attacker cannot produce valid {M, v} without K
• Append MAC to message for tamper-resistance:
  – Sender sends {M, MAC(K, M)}
  – M could be ciphertext, M = E(K’, m)
  – Receiver of {M, v} can verify that v = MAC(K, M)
• Beware replay attacks: replay of prior {M, v} by Eve!

HMAC: A MAC Based on Cryptographic Hash Functions

• $\mathrm{HMAC}(K, M) = H\big((K \oplus \mathrm{opad}) \,.\, H((K \oplus \mathrm{ipad}) \,.\, M)\big)$
• where:
  – . denotes string concatenation
  – ipad = 64 repetitions of 0x36
  – opad = 64 repetitions of 0x5c
  – H() is a cryptographic hash function, like SHA-256
• Fixed-size output, even for long messages
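The HMAC construction can be written out by hand and compared against Python's standard `hmac` module. The sketch assumes SHA-256 with its 64-byte block size; keys longer than one block are hashed first and shorter keys are zero-padded, as the HMAC specification (RFC 2104) requires. In that specification ipad is the byte 0x36 and opad is the byte 0x5c, each repeated to fill a block.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    block = 64                                   # SHA-256 block size in bytes
    if len(key) > block:
        key = hashlib.sha256(key).digest()       # long keys are hashed first
    key = key.ljust(block, b"\x00")              # then zero-padded to one block
    k_ipad = bytes(k ^ 0x36 for k in key)        # K XOR ipad
    k_opad = bytes(k ^ 0x5C for k in key)        # K XOR opad
    inner = hashlib.sha256(k_ipad + msg).digest()
    return hashlib.sha256(k_opad + inner).digest()

tag = hmac_sha256(b"shared key", b"Transaction confirmed")
print(tag.hex())
```

In practice one would call `hmac.new(key, msg, hashlib.sha256)` and compare tags with `hmac.compare_digest`; the manual version is only there to make the two nested hashes visible.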

Public-Key Encryption: Interface

• Two keys:
  – Public key: $K$, published for all to see
  – Private (or secret) key: $K^{-1}$, kept secret
• Encryption: $E(K, M) \rightarrow \{M\}_K$
• Decryption: $D(K^{-1}, \{M\}_K) \rightarrow M$
• Provides secrecy, like symmetric encryption:
  – Can’t derive $M$ from $\{M\}_K$ without knowing $K^{-1}$
• Same public key used by all to encrypt all messages to same recipient
  – Can’t derive $K^{-1}$ from $K$

Number Theory Background: Modular Arithmetic Primer (1)

• Recall the “mod” operator: returns remainder left after dividing one integer by another, the modulus
  – e.g., 15 mod 6 = 3
• That is: $a \bmod n = r$, which just means $a = kn + r$ for some integers k and r
• Note that $0 \le r < n$

Modular Arithmetic Primer (2)

• In modular arithmetic, constrain range of integers to be only the residues [0, n-1], for modulus n
  – e.g., (12 + 13) mod 24 = 1
  – We may also write $12 + 13 \equiv 1 \pmod{24}$
• Modular arithmetic retains familiar properties: commutative, associative, distributive
• Same results whether mod taken at each arithmetic operation, or only at end, e.g.:
    (a + b) mod n = ((a mod n) + (b mod n)) mod n
    (ab) mod n = ((a mod n)(b mod n)) mod n

Modular Arithmetic: Advantages

• Limits precision required: working mod n, where n is k bits long, any single arithmetic operation yields at most 2k bits
  – …so results of even seemingly expensive ops, like exponentiation ($a^x$), fit in same number of bits as original operand(s)
  – Lower precision means faster arithmetic
• Some operations in modular arithmetic are computationally very difficult:
  – e.g., computing discrete logarithms: find integer $x$ s.t. $a^x \equiv b \pmod{n}$

Cryptography leverages “difficult” operations; want reversing encryption without key to be computationally intractable!
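Both points can be illustrated with a short sketch: exponentiation that reduces mod n after every multiplication never handles a number wider than 2k bits, while the obvious way to undo it (trying every exponent) takes time proportional to n. The function names are illustrative, and the brute-force search is only the naive method, not the best known discrete-log algorithm.

```python
def modexp(a: int, x: int, n: int):
    """Square-and-multiply for n > 1, reducing mod n after every multiplication.

    Returns (a**x mod n, widest intermediate product in bits).
    """
    result, widest = 1, 0
    a %= n
    while x > 0:
        if x & 1:
            prod = result * a
            widest = max(widest, prod.bit_length())
            result = prod % n
        square = a * a
        widest = max(widest, square.bit_length())
        a = square % n
        x >>= 1
    return result, widest

def discrete_log(a: int, b: int, n: int):
    """Naive search for x with a**x = b (mod n); up to n steps."""
    value = 1
    for x in range(n):
        if value == b % n:
            return x
        value = value * a % n
    return None
```

For a 61-bit modulus, `modexp` needs on the order of a hundred multiplications even for an exponent near 10^18, whereas `discrete_log` on the same modulus would need on the order of 2^61 steps.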

Modular Arithmetic: Inverses (1)

• In real arithmetic, every non-zero integer has a multiplicative inverse (its reciprocal), and their product is 1
  – e.g., $7x = 1 \Rightarrow x = 1/7$
• What does an inverse in modular arithmetic (say, mod 11) look like? $7x \equiv 1 \pmod{11}$
  – that is, $7x = 11k + 1$ for some x and k
  – so x = 8 (where k = 5)

Aside: Prime Numbers

• Recall: prime number is integer > 1 that is evenly divisible only by 1 and itself
• Two integers a and b are relatively prime if they share no common factors but 1; i.e., if gcd(a, b) = 1
• There are infinitely many primes
• Large primes (512 bits and longer) figure prominently in public-key cryptography

Modular Arithmetic: Inverses (2)

• In general, finding modular inverse means finding x s.t. $x \equiv a^{-1} \pmod{n}$
• Does modular inverse always exist?
  – No! Consider $x \equiv 2^{-1} \pmod{8}$
• In general, when a and n are relatively prime, modular inverse x exists and is unique
• When a and n not relatively prime, x doesn’t exist
• When n prime, all of [1…n-1] relatively prime to n, and have an inverse in that range

Algorithm to find modular inverse: extended Euclidean Algorithm. Tractable; requires O(log n) divisions.
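A minimal sketch of the extended Euclidean algorithm and the modular inverse built on it follows; the function names `egcd` and `modinv` are illustrative.

```python
def egcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    s0, s1, t0, t1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b       # one division per iteration
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0

def modinv(a: int, n: int) -> int:
    """Inverse of a mod n; exists only when a and n are relatively prime."""
    g, s, _ = egcd(a % n, n)
    if g != 1:
        raise ValueError(f"{a} has no inverse mod {n}")
    return s % n

print(modinv(7, 11))   # the example from the slides: 7 * 8 = 56 = 5 * 11 + 1
```

Calling `modinv(2, 8)` raises `ValueError`, matching the case where a and n share a factor.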

Euler’s Phi Function: Efficient Modular Inverses on Relative Primes

• $\varphi(n)$ = number of positive integers < n that are relatively prime to n
• If n prime, $\varphi(n) = n-1$
• If n = pq, where p and q are distinct primes: $\varphi(n) = (p-1)(q-1)$
• If a and n relatively prime, Euler’s generalization of Fermat’s little theorem: $a^{\varphi(n)} \bmod n = 1$
• and thus, to find inverse x s.t. $x = a^{-1} \bmod n$: $x = a^{\varphi(n)-1} \bmod n$
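These identities are easy to check numerically for small moduli. The sketch below computes φ(n) by brute-force counting, which is only feasible for tiny n and is used here purely for illustration; the function names are hypothetical.

```python
from math import gcd

def phi(n: int) -> int:
    """Count positive integers below n that are relatively prime to n."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)

def inverse_via_phi(a: int, n: int) -> int:
    """Inverse of a mod n computed as a**(phi(n) - 1) mod n."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be relatively prime")
    return pow(a, phi(n) - 1, n)

print(phi(11), phi(15))          # prime n, and n = 3 * 5
print(inverse_via_phi(7, 11))    # agrees with the inverse found earlier: 8
```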

RSA Algorithm (1)

• [Rivest, Shamir, Adleman, 1978]
• Recall that public-key cryptosystems use two keys per user:
  – $K$, the public key, made available to all
  – $K^{-1}$, the private key, kept secret by user

RSA Algorithm (2)

• Choose two random, large primes, p and q, of equal length, and compute n = pq
• Randomly choose encryption key e, s.t. e and (p-1)(q-1) are relatively prime
• Use extended Euclidean algorithm to compute d, s.t. $d = e^{-1} \bmod ((p-1)(q-1))$
• Public key: $K = (e, n)$
• Private key: $K^{-1} = d$
• Discard p and q

RSA Algorithm (3)

• Encryption:
  – Divide message M into blocks $m_i$, each shorter than n
  – Compute ciphertext blocks $c_i$ with: $c_i = m_i^e \bmod n$
• Decryption:
  – Recover plaintext blocks $m_i$ with: $m_i = c_i^d \bmod n$
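The key generation, encryption, and decryption steps can be traced with toy numbers. The primes 61 and 53 and the exponent 17 are chosen here only so the values are small enough to follow by hand; real keys use primes hundreds of digits long plus padding. `pow(e, -1, m)` needs Python 3.8 or later.

```python
p, q = 61, 53                      # toy primes (assumed for illustration)
n = p * q                          # modulus, part of the public key
phi_n = (p - 1) * (q - 1)
e = 17                             # relatively prime to (p-1)(q-1)
d = pow(e, -1, phi_n)              # d = e^-1 mod (p-1)(q-1)

def encrypt(m: int) -> int:
    if not 0 <= m < n:
        raise ValueError("block must be smaller than n")
    return pow(m, e, n)            # c = m^e mod n

def decrypt(c: int) -> int:
    return pow(c, d, n)            # m = c^d mod n

c = encrypt(65)
print(n, d, c, decrypt(c))
```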

Why Does RSA Decryption Recover Original Plaintext?

• Observe that $c_i^d = (m_i^e)^d = m_i^{ed}$
• Note that $ed \equiv 1 \pmod{(p-1)(q-1)}$, because e and d are inverses mod (p-1)(q-1)
• So:
  – $ed \equiv 1 \pmod{p-1}$, and thus $ed = k(p-1)+1$
  – $ed \equiv 1 \pmod{q-1}$, and thus $ed = h(q-1)+1$
• Consider case where $m_i$ and p are relatively prime:
  – $m_i^{p-1} \equiv 1 \pmod{p}$, by Euler’s generalization of Fermat’s little theorem
  – so $m_i^{ed} = m_i^{k(p-1)+1} = m_i \, (m_i^{p-1})^k \equiv m_i \pmod{p}$
• And case where $m_i$ a multiple of p: $m_i^{ed} \equiv 0^{ed} = 0 \equiv m_i \pmod{p}$
• Thus in all cases, $m_i^{ed} \equiv m_i \pmod{p}$

Why Does RSA Decryption Recover Original Plaintext? (2)

• Similarly, $m_i^{ed} \equiv m_i \pmod{q}$
• Now:
  – $m_i^{ed} - m_i \equiv 0 \pmod{p}$
  – $m_i^{ed} - m_i \equiv 0 \pmod{q}$
• Because p, q both prime and distinct: $m_i^{ed} - m_i \equiv 0 \pmod{pq}$
• So $c_i^d = m_i^{ed} \equiv m_i \pmod{n}$

Misuses of RSA Break Secrecy

• When encrypting, what if plaintext drawn from very small set (e.g., {“yes”, “no”})?
• Employees escrow secret documents, encrypted with company’s public key
  – Upon firing or death of one employee, company releases plaintext to another
  – Employee E takes employee A’s ciphertext $c = m^e \bmod n$, escrows $c \cdot 2^e \bmod n$
  – Employee E fired; co-conspirator F gets 2m!
• Chosen ciphertext attack (CCA): eavesdrop a ciphertext c; submit specially concocted messages for decryption; study resulting plaintexts; learn plaintext, $m = c^d \bmod n$
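Both misuses can be reproduced with unpadded ("textbook") RSA. The sketch uses two small known primes so that short strings fit in one block; the key size, the exponent 65537, and the helper names are assumptions for illustration only. `pow(e, -1, m)` needs Python 3.8 or later.

```python
p, q = 2**31 - 1, 2**61 - 1             # known primes, far too small for real use
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))       # company's private exponent

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:             # what the company does on escrow release
    return pow(c, d, n)

def guess_plaintext(c: int, candidates):
    """Small message set: encrypt every candidate and compare ciphertexts."""
    for text in candidates:
        if encrypt(int.from_bytes(text, "big")) == c:
            return text
    return None

def reescrow(c: int) -> int:
    """Employee E's trick: c * 2^e mod n decrypts to 2m mod n."""
    return c * pow(2, e, n) % n

m = 123456789
released = decrypt(reescrow(encrypt(m)))
print(released == 2 * m)
```

Neither attack needs the private key: the first uses only the public encryption function, and the second relies on the company decrypting a ciphertext it does not recognise as related to A's.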

RSA: Not Quite Exponentiation

• At first glance, RSA operations appear to be raising a message to a power
• But they’re not, really…the mod n means RSA in fact a trap-door permutation
  – Map one element, m, of set {0, …, n-1} to another, c
  – Not invertible without knowing d
• Non-invertibility applies to whole of m and c; not to individual bits of m and c, or other properties over m and c, e.g., parity of m
  – In escrow attack, multiplicative relationship among RSA ciphertexts exists, despite non-invertibility

• It’s possible that learning even one bit of m may help recover all of m from c

Adaptive Chosen Ciphertext Attack on RSA in SSL 3.0

• SSL 3.0 encrypted with RSA by padding plaintext into blocks using PKCS #1 standard, as follows:
  – 0x00 | 0x02 | 8 or more non-zero random bytes | 0x00 | plaintext block
• SSL decrypts received ciphertext, checks if result in this format; returns “format error” if not!
• Bleichenbacher’s adaptive CCA attack: with about one million messages to server, attacker can recover m for previously eavesdropped ciphertext $c = m^e \bmod n$
  – When chosen ciphertext accepted by server, attacker knows first two plaintext bytes with certainty!

Making RSA Secure Against Adaptive CCA Attacks

• Intuition: want plaintext input to RSA to be all-or-nothing transform of actual message
  – e.g., so that multiplicative property over ciphertexts doesn’t reveal message, and knowing one bit doesn’t reveal anything about whole message
• Desirable transform properties:
  – Randomness: unique plaintext for repeated identical messages
  – Redundancy: make most strings invalid ciphertexts
  – Entanglement: knowing partial information about input to RSA should reveal nothing about message
  – Invertibility: of course, must be able to recover original message when decrypting

Practical Padding for RSA: OAEP+ [Shoup]

• Transforms message M into RSA input M’
• Proven adaptive CCA secure only in the random-oracle model; with a real hash function, security is heuristic

Digital Signatures with RSA

• RSA trap-door permutation also useful for digital signatures
• Public-key signature operations:
  – Sign: $S(K^{-1}, m) \rightarrow \{m\}_{K^{-1}}$
  – Verify: $V(K, \{m\}_{K^{-1}}, m) \rightarrow \{\text{true}, \text{false}\}$
• Provides integrity, like a MAC:
  – Cannot produce valid $\langle m, \{m\}_{K^{-1}} \rangle$ pair without knowing $K^{-1}$
• With RSA:
  – Sign using private key, using trap-door applied when decrypting
  – Verify using public key, using permutation applied when encrypting

Multiplicative Attack Against RSA Signatures

• As in CCA, attacker may try to exploit multiplicative relationship among RSA permutation inputs and outputs, to decrypt eavesdropped ciphertexts
• Eve stores ciphertext c encrypted for Alice, wants to recover corresponding m
• Using Alice’s public key, {n, e}, Eve:
  – Chooses random number r < n
  – Computes $y = c \, r^e \bmod n$
  – Eve asks Alice to sign y
  – Alice sends Eve $y^d \bmod n = c^d r^{ed} \bmod n = r c^d \bmod n$
  – Eve computes $r^{-1} \bmod n$, then recovers $m = c^d \bmod n = r^{-1} r c^d \bmod n$

Lesson: Don’t sign whole messages presented to you by others!
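Eve's steps can be followed with toy numbers. The key (p = 61, q = 53, e = 17), the message 65, and the blinding value r = 7 are all assumed for illustration; r must be relatively prime to n for its inverse to exist. `pow(x, -1, m)` needs Python 3.8 or later.

```python
# Alice's toy key pair (assumed values, far too small for real use)
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))

def alice_sign(y: int) -> int:
    """Alice signs whatever she is handed: y^d mod n."""
    return pow(y, d, n)

m = 65                                  # secret message sent to Alice
c = pow(m, e, n)                        # ciphertext Eve eavesdropped

r = 7                                   # Eve's blinding value, gcd(r, n) = 1
y = c * pow(r, e, n) % n                # y = c * r^e mod n looks unrelated to c
s = alice_sign(y)                       # s = c^d * r^(ed) = r * m mod n
recovered = s * pow(r, -1, n) % n       # multiply by r^-1 to unblind

print(recovered)
```

Alice never sees c itself, so refusing to "decrypt" known ciphertexts does not help; the defence is to sign only a hash of the message.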

Only Sign Message Hashes with RSA!

• Again, want all-or-nothing transform over message before signing with trap door
• Full-domain hash:
  – Before signing message, compute hash of message sized to be same number of bits as RSA modulus n
  – Sign the hash, not the message
  – Hash reveals nothing about underlying message, nor messages arithmetically related to it

Costs of Cryptography

• Public-key operations significantly more computationally expensive than symmetric-key ones
• Modern CPU can symmetrically encrypt and MAC faster than 100 Mbps
• Public-key encryption typically 100X slower than symmetric crypto
  – This relationship changes as hardware changes!

• Result: tend to use public-key encryption and signatures only on short messages

Hybrid Cryptography

• Goal: mix speed of symmetric-key cryptography with flexibility of public-key cryptography

• Send symmetric key encrypted with public key; message encrypted with symmetric key

Pitfall: Public Key Provenance

• Suppose client wishes to know it’s talking to particular server
• Where does client get server’s public key?
• How does client know it has correct public key for real server, and not attacker?
• Man-in-the-middle attack:
  – Client connects to attacker
  – Attacker gives client attacker’s public key
  – Client believes communicating with real server

Further Reading

• The MIT Guide to Picking Locks
• Schneier, Bruce, Applied Cryptography, 2nd ed.
• Bleichenbacher, Daniel, Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS #1
