ITIS 3200: Introduction to Information Security and Privacy
Dr. Weichao Wang
2
Chapter 8: Basic Cryptography
• Classical Cryptography
• Public Key Cryptography
• Cryptographic Checksums
3
Overview
• Classical Cryptography (symmetric encryption)
  – Caesar cipher
  – Vigenère cipher
  – DES
• Public Key Cryptography (asymmetric encryption)
  – RSA
• Cryptographic Checksums
4
Cryptosystem
• Quintuple (E, D, M, K, C)
  – M: set of plaintexts
  – K: set of keys
  – C: set of ciphertexts
  – E: set of encryption functions e: M × K → C
  – D: set of decryption functions d: C × K → M
  – Usually the ciphertext is longer than, or the same length as, the plaintext. Why?
5
Example
• Example: Caesar cipher (a circular shift mapping)
  – M = { sequences of letters }
  – K = { i | i is an integer and 0 ≤ i ≤ 25 }
  – E = { E_k | k ∈ K and for all letters m, E_k(m) = (m + k) mod 26 }
  – D = { D_k | k ∈ K and for all letters c, D_k(c) = (26 + c – k) mod 26 }
  – The ciphertext space is the same as the plaintext space: C = M
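The definitions of E_k and D_k above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up for this example, letters are mapped A = 0 through Z = 25, and non-letters are passed through unchanged (an assumption, since the slide only defines the mapping for letters).

```python
def _shift(text: str, offset: int) -> str:
    """Shift every letter A..Z by `offset` positions, wrapping around mod 26."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + offset) % 26 + ord("A")))
        else:
            out.append(ch)  # assumption: spaces and punctuation are left alone
    return "".join(out)


def caesar_encrypt(plaintext: str, k: int) -> str:
    """E_k(m) = (m + k) mod 26."""
    return _shift(plaintext, k)


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """D_k(c) = (26 + c - k) mod 26."""
    return _shift(ciphertext, 26 - k)


ciphertext = caesar_encrypt("HELLO WORLD", 3)
recovered = caesar_decrypt(ciphertext, 3)
```

With k = 3 this turns HELLO WORLD into KHOOR ZRUOG, and decrypting with the same key gives the plaintext back.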
6
Attacks
• The adversary is an opponent whose goal is to break the cryptosystem
  – Assume the adversary knows the algorithm used, but not the key
  – The adversary is after the key, the data contents, or both
• Three types of attacks:
  – ciphertext only: adversary has only the ciphertext; goal is to find the plaintext, possibly the key
  – known plaintext: adversary has the ciphertext and the corresponding plaintext; goal is to find the key
  – chosen plaintext: adversary may supply plaintexts and obtain the corresponding ciphertexts; goal is to find the key
  – In practice, it is often not difficult for an adversary to obtain both ciphertext and plaintext
7
Basis for Attacks
• Mathematical attacks
  – Based on analysis of the underlying mathematics
  – For example, the security of RSA rests on the difficulty of factoring the product of two large prime numbers. If factoring becomes easy, RSA becomes unsafe.
• Statistical attacks
  – Make assumptions about the distribution of letters, pairs of letters (digrams), triplets of letters (trigrams), etc.
    • These are called models of the language
  – Examine the ciphertext and correlate its properties with the assumptions
8
Classical Cryptography (symmetric)
• Sender, receiver share a common key
  – Keys may be the same, or trivial to derive one from the other
  – Sometimes called symmetric cryptography
  – How to distribute keys: following chapters
• Two basic types
  – Transposition ciphers (shuffle the order)
  – Substitution ciphers (replacement)
  – Combinations are called product ciphers
9
Transposition Cipher
• Rearrange letters in plaintext to produce ciphertext
• Example (Rail-Fence Cipher)
  – Plaintext is HELLO WORLD
  – Rearrange as
      HLOOL
      ELWRD
  – Ciphertext is HLOOL ELWRD
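The two-row rail-fence example can be reproduced with a short Python sketch. The function names are hypothetical, and the code assumes exactly two rails and drops spaces before transposing, as the slide's example does.

```python
def rail_fence_encrypt(plaintext: str) -> str:
    """Write letters alternately on two rails, then read the rails in order."""
    letters = [ch for ch in plaintext.upper() if "A" <= ch <= "Z"]
    top = letters[0::2]      # 1st, 3rd, 5th, ... letters
    bottom = letters[1::2]   # 2nd, 4th, 6th, ... letters
    return "".join(top) + " " + "".join(bottom)


def rail_fence_decrypt(ciphertext: str) -> str:
    """Split the ciphertext back into two rails and interleave them."""
    letters = ciphertext.replace(" ", "")
    half = (len(letters) + 1) // 2   # the top rail gets the extra letter, if any
    top, bottom = letters[:half], letters[half:]
    out = []
    for i in range(half):
        out.append(top[i])
        if i < len(bottom):
            out.append(bottom[i])
    return "".join(out)


cipher = rail_fence_encrypt("HELLO WORLD")
plain = rail_fence_decrypt(cipher)
```

Note that the letters themselves never change, only their order, which is why single-letter frequencies survive a transposition cipher.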
10
Attacking the Cipher
• Anagramming
  – The frequency of the characters will not change
  – The key is a permutation function
  – If 1-gram frequencies match English frequencies, but other n-gram frequencies do not, the cipher is probably a transposition
  – Rearrange letters to form n-grams with the highest frequencies
11
Example
• Ciphertext: HLOOLELWRD
• Frequencies of 2-grams beginning with H
  – HE 0.0305
  – HO 0.0043
  – HL, HW, HR, HD < 0.0010
• Frequencies of 2-grams ending in H
  – WH 0.0026
  – EH, LH, OH, RH, DH ≤ 0.0002
• Implies E follows H
12
Example
• Arrange so the H and E are adjacent
      HE
      LL
      OW
      OR
      LD
• Read off across, then down, to get the original plaintext
• May need to try different combinations or n-grams
13
• Under many conditions, a transposition cipher alone is not enough:
  – Example: is it “cat” or “act”?
  – Example: “I will come, she will not.”
14
Substitution Ciphers
• Change characters in plaintext to produce ciphertext
  – The replacement rule can be fixed or can keep changing
• Example (Caesar cipher)
  – Plaintext is HELLO WORLD
  – Change each letter to the third letter after it in the alphabet (A goes to D, ..., X goes to A, Y to B, Z to C)
    • Key is 3, usually written as letter ‘D’
  – Ciphertext is KHOOR ZRUOG
15
Attacking the Cipher
• Exhaustive search
  – If the key space is small enough, try all possible keys until you find the right one
  – Caesar cipher has 26 possible keys
  – On average, you only need to try half of them
• Statistical analysis
  – The ciphertext has a character frequency
  – Compare it (and try to align it) to the 1-gram model of English
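Exhaustive search over the Caesar key space is small enough to write out directly. The sketch below (hypothetical helper names) simply lists the candidate plaintext for each of the 26 keys; deciding which candidate is English is left to the reader or to the statistical analysis.

```python
def shift_back(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k positions; non-letters are left unchanged."""
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - k) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def all_candidates(ciphertext: str) -> dict:
    """Map every possible key 0..25 to the plaintext it would produce."""
    return {k: shift_back(ciphertext, k) for k in range(26)}


candidates = all_candidates("KHOOR ZRUOG")
for key, text in candidates.items():
    print(key, text)
```

Only the line for key 3 reads as English (HELLO WORLD), so a human scanning the 26 lines finds the key immediately.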
16
Statistical Attack
• Compute the frequency of each letter in the ciphertext:
      G 0.1   H 0.1   K 0.1   O 0.3
      R 0.2   U 0.1   Z 0.1
• Apply the 1-gram model of English
  – The frequency of characters (1-grams) in English is on the next slide
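The letter frequencies of the ciphertext KHOOR ZRUOG can be computed mechanically. A minimal sketch, with a made-up function name and spaces ignored:

```python
from collections import Counter


def letter_frequencies(text: str) -> dict:
    """Return the relative frequency of each letter that occurs in `text`."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return {ch: counts[ch] / total for ch in sorted(counts)}


freqs = letter_frequencies("KHOOR ZRUOG")
for letter, f in freqs.items():
    print(letter, f)
```

The ten letters give O three times and R twice, which is where the 0.3 and 0.2 entries come from.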
17
Character Frequencies
a 0.080 h 0.060 n 0.070 t 0.090
b 0.015 i 0.065 o 0.080 u 0.030
c 0.030 j 0.005 p 0.020 v 0.010
d 0.040 k 0.005 q 0.002 w 0.015
e 0.130 l 0.035 r 0.065 x 0.005
f 0.020 m 0.030 s 0.060 y 0.020
g 0.015 z 0.002
18
[Figure: two bar charts over letter positions 1 to 25, titled “Frequency chart of English” and “Frequency chart of ciphertext”]
19
Statistical Analysis
• Try to align the two charts
• f(c): frequency of character c in the ciphertext
• φ(i): correlation of the frequency of letters in the ciphertext with the corresponding letters in English, assuming the key is i
  – $\varphi(i) = \sum_{0 \le c \le 25} f(c)\, p(c - i)$, with $c - i$ taken mod 26
  – For this ciphertext: $\varphi(i) = 0.1\,p(6 - i) + 0.1\,p(7 - i) + 0.1\,p(10 - i) + 0.3\,p(14 - i) + 0.2\,p(17 - i) + 0.1\,p(20 - i) + 0.1\,p(25 - i)$
• p(x) is the frequency of character x in English
  – This correlation value should be a maximum when the two charts are aligned
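The correlation can be evaluated for all 26 candidate keys with a few lines of Python. This sketch uses the English frequencies from the “Character Frequencies” slide; the names `ENGLISH`, `correlation`, and `ranked` are made up for the example.

```python
from collections import Counter

# p(x): English 1-gram frequencies for a..z, copied from the Character Frequencies slide
ENGLISH = [0.080, 0.015, 0.030, 0.040, 0.130, 0.020, 0.015, 0.060, 0.065,
           0.005, 0.005, 0.035, 0.030, 0.070, 0.080, 0.020, 0.002, 0.065,
           0.060, 0.090, 0.030, 0.010, 0.015, 0.005, 0.020, 0.002]

CIPHERTEXT = "KHOOR ZRUOG"


def correlation(ciphertext: str, i: int) -> float:
    """phi(i) = sum over c of f(c) * p(c - i), with c - i taken mod 26."""
    letters = [ord(ch) - ord("A") for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    n = len(letters)
    return sum((count / n) * ENGLISH[(c - i) % 26] for c, count in counts.items())


# Candidate keys ordered from most to least likely under this model
ranked = sorted(range(26), key=lambda i: correlation(CIPHERTEXT, i), reverse=True)
for i in ranked[:4]:
    print(i, round(correlation(CIPHERTEXT, i), 4))
```

The highest value does not belong to the true key here (3 ranks third), so the ranking is a guide for which keys to try first rather than a decision procedure.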
20
Correlation: φ(i) for 0 ≤ i ≤ 25
   i   φ(i)      i   φ(i)      i   φ(i)      i   φ(i)
   0  0.0482     7  0.0442    13  0.0520    19  0.0315
   1  0.0364     8  0.0202    14  0.0535    20  0.0302
   2  0.0410     9  0.0267    15  0.0226    21  0.0517
   3  0.0575    10  0.0635    16  0.0322    22  0.0380
   4  0.0252    11  0.0262    17  0.0392    23  0.0370
   5  0.0190    12  0.0325    18  0.0299    24  0.0316
   6  0.0660                                25  0.0430
21
The Result
• Most probable keys, based on φ:
  – i = 6, φ(i) = 0.0660
    • plaintext EBIIL TLOIA
  – i = 10, φ(i) = 0.0635
    • plaintext AXEEH PHKEW
  – i = 3, φ(i) = 0.0575
    • plaintext HELLO WORLD
  – i = 14, φ(i) = 0.0535
    • plaintext WTAAD LDGAS
• Only English phrase is for i = 3
  – That’s the key (3 or ‘D’)
• The algorithm becomes more complicated if the encryption is not a circular shift
• The correlation only reduces the search space by guiding the guesses
22
Caesar’s Problem
• Key is too short
  – Can be found by exhaustive search
  – Statistical frequencies not concealed well
    • They look too much like regular English letters
  – The most frequently seen character could be “E”
• So make it longer
  – Multiple letters in key
  – Idea is to smooth the statistical frequencies to make cryptanalysis harder
    • What if every character in the ciphertext has the same frequency?
    • All correlations would have the same value
23
• What we can get from this:
  – If the key contains only 1 character, we have 26 choices
  – If the key contains m characters, we need to try 26^m combinations. For m = 5, that is about 12 million choices.
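The growth of the key space with key length is easy to check. A small sketch (the function name is made up):

```python
def key_space(m: int) -> int:
    """Number of possible keys made of m letters, each chosen from 26."""
    return 26 ** m


for m in range(1, 6):
    print(m, key_space(m))
```

For m = 5 the exact count is 11,881,376, which is the "12 million" quoted above.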
24
Vigenère Cipher
• Like Caesar cipher, but use a longer key
• Example
  – Message THE BOY HAS THE BALL
  – Key VIG (right shift 21, 8, 6 times, then start again)
  – Encipher using Caesar cipher for each letter:
      key     VIG VIG VIG VIG VIGV
      plain   THE BOY HAS THE BALL
      cipher  OPK WWE CIY OPK WIRG
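The example above can be reproduced with a short Python sketch. The function name is hypothetical, and the code assumes the key advances only on letters, so spaces in the message do not consume key characters (which matches the slide's alignment).

```python
def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Apply a Caesar shift to each letter, cycling through the key letters."""
    shifts = [ord(k) - ord("A") for k in key.upper()]   # V, I, G -> 21, 8, 6
    out = []
    j = 0  # index of the next key letter to use
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            shift = shifts[j % len(shifts)]
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            j += 1
        else:
            out.append(ch)  # keep spaces so the grouping stays readable
    return "".join(out)


cipher = vigenere_encrypt("THE BOY HAS THE BALL", "VIG")
```

Both occurrences of THE land on the same key letters, so both become OPK; that repetition is what the period-finding attack exploits.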
25
• Now we have:
  – Three groups of characters encrypted by V, I, and G respectively
  – Each group is a Caesar cipher (fixed shift offset)
  – Approach:
    • Figure out the length of the key
    • Decipher each group as a Caesar cipher
  – Hint: when the same plaintext is encrypted by the same segment of the key, we get the same ciphertext
    • Can be used to derive the period
26
Attacking the Cipher
• Approach
  – Establish period; call it n
  – Break message into n parts, each part being enciphered using the same key letter
    • So each part is like a Caesar cipher
  – Solve each part
    • You can leverage one part from another because English has its own regularities
27
Establish Period
• Kasiski: repetitions in the ciphertext occur when characters of the key appear over the same characters in the plaintext
• Example:
      key     VIG VIG VIG VIG VIGV
      plain   THE BOY HAS THE BALL
      cipher  OPK WWE CIY OPK WIRG
  Note the key and plaintext line up over the repetitions (the two occurrences of OPK). As the distance between the repetitions is 9, the period is a factor of 9 (that is, 1, 3, or 9)
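The Kasiski step of finding repetitions and their distances can be sketched as follows. The function names and the choice of 3-letter repetitions are assumptions made for illustration.

```python
from collections import defaultdict


def repeated_ngrams(ciphertext: str, n: int = 3) -> dict:
    """Map each n-gram that occurs more than once to the gaps between its occurrences."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - n + 1):
        positions[letters[i:i + n]].append(i)
    return {gram: [b - a for a, b in zip(pos, pos[1:])]
            for gram, pos in positions.items() if len(pos) > 1}


def divisors(distance: int) -> list:
    """Candidate periods: every divisor of the distance between repetitions."""
    return [d for d in range(1, distance + 1) if distance % d == 0]


repeats = repeated_ngrams("OPK WWE CIY OPK WIRG")
candidate_periods = divisors(repeats["OPK"][0])
```

On this ciphertext the repetition OPK is found at distance 9, giving candidate periods 1, 3, and 9. With longer ciphertexts, several repetitions are usually combined by looking for a common factor of their distances.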
28
• We guess the period is 3, so we divide the ciphertext into three groups
  – Each group is a Caesar cipher
  – Use the previous approaches to solve them
29
One-Time Pad
• A Vigenère cipher with a random key at least as long as the message
  – Provably unbreakable
  – Why? Every character in the key is random: it is equally likely to be “a”, “b”, ..., “z”
  – Look at ciphertext DXQR. It is equally likely to correspond to plaintext DOIT (key AJIY), to plaintext DONT (key AJDY), or to any other 4-letter word
  – Warning: keys must be random, or you can attack the cipher by trying to regenerate the key
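The DXQR example can be checked directly: for any guessed plaintext of the right length there is a key that explains the ciphertext. The sketch below uses made-up helper names and draws keys with Python's `secrets` module as one reasonable source of randomness.

```python
import secrets
import string

A = ord("A")


def combine(text: str, key: str, sign: int) -> str:
    """Add (sign=+1) or subtract (sign=-1) the key from the text, letter by letter, mod 26."""
    return "".join(chr((ord(t) - A + sign * (ord(k) - A)) % 26 + A)
                   for t, k in zip(text, key))


def random_key(length: int) -> str:
    """Draw each key letter independently and uniformly from A..Z."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def key_explaining(ciphertext: str, guess: str) -> str:
    """The unique key under which `guess` would encrypt to `ciphertext`."""
    return combine(ciphertext, guess, -1)


key_for_doit = key_explaining("DXQR", "DOIT")
key_for_dont = key_explaining("DXQR", "DONT")

pad = random_key(4)
ciphertext = combine("DOIT", pad, +1)
recovered = combine(ciphertext, pad, -1)
```

Because every guess has exactly one matching key and all keys are equally likely, the ciphertext gives no reason to prefer one guess over another.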
30
Overview of the DES
• A block cipher:
  – encrypts blocks of 64 bits using a 64-bit key (56 key bits plus 8 parity bits)
  – outputs 64 bits of ciphertext
• A product cipher
  – basic unit is the bit
  – performs both substitution and transposition (permutation) on the bits
• Cipher consists of 16 rounds (iterations), each with a 48-bit round key generated from the 64-bit key
31
Generation of Round Keys
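[Figure: round key generation. The key passes through PC-1 and is split into halves C0 and D0. In each round both halves are left-shifted (LSH) to give Ci and Di, and PC-2 selects the round key Ki from them, producing K1 through K16.]
• Round keys are 48 bits each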
32
Encipherment
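[Figure: DES encipherment. The input passes through the initial permutation IP and is split into halves L0 and R0. Each of the 16 rounds swaps the halves and mixes in the round key:
      $L_1 = R_0$,  $R_1 = L_0 \oplus f(R_0, K_1)$
      ...
      $L_{16} = R_{15}$,  $R_{16} = L_{15} \oplus f(R_{15}, K_{16})$
The result passes through the inverse permutation $IP^{-1}$ to give the output.]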
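The round structure L_i = R_(i−1), R_i = L_(i−1) ⊕ f(R_(i−1), K_i) can be illustrated with a toy Feistel network. This is not DES: the round function below is a hash-based stand-in chosen only for illustration, the round keys are arbitrary placeholder values, and the IP and IP⁻¹ permutations are omitted.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(right: bytes, round_key: bytes) -> bytes:
    """Hypothetical stand-in for DES's f: hash the half-block together with the round key."""
    return hashlib.sha256(round_key + right).digest()[:len(right)]


def feistel(block: bytes, round_keys: list) -> bytes:
    """Run the Feistel rounds, then output the halves swapped (R_n followed by L_n)."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        # L_i = R_(i-1),  R_i = L_(i-1) XOR f(R_(i-1), K_i)
        left, right = right, xor(left, round_function(right, key))
    return right + left


# 16 placeholder round keys of 48 bits each (not derived by the DES key schedule)
ROUND_KEYS = [bytes([i]) * 6 for i in range(1, 17)]

plaintext = b"8 bytes!"                                # one 64-bit block
ciphertext = feistel(plaintext, ROUND_KEYS)
decrypted = feistel(ciphertext, ROUND_KEYS[::-1])      # same routine, keys reversed
```

The point of the structure is that decryption reuses the encryption routine with the round keys in reverse order, and f itself never has to be invertible.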
33
The f Function
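[Figure: the f function. $R_{i-1}$ (32 bits) is expanded by E to 48 bits and XORed with $K_i$ (48 bits). The result is split into eight groups of 6 bits, one into each of the S-boxes S1 through S8, and 4 bits come out of each. The resulting 32 bits pass through the permutation P to give the 32-bit output.]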
34
• S-Box
  – There are eight S-boxes; each maps a 6-bit input to a 4-bit output
  – Each S-box is a look-up table
  – This is the only non-linear step in DES and contributes the most to its security
• P-Box
  – A permutation
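A look-up in one S-box can be sketched as follows. The table is S1 as given in the published DES specification, reproduced here for illustration; it should be checked against FIPS 46-3 before being relied on. The function name is made up. In DES, the first and last of the six input bits select the row and the middle four bits select the column.

```python
# DES S-box S1: 4 rows of 16 entries, each entry a 4-bit value
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(table: list, six_bits: int) -> int:
    """Map a 6-bit input to a 4-bit output using the given S-box table."""
    row = ((six_bits >> 5) << 1) | (six_bits & 1)   # outer two bits
    col = (six_bits >> 1) & 0b1111                  # middle four bits
    return table[row][col]


output = sbox_lookup(S1, 0b011011)   # row 0b01 = 1, column 0b1101 = 13
```

Each row of the table is a permutation of 0 to 15, so every 4-bit output is produced by exactly four of the 64 possible inputs.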
35
Controversy
• Considered too weak
  – Diffie, Hellman said “in a few years technology would allow DES to be broken in days”
    • DES Challenge organized by RSA
    • In 1997, solved in 96 days; 41 days in early 1998; 56 hours in late 1998; 22 hours in Jan 1999
    • http://w2.eff.org/Privacy/Crypto/Crypto_misc/DESCracker/HTML/19990119_deschallenge3.html
  – Design decisions not public
    • S-boxes may have backdoors
36
Undesirable Properties
• 4 weak keys
  – They are their own inverses
• 12 semi-weak keys
  – Each has another semi-weak key as inverse
• Complementation property
  – $\mathrm{DES}_k(m) = c \Rightarrow \mathrm{DES}_{\bar{k}}(\bar{m}) = \bar{c}$, where the bar denotes the bitwise complement
• S-boxes exhibit irregular properties
  – Distribution of odd, even numbers non-random
  – Outputs of fourth box depend on input to third box
37
• Number of rounds
  – After 5 rounds, every ciphertext bit depends on every plaintext bit and every key bit
  – After 8 rounds, the ciphertext is already a random function
  – When the number of rounds is 16 or more, brute force is the most efficient known-plaintext attack
  – This suggests the NSA knew a great deal when it fixed the DES design
38
Differential Cryptanalysis
• A chosen plaintext attack
  – Requires $2^{47}$ (plaintext, ciphertext) pairs
• Revealed several properties
  – Small changes in S-boxes reduce the number of (plaintext, ciphertext) pairs needed
  – Making every bit of the round keys independent does not impede attack
• Linear cryptanalysis improves result
  – Requires $2^{43}$ (plaintext, ciphertext) pairs
39
DES Modes
• Electronic Code Book Mode (ECB)
  – Encipher each block independently
• Cipher Block Chaining Mode (CBC)
  – Xor each plaintext block with the previous ciphertext block
  – Requires an initialization vector for the first one
  – The initialization vector can be made public
• Encrypt-Decrypt-Encrypt Mode (2 keys: k, k′)
• Encrypt-Encrypt-Encrypt Mode (3 keys: k, k′, k′′)
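The difference between ECB and CBC shows up as soon as a message contains repeated blocks. The sketch below uses a toy byte-wise cipher as a stand-in for DES (it is not secure and is not DES); all names and the key value are made up for the example.

```python
BLOCK = 8  # bytes per block, matching DES's 64-bit block size


def toy_encrypt(block: bytes, key: int) -> bytes:
    """Toy stand-in for DES_k: an invertible byte-wise affine map. NOT secure."""
    return bytes((167 * b + key) % 256 for b in block)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(message: bytes, key: int) -> list:
    """ECB: encipher each block independently."""
    return [toy_encrypt(m, key) for m in split_blocks(message)]


def cbc_encrypt(message: bytes, key: int, iv: bytes) -> list:
    """CBC: XOR each plaintext block with the previous ciphertext block first."""
    previous = iv
    out = []
    for m in split_blocks(message):
        previous = toy_encrypt(xor(m, previous), key)
        out.append(previous)
    return out


message = b"AAAAAAAA" * 2                 # two identical plaintext blocks
ecb = ecb_encrypt(message, key=13)
cbc = cbc_encrypt(message, key=13, iv=bytes(BLOCK))
```

Under ECB the two identical plaintext blocks produce identical ciphertext blocks, which leaks the repetition; under CBC the chaining makes them differ.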
40
CBC Mode Encryption
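[Figure: CBC mode encryption. m1 is XORed with the initialization vector and enciphered with DES to give c1; m2 is XORed with c1 and enciphered with DES to give c2; and so on. Each ciphertext block is sent.]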
41
CBC Mode Decryption
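[Figure: CBC mode decryption. c1 is deciphered with DES and XORed with the initialization vector to give m1; c2 is deciphered with DES and XORed with c1 to give m2; and so on.]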
42
Self-Healing Property
• What will happen if a bit gets lost during transmission?
  – All subsequent blocks will be misaligned
• When one bit in a ciphertext block is flipped, only two plaintext blocks are affected: the corresponding block and the one after it
  – Plaintext “heals” after 2 blocks
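The self-healing behaviour can be observed by flipping one ciphertext bit and decrypting. As before, this is a sketch using a toy invertible byte-wise cipher in place of DES (not secure); all names and values are made up. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
BLOCK = 8                       # bytes per block
INV = pow(167, -1, 256)         # multiplicative inverse of 167 modulo 256


def toy_encrypt(block: bytes, key: int) -> bytes:
    """Toy stand-in for DES_k. NOT secure."""
    return bytes((167 * b + key) % 256 for b in block)


def toy_decrypt(block: bytes, key: int) -> bytes:
    """Inverse of toy_encrypt."""
    return bytes((INV * (b - key)) % 256 for b in block)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(blocks: list, key: int, iv: bytes) -> list:
    previous, out = iv, []
    for m in blocks:
        previous = toy_encrypt(xor(m, previous), key)
        out.append(previous)
    return out


def cbc_decrypt(blocks: list, key: int, iv: bytes) -> list:
    previous, out = iv, []
    for c in blocks:
        out.append(xor(toy_decrypt(c, key), previous))
        previous = c
    return out


plain = [b"BLOCK-00", b"BLOCK-01", b"BLOCK-02", b"BLOCK-03"]
iv = bytes(BLOCK)
cipher = cbc_encrypt(plain, key=13, iv=iv)

# Flip one bit in the second ciphertext block, as a transmission error would
damaged = list(cipher)
damaged[1] = bytes([damaged[1][0] ^ 1]) + damaged[1][1:]

recovered = cbc_decrypt(damaged, key=13, iv=iv)
affected = [i for i in range(len(plain)) if recovered[i] != plain[i]]
```

Only the block that carried the flipped bit and the block after it decrypt incorrectly; the following blocks are recovered intact because each plaintext block depends on just two ciphertext blocks.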
43
Current Status of DES
• A design for a computer system and associated software that could break any DES-enciphered message in a few days was published in 1998
• Several challenges to break DES messages have been solved using distributed computing
• NIST selected Rijndael as the Advanced Encryption Standard (AES), successor to DES
  – Designed to withstand attacks that were successful on DES