SPYING ON YOUR NEIGHBOR: CPU CACHE ATTACKS AND BEYOND · spying on your neighbor: cpu cache attacks...

Post on 23-Jul-2020

24 views 0 download

Transcript of SPYING ON YOUR NEIGHBOR: CPU CACHE ATTACKS AND BEYOND · spying on your neighbor: cpu cache attacks...

S P Y I N G O N Y O U R N E I G H B O R : C P U C A C H E AT TA C K S A N D B E Y O N D

B E N G R A S / @ B J G , K AV E H R A Z AV I , C R I S T I A N O G I U F F R I D A , H E R B E R T B O S V R I J E U N I V E R S I T E I T A M S T E R D A M B L A C K H AT U S A 2 0 1 8

A B O U T M E

A B O U T M E

• PhD student in VUsec VU University Research group

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

• We do software hardening, exploitation

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

• We do software hardening, exploitation

• Hardware attacks, side channels

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

• We do software hardening, exploitation

• Hardware attacks, side channels

• Academic recognition but also hacker scene (Pwnies!)

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

• We do software hardening, exploitation

• Hardware attacks, side channels

• Academic recognition but also hacker scene (Pwnies!)

A B O U T M E

• PhD student in VUsec VU University Research group

• Academic group researching systems software security

• We do software hardening, exploitation

• Hardware attacks, side channels

• Academic recognition but also hacker scene (Pwnies!)

O V E R V I E W

• Side channels

• Cache attacks

• Cache defences

• Hyperthreading

• TLBleed

• Evaluation

S I D E C H A N N E L S

S I D E C H A N N E L S

S I D E C H A N N E L S

• Leak secrets outside the regular interface

S I D E C H A N N E L S

• Leak secrets outside the regular interface

R I C H H I S T O R Y - S M A R T C A R D S

• Power Consumption(FPGA Security by Shemal Shroff et al.)

• EM radiation: leak ECC bits(FPGA Security by Shemal Shroff et al.)

• Execution time: leak ECC, RSA bits(Timing Attacks on ECC by Shemal Shroff et al.)

• Acoustic cryptanalysis(RSA Key Extraction [..] by Adi Shamir et al.)

R I C H H I S T O R Y - S M A R T C A R D S

• Power Consumption(FPGA Security by Shemal Shroff et al.)

• EM radiation: leak ECC bits(FPGA Security by Shemal Shroff et al.)

• Execution time: leak ECC, RSA bits(Timing Attacks on ECC by Shemal Shroff et al.)

• Acoustic cryptanalysis(RSA Key Extraction [..] by Adi Shamir et al.)

R I C H H I S T O R Y - S M A R T C A R D S

• Power Consumption(FPGA Security by Shemal Shroff et al.)

• EM radiation: leak ECC bits(FPGA Security by Shemal Shroff et al.)

• Execution time: leak ECC, RSA bits(Timing Attacks on ECC by Shemal Shroff et al.)

• Acoustic cryptanalysis(RSA Key Extraction [..] by Adi Shamir et al.)

R I C H H I S T O R Y - S M A R T C A R D S

• Power Consumption(FPGA Security by Shemal Shroff et al.)

• EM radiation: leak ECC bits(FPGA Security by Shemal Shroff et al.)

• Execution time: leak ECC, RSA bits(Timing Attacks on ECC by Shemal Shroff et al.)

• Acoustic cryptanalysis(RSA Key Extraction [..] by Adi Shamir et al.)

R I C H H I S T O R Y - S M A R T C A R D S

• Power Consumption(FPGA Security by Shemal Shroff et al.)

• EM radiation: leak ECC bits(FPGA Security by Shemal Shroff et al.)

• Execution time: leak ECC, RSA bits(Timing Attacks on ECC by Shemal Shroff et al.)

• Acoustic cryptanalysis(RSA Key Extraction [..] by Adi Shamir et al.)

C A C H E AT TA C K S

C A C H E : S O F T W A R E E Q U I VA L E N T

• Computing processes ought to be compartmented

• Different owners or privilege levels: trust boundaries

C A C H E : S O F T W A R E E Q U I VA L E N T

• Computing processes ought to be compartmented

• Different owners or privilege levels: trust boundaries

C A C H E : S O F T W A R E E Q U I VA L E N T

• Computing processes ought to be compartmented

• Different owners or privilege levels: trust boundaries

C A C H E : S O F T W A R E E Q U I VA L E N T

• Computing processes ought to be compartmented

• Different owners or privilege levels: trust boundaries

C A C H E : S O F T W A R E E Q U I VA L E N T

• Computing processes ought to be compartmented

• Different owners or privilege levels: trust boundaries

C A C H E : S O F T W A R E E Q U I VA L E N T

• There are shared resources between processes

• RAM, CPU cache, TLB, computational resources ..

• Practically always: allows signaling: covert channel

• Sometimes: allows side channel (spying)

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

• Shared RAM row(DRAMA paper, by Peter Peßl et al.)

• Shared cache set (FLUSH+RELOAD by Yuval Yarom et al.shown, many others exist)

• Cache prefetch (Prefetch Side-Channel Attacks,by Daniel Gruss et al.)

C R O S S - P R O C E S S S H A R E D S TAT E

C R O S S - P R O C E S S / V M S H A R E D S TAT E

• This is only possible because of shared resources

C R O S S - P R O C E S S / V M S H A R E D S TAT E

• This is only possible because of shared resources

E X A M P L E : F L U S H + R E L O A D

• One of several cache attacks

• Relies on shared memory

• Can be shared object (mmap()ed shared libraries)

• Or shared pages after deduplication (KSM in Linux)

E X A M P L E : F L U S H + R E L O A D

• Work by Yuval Yarom, Katrina Falkner

• Memory access patterns can betray secrets

• Because access patterns frequently depend on secrets

• Example: RSA keys. (n,e,d) Private: d. n=pq and d are 1024 bits or more

E X A M P L E : F L U S H + R E L O A D

• Signing is: computing md (mod n)

• Often square-and-multiply depending on bits in d

• Shared cache activity betrays memory access patterns

• Quickly probing the cache can betray the bits in d

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

E X A M P L E : F L U S H + R E L O A D

• Can also attack AES implementation with T tables

• A table lookup happens Tj [xi = pi ⊕ ki ]

• pi is a plaintext byte, ki a key byte where pi is a plaintext byte, ki a key byte,

E X A M P L E : F L U S H + R E L O A D

• Again: secrets are betrayed by memory accesses

• Known plaintext + accesses = key recovery where pi is a plaintext byte, ki a key byte,

E X A M P L E : F L U S H + R E L O A D

• Again: secrets are betrayed by memory accesses

• Known plaintext + accesses = key recovery where pi is a plaintext byte, ki a key byte,

C A C H E D E F E N C E S

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

6 7 8 9 10

16 17 18 19 20

26 27 28 29 30

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

6 7 8 9 10

16 17 18 19 20

26 27 28 29 30

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

6 7 8 9 10

16 17 18 19 20

26 27 28 29 30

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

6 7 8 9 10

16 17 18 19 20

26 27 28 29 30

C A C H E C O L O U R I N G

• Figure out page colors

• These map to shared cache sets

• Do not share same colors across security boundaries

• Kernel arranges this

1 2 3 4 5

11 12 13 14 15

21 22 23 24 25

6 7 8 9 10

16 17 18 19 20

26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : C AT

• Intel CAT: Cache Allocation Technology

• Intended for predictable performance for VMs

• Partitions caches in ways

• Hardware feature

1 2 3 4 5

11 12 13 14 15

6 7 8 9 10

16 17 18 19 20

21 22 23 24 25 26 27 28 29 30

C A C H E PA R T I T I O N I N G : T S X

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

• But relies on unshared cache activity

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

• But relies on unshared cache activity

• Transactions fit in cache, otherwise auto-abort

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

• But relies on unshared cache activity

• Transactions fit in cache, otherwise auto-abort

• We can use this as a defence

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

• But relies on unshared cache activity

• Transactions fit in cache, otherwise auto-abort

• We can use this as a defence

C A C H E PA R T I T I O N I N G : T S X• Intel TSX: Transactional Synchronization Extensions

• Intended for hardware transactional memory

• But relies on unshared cache activity

• Transactions fit in cache, otherwise auto-abort

• We can use this as a defence

H Y P E R T H R E A D I N G

S U P E R S C A L A R C P U U T I L I S AT I O N

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

S U P E R S C A L A R C P U U T I L I S AT I O N

• Share functional units (FU): increase utilisation

H Y P E R T H R E A D I N G

H Y P E R T H R E A D I N G

• Low investment, high utilisation yield

H Y P E R T H R E A D I N G

H Y P E R T H R E A D I N G

• Significant resource sharing: good and bad

T L B L E E D

T L B L E E D : T L B

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E

• Other structures than cache shared between threads?

• What about the TLB?

• Documented: TLB has L1iTLB, L1dTLB, and L2TLB

• They have sets and ways

• Not documented: structure

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Try linear structure first

• All combinations of ways (set size) and sets (stride)

• Smallest number of ways is it

• Smallest corresponding stride is number of sets

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Try linear structure first

• All combinations of ways (set size) and sets (stride)

• Smallest number of ways is it

• Smallest corresponding stride is number of sets

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Try linear structure first

• All combinations of ways (set size) and sets (stride)

• Smallest number of ways is it

• Smallest corresponding stride is number of sets

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• For L2TLB:

We reverse engineered a more complex hash function

T L B L E E D : T L B A S S H A R E D S TAT E• For L2TLB:

We reverse engineered a more complex hash function

• Skylake XORs 14 bits, Broadwell XORs 16 bits

T L B L E E D : T L B A S S H A R E D S TAT E• For L2TLB:

We reverse engineered a more complex hash function

• Skylake XORs 14 bits, Broadwell XORs 16 bits

• Represented by this matrix, using modulo 2 arithmetic

T L B L E E D : T L B A S S H A R E D S TAT E• For L2TLB:

We reverse engineered a more complex hash function

• Skylake XORs 14 bits, Broadwell XORs 16 bits

• Represented by this matrix, using modulo 2 arithmetic

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Now we know the structure.. Are TLB’s shared between hyperthreads?

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Now we know the structure.. Are TLB’s shared between hyperthreads?

• Let’s experiment with misses when accessing the same set

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Now we know the structure.. Are TLB’s shared between hyperthreads?

• Let’s experiment with misses when accessing the same set

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Now we know the structure.. Are TLB’s shared between hyperthreads?

• Let’s experiment with misses when accessing the same set

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s experiment with performance counters

• Now we know the structure.. Are TLB’s shared between hyperthreads?

• Let’s experiment with misses when accessing the same set

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• We find more TLB properties

• Size, structure, sharing, miss penalty, hash function

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E

• Can we use only latency?

• Map many virtual addresses to same physical page

T L B L E E D : T L B A S S H A R E D S TAT E

• Can we use only latency?

• Map many virtual addresses to same physical page

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s observe EdDSA ECC key multiplication

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s observe EdDSA ECC key multiplication

• Scalar is secret and ADD only happens if there’s a 1

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s observe EdDSA ECC key multiplication

• Scalar is secret and ADD only happens if there’s a 1

• Like RSA square-and-multiply

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s observe EdDSA ECC key multiplication

• Scalar is secret and ADD only happens if there’s a 1

• Like RSA square-and-multiply

• But: we can not use code information! Only data..!

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E

• Typical cache attack relies on spatial separation

• I.E. different cache lines

T L B L E E D : T L B A S S H A R E D S TAT E

• Typical cache attack relies on spatial separation

• I.E. different cache lines

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s find the spatial L1 DTLB separation

• There isn’t any

• Too much activity in both blue/green cases

T L B L E E D : T L B A S S H A R E D S TAT E• Let’s find the spatial L1 DTLB separation

• There isn’t any

• Too much activity in both blue/green cases

T L B L E E D : T L B A S S H A R E D S TAT E

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

T L B L E E D : T L B A S S H A R E D S TAT E• Monitor a single TLB set and use temporal information

• Use machine learning (SVM classifier) to tell the difference

E VA L U AT I O N

T L B L E E D R E L I A B I L I T Y

T L B L E E D R E L I A B I L I T Y

T L B L E E D R E L I A B I L I T Y

T L B L E E D R E L I A B I L I T Y

• Work also by Kaveh Razavi, Cristiano Giuffrida, Herbert Bos

• Some diagrams in these slides were taken from other work: FLUSH+RELOAD, Prefetch, DRAMA

• Yuval Yarom, Katrina Falkner, Peter Peßl, Daniel Gruss

C R E D I T

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

• They bypass defences

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

• They bypass defences

• @bjg @gober

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

• They bypass defences

• @bjg @gober

• @vu5ec

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

• They bypass defences

• @bjg @gober

• @vu5ec

• www.vusec.net

C O N C L U S I O N

• Practical, reliable, high resolution side channels exist outside the cache

• They bypass defences

• @bjg @gober

• @vu5ec

• www.vusec.net

• Thank you for listening

C O N C L U S I O N