CONTEXT-SENSITIVE FENCING
Transcript of CONTEXT-SENSITIVE FENCING
CONTEXT-SENSITIVE FENCING: SECURING SPECULATIVE EXECUTION VIA MICROCODE CUSTOMIZATION Mohammadkazem Taram, Ashish Venkat, Dean Tullsen University of California San Diego, University of Virginia
PERFORMANCE V.S. SECURITY
�2
Security Performance
PERFORMANCE V.S. SECURITY
�2
Security Performance
PERFORMANCE V.S. SECURITY
�2
Security PerformanceFORESHADOW
SPECTRE ATTACKS!
�3
Side-Channels
AttackerSpeculative Execution
Victim
➤ Leak secrets via side-channels + speculative execution
➤ Any modern processor with a Branch Predictor is vulnerable
y = array2[array1[x] * 64];
SPECTRE V1 — BOUNDS CHECK BYPASS
�4
int Kernel_api_( int x ){
}
if ( x < array1_size) //bounds check y = array2[array1[x] * 64];
SPECTRE V1 — BOUNDS CHECK BYPASS
�4
int Kernel_api_( int x ){
}
if ( x < array1_size) //bounds check y = array2[array1[x] * 64];
SPECTRE V1 — BOUNDS CHECK BYPASS
�4
int Kernel_api_( int x ){
}
Mispredicted//not taken/fallthrough code
if ( x < array1_size) //bounds check y = array2[array1[x] * 64];
SPECTRE V1 — BOUNDS CHECK BYPASS
�4
int Kernel_api_( int x ){
}
Too late to recover — data is exposed via side-channels
Mispredicted//not taken/fallthrough code
➤ Restricting Speculation Using Fences and Barriers:
CURRENT SPECTRE V1 MITIGATIONS
�5
➤ Restricting Speculation Using Fences and Barriers:
CURRENT SPECTRE V1 MITIGATIONS
�5
if ( x < array1_size) y = array2[array1[x] * 64];
➤ Restricting Speculation Using Fences and Barriers:
CURRENT SPECTRE V1 MITIGATIONS
�5
if ( x < array1_size) speculative_fence; y = array2[array1[x] * 64];
➤ Restricting Speculation Using Fences and Barriers:
CURRENT SPECTRE V1 MITIGATIONS
�5
if ( x < array1_size) speculative_fence; y = array2[array1[x] * 64];
Up to 10x Performance Overhead!
O. Oleksenko, B. Trach, T. Reiher, M. Silberstein, and C.Fetzer. 2018. You Shall Not Bypass
THIS WORK : CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�6
THIS WORK : CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�6
Only When Necessary
THIS WORK : CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�6
Right Type of FenceOnly When Necessary
THIS WORK : CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�6
Right Type of FenceOnly When Necessary No Recompilation
MICRO-OP STREAM CUSTOMIZATION BY CONTEXT-SENSITIVE DECODING
�7
Native Instructions (e.g., inc [0x803ac] )
Instruction Decoder
Fetch
Rename
Execute
WB
ld t0, [0x803ac]
add, t0, t0, 1
st [0x803ac], t0
“Context-Sensitive Decoding: On-Demand Microcode Customization for Security and Energy Management” ISCA 2018, IEEE Micro Top Picks 2019
MICRO-OP STREAM CUSTOMIZATION BY CONTEXT-SENSITIVE DECODING
�7
Native Instructions (e.g., inc [0x803ac] )
Instruction Decoder
Fetch
FENCE
FENCE
Rename
Execute
WB
ld t0, [0x803ac]
add, t0, t0, 1
st [0x803ac], t0
“Context-Sensitive Decoding: On-Demand Microcode Customization for Security and Energy Management” ISCA 2018, IEEE Micro Top Picks 2019
CONTEXT-SENSITIVE FENCING: AN EXAMPLE
�8
Micro-op Fusion
Regular Decoders
MSROM
Decoders
Context-Sensitive
Model Specific
RegistersOS
Macro-op Dispatcher
X86 Instructions
Fence Enforcement Configurations
Micro-op Queue
beqadd
CONTEXT-SENSITIVE FENCING: AN EXAMPLE
�8
Micro-op Fusion
Regular Decoders
MSROM
Decoders
Context-Sensitive
Model Specific
RegistersOS
Macro-op Dispatcher
X86 Instructions
inc [0x803ac]
Fence Enforcement Configurations
Micro-op Queue
beqadd
(*ptr)++
CONTEXT-SENSITIVE FENCING: AN EXAMPLE
�8
Micro-op Fusion
Regular Decoders
MSROM
Decoders
Context-Sensitive
Model Specific
RegistersOS
Macro-op Dispatcher
X86 Instructions
inc [0x803ac]
Fence Enforcement Configurations
Taint Tracking
Micro-op Queue
beqadd
(*ptr)++
CONTEXT-SENSITIVE FENCING: AN EXAMPLE
�8
Micro-op Fusion
Regular Decoders
MSROM
Decoders
Context-Sensitive
Model Specific
RegistersOS
Macro-op Dispatcher
X86 Instructionsinc [0x803ac]
Fence Enforcement Configurations
Taint Tracking
____FENCE___ ld t0,[0x803ac] add t0, t0, 1 st [0x803ac], t0
Micro-op Queue
beqadd
CONTEXT-SENSITIVE FENCING: AN EXAMPLE
�9
Micro-op Fusion
Regular Decoders
MSROM
Decoders
Context-Sensitive
Model Specific
RegistersOS
Macro-op Dispatcher
X86 Instructionsinc [0x803ac]
Fence Enforcement Configurations
Taint Tracking
____FENCE___ ld t0,[0x803ac] add t0, t0, 1 st [0x803ac], t0
Micro-op Queuest
addld
FENCEbeqadd
CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�10
Right Type of Fence Only When NecessaryNo Recompilation
BUT WHAT FENCE SHOULD WE USE?
�11
Existing Intel Fences
BUT WHAT FENCE SHOULD WE USE?
�11
Existing Intel Fences
Require Privileged Access
Clobber Architectural Registers
Enforced Early in the Pipeline
EXISTING FENCES: SERIALIZING INSTRUCTIONS (SI)
�12
Instruction Decoder
Fetch
Rename
FU1
Com
mit
Dispatch
FU1
FU1
…
➤ Enforced early in the pipeline
➤ Examples:
➤ All Serializing Instructions
➤ Intel’s MFENCE
➤ Intel’s SFENCE
EXISTING FENCES: INTEL LFENCE
�13
Instruction Decoder
Fetch
Rename
FU1
Com
mit
Dispatch
FU1
FU1
…
➤ Enforced early in the pipeline
➤ Example:
➤ Intel’s LFENCE
LATE ENFORCEMENT FENCES
�14
➤ Shifts fence enforcement towards the leaking structure
➤ Reduces the impact on other instructions µ-op
Que
ue
Macro-op Stream Macro-to-µop
Translation Unit
µ-op
Dis
patc
her
Load Store-Queue
FU1 FU2
RSRS
RSRSRS
Cache
Cac
he
Con
trolle
r
Mem
Mem
C
ontro
ller
RS
LATE ENFORCEMENT FENCES
�14
➤ Shifts fence enforcement towards the leaking structure
➤ Reduces the impact on other instructions µ-op
Que
ue
Macro-op Stream Macro-to-µop
Translation Unit
µ-op
Dis
patc
her
Load Store-Queue
FU1 FU2
RSRS
RSRSRS
Cache
Cac
he
Con
trolle
r
Mem
Mem
C
ontro
ller
RS
LATE ENFORCEMENT FENCES
�14
➤ Shifts fence enforcement towards the leaking structure
➤ Reduces the impact on other instructions µ-op
Que
ue
Macro-op Stream Macro-to-µop
Translation Unit
µ-op
Dis
patc
her
Load Store-Queue
FU1 FU2
RSRS
RSRSRS
Cache
Cac
he
Con
trolle
r
Mem
Mem
C
ontro
ller
RS
NEWLY PROPOSED FENCES
➤ Load-Store Queue LFENCE (LSQ-LFENCE)
➤ Load-Store Queue MFENCE (LSQ-MFENCE)
➤ Reservation Station Fence (RSFENCE)
➤ Cache Fence (CFENCE)
µ-op
Que
ue
Macro-op Stream Macro-to-µop
Translation Unit
µ-op
Dis
patc
her
Load Store-Queue
FU1 FU2
RSRS
RSRSRS
Cache
Cac
he
Con
trolle
r
Mem
Mem
C
ontro
ller
RS
LSQ-Fences
RS-Fence
Cfence
�15
ld
CACHE FENCE (CFENCE)
�16
CFENCE
ld
ld
Non-Modifying
Non-Modifying
➤ Allows all the load and stores to pass
➤ CFENCE labels any subsequent load as a non-modifying load
➤ allows non-modifying loads to pass through the CFENCE
➤ Non-modifying loads are restricted from modifying the cache state.
Non-Modifying
CACHE FENCE (CFENCE)
�17
➤ if a cache hit -> read the contents of the cache,
➤ doesn’t change meta-data (e.g., LRU age bits)
➤ if a cache miss -> mark as uncacheable
➤ allow mem read without altering the cache state
Normal Load
Hit?
Update Metadata (LRU bits)
Serve the request
Fetch Cache Block from memory
Update Cache
Cache Controller
MissHit
CACHE FENCE (CFENCE)
�18
➤ if a cache hit -> read the contents of the cache,
➤ doesn’t change meta-data (e.g., LRU age bits)
➤ if a cache miss -> mark as uncacheable
➤ allow mem read without altering the cache state
Non-Modifying Load
Hit?
Serve the request
Fetch Cache Block from memory
Cache Controller
MissHit
RESULTS — FENCE ENFORCEMENT POLICIES
➤ Our CFENCE reduces the incurred performance overhead by 2.3X, bringing down the execution time overhead from 48% to 21%.
�19
ps sjeng bzip2 ping llu gcc ls nginx omnet gmeanInjected for All Kernel Loads
1.00
1.05
1.10
1.15
1.20
1.25
1.30
1.35
1.40
1.45
1.50
1.55
1.60
1.65
1.70
1.75
1.80
1.85
1.90
1.95
2.00
Nor
mal
ized
Exe
cution
Tim
e
LFENCE-LC
LSQ-MFENCE-LC
CFENCE-LC
48%
21%
CONTEXT SENSITIVE FENCING
➤ Surgically injects fence micro-ops
�20
Right Type of Fence Only When NecessaryNo Recompilation
FENCE FREQUENCY OPTIMIZATIONS
➤ Liberal Injection
➤ Injects fences before all the loads of a program
➤ completely stops speculation
�21
jeq
ld
add
ld
ld
FENCE FREQUENCY OPTIMIZATIONS
➤ Liberal Injection
➤ Injects fences before all the loads of a program
➤ completely stops speculation
�22
jeqFenceldadd
ld
ld
Fence
Fence
FENCE FREQUENCY OPTIMIZATIONS
➤ Basic Block-Level Fence Insertion*
➤ Speculation begins with a branch prediction
➤ A fence between branch and subsequent loads
�23
jeqFenceldadd
ld
ld
Fence
Fence
* Targeted Optimization — Only protects against variant 1
FENCE FREQUENCY OPTIMIZATIONS
➤ Basic Block-Level Fence Insertion
➤ Speculation begins with a branch prediction
➤ We want a fence between each branch and subsequent loads
�24
jeqFenceldaddldld
* Targeted Optimization — Only protects against variant 1
FENCE FREQUENCY OPTIMIZATIONS
➤ Taint-Based Fence Insertion
➤ Even one fence per basic block is too conservative
➤ Attacker performs operations based on untrusted data (e.g., attacker controlled out of bound index)
➤ Insert fences for only vulnerable loads that operate on untrusted data
➤ Dynamic Information Flow Tracker (DIFT)
�25
DLIFT- AN INFORMATION FLOW TRACKER FOR SPECTRE ERA
�26
➤ Classic Information Flow Trackers
➤ Maintain and Evaluate Taints at Late Stages of the Pipeline
➤ Not so useful for Spectre!
DLIFT- AN INFORMATION FLOW TRACKER FOR SPECTRE ERA
�26
➤ Classic Information Flow Trackers
➤ Maintain and Evaluate Taints at Late Stages of the Pipeline
➤ Not so useful for Spectre!
DLIFT- AN INFORMATION FLOW TRACKER FOR SPECTRE ERA
�26
➤ Classic Information Flow Trackers
➤ Maintain and Evaluate Taints at Late Stages of the Pipeline
➤ Not so useful for Spectre!Detect The Threat before it’s too late.
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
add (%rbx), %rax
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
add (%rbx), %rax
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
ld t1, (%rbx) add t1, %rax, %rax
Fence-Free Translation
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
ld t1, (%rbx) add t1, %rax, %raxFence-Free
Translation
Register Taintedrax Norbx No
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %raxFence-Free
Translation
Register Taintedrax Norbx No Flush
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %rax
Fence-Free Translation
Register Taintedrax Norbx No Flush
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
Taint Recovery Copy
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %rax
Fence-Free Translation
Register Taintedrax Norbx Yes
Register Taintedrax Norbx No Flush
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Not Tainted
Taint Recovery Copy
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %rax
Fence-Free Translation
Register Taintedrax Norbx Yes
add (%rbx), %rax
Register Taintedrax Norbx No Flush
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Taint Recovery Copy
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %rax
Fence-Free Translation
Register Taintedrax Norbx Yes
add (%rbx), %rax
Tainted
Register Taintedrax Norbx No Flush
Register Taintedrax Norbx Yes
DECODER-LEVEL (SPECULATIVE) INFORMATION FLOW TRACKER
�27
Fetch M
acro
-to-M
icro
Speculative Taint Map
Taint Evaluator
Reg. File TLB
Arch. Taint Map/RF
Fetch/Decode Execute Commit
Tain
t
Tain
t
Commit Logic
Taint Recovery Copy
Under-Tainted?
ld t1, (%rbx) add t1, %rax, %rax
Register Taintedrax Norbx Yes
add (%rbx), %rax
Fence Injection
FENCE ld t1, (%rbx)
add t1, %rax, %rax
Tainted
RESULTS — FENCE FREQUENCY OPTIMIZATION
➤ Taint-Based CFENCE injection reduces the performance overhead to just 7.7%
�28
ps sjeng bzip2 ping llu gcc ls nginx omnet gmeanCFENCE
1.00
1.05
1.10
1.15
1.20
1.25
1.30
1.35
1.40
1.45
1.50
1.55
1.60
Nor
mal
ized
Exe
cution
Tim
e
Always Fencing
DLIFT-Based Fencing
DLIFT + Once per BB
CONTEXT-SENSITIVE FENCING
�29
Low Performance Overhead
CONTEXT-SENSITIVE FENCING
�29
No Recompilation
Low Performance Overhead
CONTEXT-SENSITIVE FENCING
�29
No Recompilation
Low Performance Overhead
Minimal Changes to Processor
THANKS! QUESTIONS?