FreshCache: Statically and Dynamically Exploiting Dataless Ways
Arkaprava Basu, Derek R. Hower, Mark D. Hill, Mike M. Swift
Last Level Caches: Area and Energy Hungry
Intel Ivy Bridge die picture
LLC contributes up to 37% of on-chip power [Sen et al., 2013, UW-TR 1791]
Inefficiencies in LLC
• Inclusive LLC wastes energy and area
  – Transistors devoted to holding stale data
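[Figure: LLC + directory (TAG and DATA arrays) above the private caches (L1/L2) of cores C1 and C2. Block A is cached with exclusive permission in C1's private cache. The labels show A:x in the LLC and in C1, then A:y, so the LLC's copy of A can go stale.]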
• Amount of stale data varies across workloads
[Figure: Fraction of stale data in LLC blocks, per workload: blackscholes, canneal, facesim, fluidanimate, freqmine, streamcluster, swaptions, x264, graph500, memcached, SpecJBB, Mean. Y-axis ticks run from 0.1 to 0.4 in steps of 0.05; an additional label reads 0.7.]
Private Cache: LLC ratio ~ 1:4
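The block A example can be made concrete with a toy model. This is only a sketch under stated assumptions: the class `LLCBlock`, the function `stale_fraction`, and the rule "an LLC block counts as stale while some private cache holds it with exclusive permission" are illustrative choices, not definitions taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLCBlock:
    data: str                               # contents of the LLC data array entry
    exclusive_owner: Optional[str] = None   # directory info: private cache holding the block exclusively


def stale_fraction(llc: dict[str, LLCBlock]) -> float:
    """Fraction of LLC blocks whose data entry may be out of date (assumed definition)."""
    if not llc:
        return 0.0
    stale = sum(1 for block in llc.values() if block.exclusive_owner is not None)
    return stale / len(llc)


# Hypothetical inclusive LLC holding four blocks, all with value "x".
llc = {name: LLCBlock(data="x") for name in ("A", "B", "C", "D")}
private_c1: dict[str, str] = {}

# C1 obtains block A with exclusive permission, then writes it locally.
llc["A"].exclusive_owner = "C1"
private_c1["A"] = llc["A"].data   # A:x is copied into C1's private cache
private_c1["A"] = "y"             # C1 updates A; the LLC is not told the new value

print(llc["A"].data, private_c1["A"])   # x y
print(stale_fraction(llc))              # 0.25
```

In this model the LLC still spends a data-array entry on A:x even though the only useful copy is A:y in C1.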
Idea: FreshCache
• Static:
  – Omit data portion of a fixed number of ways
  – Reduce area and energy overhead
• Dynamic:
  – Disable data ways at runtime
  – Reduce more energy when possible
Roadmap
• Motivation and key idea
• FreshCache: Static + Dynamic Dataless Ways
• Design and Mechanisms
• Evaluation
• Summary
Static Dataless Ways (SDWs)
[Figure: Set-associative LLC drawn as sets by ways; each way has a TAG + Metadata portion and a Data portion.]
Number of dataless ways fixed at design time
Static Dataless Way
✔ Saves both area and static power*
✗ Cannot adapt to workloads
* If blocks with stale data kept in SDWs
Dynamic Dataless Ways (DDWs)
Set-associative LLC
Number of dataless ways adjusted at runtime
Data ways Turned off
Workload A
Dynamic Dataless Ways
Workload B
Cache utilization is lower for workload B
✔ Opportunistically save more energy
✗ No area savings
FreshCache Goals: Best of Both Worlds
• Static: save area and energy
  – Omitting transistors at design time
• Dynamic: save more energy
  – Turning off transistors when possible
• How to trade off performance?
  – Bounded by Maximum Performance Degradation (MPD)
    • e.g., MPD = 1% or 3%
  – Minimize energy subject to MPD
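The last goal can be read as a constrained optimization. The formulation below is one possible way to write it down, not notation from the slides: $d$ (number of dynamic dataless ways), $D_{\max}$ (largest allowed $d$), $E(d)$ (LLC + DRAM access energy), and $T(d)$ (execution time) are all assumed symbols.

```latex
\[
\begin{aligned}
\min_{d \in \{0, \dots, D_{\max}\}} \quad & E(d) \\
\text{subject to} \quad & \frac{T(d) - T(0)}{T(0)} \le \mathrm{MPD}
\end{aligned}
\]
```

With MPD = 1%, for example, any setting of $d$ that slows execution by more than 1% relative to $d = 0$ is excluded, and the lowest-energy setting among the rest is chosen.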
FreshCache: Static + Dynamic Dataless Ways
Workload A/B
Static Dataless Ways
Dynamic Dataless Ways
FreshCache: Challenges
• (1) Put blocks with stale data in dataless ways
• (2) Determine number of DDWs at runtime
Roadmap
• Motivation
• FreshCache: Static + Dynamic Dataless Ways
• Mechanisms
  – (1) LLC Controller: manage dataless ways
  – (2) DDW Controller: determine number of DDWs
• Evaluation
• Summary
Dataless-Way-Aware LLC Controller
• (1) Keep blocks with stale data in dataless ways
Coherence state decides whether a cache block is put in a dataless way
[Figure: a block arriving from memory or another socket in Exclusive state; dataless ways labeled "SDW or DDW"]
[Figure, next step: the arriving block is in Shared state; dataless ways labeled "SDW or DDW"]
Writeback to dataless way may move block to conventional way
Intra-set block movement
Writeback from Private $
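The controller behavior described on these slides might be sketched as follows. Everything here is a simplified illustration: the names `State`, `Way`, `CacheSet`, `fill`, and `writeback` are hypothetical, eviction is omitted, and the rule that Exclusive blocks prefer dataless ways while Shared blocks need a conventional way is an assumption inferred from the slide text rather than a confirmed detail of the design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    EXCLUSIVE = "E"   # a private cache may modify the block, so LLC data can go stale
    SHARED = "S"      # assumed: the LLC copy must stay readable


@dataclass
class Way:
    has_data: bool                # False for a dataless way (SDW or DDW)
    tag: Optional[str] = None
    data: Optional[str] = None


@dataclass
class CacheSet:
    ways: list[Way]

    def _free(self, has_data: bool) -> Optional[Way]:
        """First unused way of the requested kind, or None."""
        return next((w for w in self.ways if w.tag is None and w.has_data == has_data), None)

    def fill(self, tag: str, state: State, data: str) -> Way:
        """Install a block arriving from memory or another socket."""
        prefer_dataless = state is State.EXCLUSIVE
        way = self._free(has_data=not prefer_dataless)
        if way is None and prefer_dataless:
            way = self._free(has_data=True)   # fall back to a conventional way
        if way is None:
            raise RuntimeError("sketch omits eviction: no suitable free way")
        way.tag = tag
        way.data = data if way.has_data else None   # dataless ways keep only the tag
        return way

    def writeback(self, tag: str, data: str) -> Way:
        """Writeback from a private cache; may move the block within the set."""
        way = next(w for w in self.ways if w.tag == tag)
        if way.has_data:
            way.data = data
            return way
        target = self._free(has_data=True)
        if target is None:
            raise RuntimeError("sketch omits eviction: no free conventional way")
        target.tag, target.data = tag, data   # intra-set block movement
        way.tag = None                        # the dataless way is released
        return target


cache_set = CacheSet([Way(True), Way(True), Way(False), Way(False)])
placed = cache_set.fill("A", State.EXCLUSIVE, "x")
print(placed.has_data)                  # False
moved = cache_set.writeback("A", "y")
print(moved.has_data, moved.data)       # True y
```

The example installs block A in a dataless way while it is held exclusively, then moves it to a conventional way when the private cache writes it back.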
DDW Controller
• (2) Determines number of DDWs at runtime
[Figure: DDW controller block diagram. Labeled components: DDW controller, LLC miss estimator, auxiliary tag array, hit counters, estimated LLC misses, aggregator, average memory latency. Input: Maximum Performance Degradation (MPD). Goal: energy savings. Annotations: Qureshi'06; 0.3% overhead.]
• Software specifies the performance vs. energy-savings tradeoff
  – MPD value specified in a register
  – Energy savings subject to MPD
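One way such a controller could turn hit counters and average memory latency into a DDW count is sketched below. This is not the authors' algorithm: the functions `extra_misses` and `choose_ddws`, the per-LRU-position counter layout, and all numbers are hypothetical. The sketch also makes two simplifying assumptions: a hit at one of the `ddws` least-recently-used positions becomes a miss once those ways lose their data, and every extra miss costs the full average memory latency.

```python
def extra_misses(hit_counters: list[int], ddws: int) -> int:
    """Estimated extra LLC misses if the `ddws` LRU-most positions hold no data.

    hit_counters[i] is the number of hits observed at recency position i
    (0 = most recently used), as an auxiliary tag array might record.
    """
    return sum(hit_counters[len(hit_counters) - ddws:])


def choose_ddws(hit_counters: list[int], avg_mem_latency: float,
                baseline_cycles: float, mpd: float, max_ddws: int) -> int:
    """Largest number of DDWs whose estimated slowdown stays within MPD."""
    best = 0
    for ddws in range(max_ddws + 1):
        added_cycles = extra_misses(hit_counters, ddws) * avg_mem_latency
        if added_cycles / baseline_cycles <= mpd:
            best = ddws
        else:
            break   # counters are non-negative, so larger values only get worse
    return best


# Hypothetical interval statistics for an 8-way set-associative LLC.
counters = [5000, 3000, 1500, 800, 300, 100, 40, 10]
print(choose_ddws(counters, avg_mem_latency=200, baseline_cycles=10_000_000,
                  mpd=0.01, max_ddws=6))   # 4
print(choose_ddws(counters, avg_mem_latency=200, baseline_cycles=10_000_000,
                  mpd=0.03, max_ddws=6))   # 5
```

Raising MPD from 1% to 3% lets this sketch turn off one more data way, which mirrors the tradeoff software controls through the MPD register.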
Qureshi’07
Roadmap
• Motivation
• FreshCache: Static + Dynamic Dataless Ways
• Mechanisms
• Evaluation
• Summary
Methodology
• gem5 full-system simulation
• 8 in-order cores, 3-level cache hierarchy
• PARSEC and commercial workloads
• CACTI 6.5 to evaluate area and energy savings
• Evaluation:
  – Efficacy of FreshCache in saving energy
  – Area savings due to FreshCache
Energy Savings: MPD=1%
[Figure: Relative energy (LLC + DRAM access) savings, in percent, per workload. Average: 28%.]
Configuration: 2 SDWs (out of 16 ways) + variable number of DDWs
Avg. 28% energy savings with worst-case performance degradation < 1%
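Energy Savings: MPD = 3%
[Figure: Relative energy (LLC + DRAM access) savings, in percent, per workload, comparing MPD = 1% and MPD = 3%. Averages: 28% (MPD = 1%) and 41% (MPD = 3%).]
Configuration: 2 SDWs (out of 16 ways) + variable number of DDWs
Avg. 41% energy savings with worst-case performance degradation < 3%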
Area Savings
8.23% of LLC area saved
Summary
• LLC can be energy and area hungry
• Inclusive LLCs hold substantial stale data
• FreshCache:
  – Static Dataless Ways to save area and power
  – Dynamic Dataless Ways to save further power
• 28% energy and 8.23% LLC area savings
  – Worst-case performance degradation < 1%