University of Toronto
Department of Electrical and Computer Engineering
Jason Zebchuk and Andreas Moshovos
{zebchuk,moshovos}@eecg.toronto.edu
Workshop on Complexity-Effective Design - June 2006
RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
June 18, 2006, Zebchuk ©, RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup

Need for Energy Efficient L2 Lookups
[Figure: CPU with I$ and D$, backed by an L2 cache with separate L2 TAGS and L2 DATA arrays]
Locate blocks in high level caches more efficiently
Conventional tags are getting larger
  Technology, microarchitectural and application trends
Larger caches use more energy
Demonstrate lookup energy reductions up to 82%
  Up to 38% average across SPEC

Dual-Grain Tracking
Memory as a collection of REGIONS
Memory as a collection of blocks
Region: 2^n-sized, aligned memory area
Similar concept already used by various structures: TLB, Page Table
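A minimal sketch of the two granularities, assuming 8KB regions and 128-byte blocks (64 blocks per region, in line with the region size and block numbering quoted elsewhere in the talk). The function name split_address and the constants are hypothetical, chosen only for illustration:

```python
REGION_BITS = 13   # assumed: 8KB regions (2**13 bytes)
BLOCK_BITS = 7     # assumed: 128-byte blocks, so 64 blocks per region


def split_address(addr):
    """Return (region number, block index within the region, byte offset within the block)."""
    region = addr >> REGION_BITS
    block = (addr >> BLOCK_BITS) & ((1 << (REGION_BITS - BLOCK_BITS)) - 1)
    offset = addr & ((1 << BLOCK_BITS) - 1)
    return region, block, offset


# The last byte of region 1 falls in block 63 at offset 127.
print(split_address(0x3FFF))
```

Because regions are power-of-two sized and aligned, the region number is just the high-order address bits, so no division or table walk is needed to find it.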

Program Behavior / Motivation
Few active Regions
"Bursty" access
Mostly gone before accessed again
RegionTracker:
  Identify First Misses
  Track block location for Few Regions
[Figure: "In principle" vs. "In practice"; annotation: "And before ... is touched again"]
How can this reduce energy?

RegionTracker: Low Power Lookups
Frequent case: Few Active Regions
  Macroscopically Transient
RegionTracker:
  Dynamically Identify Newly Touched Regions
  Track block location using a compact structure
RegionTracker Organization
CRH for First Miss Detection: 5% of tags
CBV for Tracking blocks within 128 regions: 17.5%
128 x 8kB regions = 1MB tracked (at most 25% of a 4MB L2)
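The coverage figure on this slide can be checked with a few lines of arithmetic. The variable names are illustrative; the sizes are the ones quoted on the slide:

```python
KB = 1024
MB = 1024 * KB

cbv_entries = 128        # regions tracked by the CBV (from the slide)
region_size = 8 * KB     # 8kB regions (from the slide)
l2_size = 4 * MB         # 4MB L2 (from the slide)

tracked_bytes = cbv_entries * region_size
coverage = tracked_bytes / l2_size

# prints "1 MB tracked, 25% of the L2"
print(f"{tracked_bytes // MB} MB tracked, {coverage:.0%} of the L2")
```

The 25% is an upper bound because it is reached only when every block of every tracked region is actually resident in the L2.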

Which Regions are Cached?
If we had as many counters as regions:
  Block Allocation: counter[region]++
  Block Eviction: counter[region]--
  Region cached only if counter[region] non-zero
Not Practical: e.g., 8KB Regions and 4GB Memory require 512K counters
[Figure: address fields Region Tag and offset; the Region Tag selects a counter]
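The idealized scheme above can be sketched as follows, assuming 8KB regions. The class PreciseRegionCounters is a hypothetical illustration rather than the authors' implementation, and it uses a dictionary where hardware would need one physical counter per region:

```python
REGION_BITS = 13  # assumed: 8KB regions


class PreciseRegionCounters:
    """One counter per region: exact, but far too large to build in hardware."""

    def __init__(self):
        self.counter = {}  # region number -> number of cached blocks

    def allocate(self, addr):
        region = addr >> REGION_BITS
        self.counter[region] = self.counter.get(region, 0) + 1

    def evict(self, addr):
        region = addr >> REGION_BITS
        self.counter[region] = self.counter.get(region, 0) - 1

    def region_cached(self, addr):
        return self.counter.get(addr >> REGION_BITS, 0) != 0


# Cost of the exact scheme for the slide's example: 4GB memory, 8KB regions.
NUM_COUNTERS = (4 * 2**30) // (8 * 2**10)
print(NUM_COUNTERS)  # 524288, i.e. 512K counters
```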

Which Regions are Cached?
Cached Region Hash (CRH)
[Figure: address fields Region Tag and offset; the Region Tag passes through hash() to select a counter]
Imprecise: Records a superset of currently cached Regions
False positives: lost opportunity, correctness preserved
Small: e.g., 512-4k entries for 2MB or 4MB cache
First Miss: Full location information for ALL BLOCKS
  No need for temporal locality
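A minimal sketch of a hashed counter array in the spirit of the CRH, assuming 8KB regions and a simple modulo hash. The slides do not specify the hash function, so `_index` and the class itself are hypothetical illustrations:

```python
class CachedRegionHash:
    """Small array of counters indexed by a hash of the region number."""

    def __init__(self, entries=1024, region_bits=13):
        self.entries = entries
        self.region_bits = region_bits      # assumed: 8KB regions
        self.counters = [0] * entries

    def _index(self, addr):
        # Assumed hash: low-order bits of the region number.
        return (addr >> self.region_bits) % self.entries

    def allocate(self, addr):
        self.counters[self._index(addr)] += 1

    def evict(self, addr):
        self.counters[self._index(addr)] -= 1

    def may_be_cached(self, addr):
        # Zero means no block of any region hashing here is cached, so an
        # access to this region is a first miss. Non-zero only means "maybe".
        return self.counters[self._index(addr)] != 0


crh = CachedRegionHash(entries=4)
crh.allocate(0x0000)                 # a block of region 0 enters the cache
print(crh.may_be_cached(0x2000))     # region 1: definitely not cached
print(crh.may_be_cached(0x8000))     # region 4 aliases with region 0: false positive
```

Aliasing can only turn a "not cached" answer into "maybe cached", never the reverse, which is why a false positive costs a missed energy saving but not correctness.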

CBV: Tracking Blocks within Regions
[Figure: address fields Region Tag, block, and offset; each CBV entry holds a Region Tag plus Block info for Block #0 through Block #63; widths labeled 4 and 256]
Which data way is the block cached at?
Parallel lookup of Region Tag and Block Info
Experiments with 64- and 128-entry, 8-way set-associative CBV
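A rough sketch of a CBV-style structure, under several assumptions that are not stated on the slide: 8KB regions with 128-byte blocks, a 4-bit block-info field encoded as one valid bit plus a 3-bit way number, and a dictionary standing in for the 8-way set-associative array. All names are hypothetical:

```python
REGION_BITS = 13                                       # assumed: 8KB regions
BLOCK_BITS = 7                                         # assumed: 128-byte blocks
BLOCKS_PER_REGION = 1 << (REGION_BITS - BLOCK_BITS)    # 64
VALID = 0b1000                                         # assumed: valid bit + 3-bit way


class CachedBlockVector:
    def __init__(self):
        # region tag -> 64 four-bit block-info fields (256 bits per entry)
        self.entries = {}

    def _split(self, addr):
        return addr >> REGION_BITS, (addr >> BLOCK_BITS) & (BLOCKS_PER_REGION - 1)

    def record(self, addr, way):
        """Note that the block holding addr now lives in the given L2 data way."""
        region, block = self._split(addr)
        info = self.entries.setdefault(region, [0] * BLOCKS_PER_REGION)
        info[block] = VALID | (way & 0b111)

    def invalidate(self, addr):
        region, block = self._split(addr)
        if region in self.entries:
            self.entries[region][block] = 0

    def lookup(self, addr):
        """Return ("hit", way), ("miss", None), or ("unknown", None) for an untracked region."""
        region, block = self._split(addr)
        info = self.entries.get(region)
        if info is None:
            return ("unknown", None)    # fall back to the conventional tag lookup
        field = info[block]
        if field & VALID:
            return ("hit", field & 0b111)
        return ("miss", None)


cbv = CachedBlockVector()
cbv.record(0x2080, way=5)
print(cbv.lookup(0x2080), cbv.lookup(0x2100), cbv.lookup(0x4000))
```

For a tracked region, only the 4-bit field of the requested block has to be read to learn the data way, which is where the lookup energy saving would come from in this sketch.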

Conventional Solution
Tag Hierarchy
Requires Locality
  Temporal
  Spatial, as long as L2 block size > L1 block size
Latency limited
Not very energy efficient
RegionTracker is Better

Tag Hierarchy
[Figure: address fields Block Tag, Set, and Offset; each entry holds a Set Tag plus Tag #0 through Tag #7, feeding eight comparators; widths labeled 23 and 184]
Each access reads/writes 23 bytes
Sequential Comparison of Set Tag AND Block Tag

Complexity Tradeoffs
Tag Hierarchy
  Read/Write 184 bits
  Complex Wiring to transfer 184 bits
  Updated on every Tag Hierarchy miss
RegionTracker
  Read/Write 4 bits
  Only 4 bits transferred from tag array
  Updated on L2 misses only
  Flexible implementation (vertical/horizontal partitioning)
  No modification to conventional cache policies/structures
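The two per-access widths can be related with simple arithmetic. The split of 184 bits into 8 tags of 23 bits each is an inference from the Tag Hierarchy figure labels (Tag #0 through Tag #7, 23, 184), not something the slides state outright:

```python
tags_per_set = 8       # Tag #0 .. Tag #7 in the Tag Hierarchy figure
bits_per_tag = 23      # assumed reading of the "23" width label

tag_hierarchy_bits = tags_per_set * bits_per_tag   # 184 bits
tag_hierarchy_bytes = tag_hierarchy_bits // 8      # 23 bytes per access
regiontracker_bits = 4                             # block info read per access

ratio = tag_hierarchy_bits / regiontracker_bits
print(tag_hierarchy_bits, tag_hierarchy_bytes, ratio)  # 184 23 46.0
```

Under this reading, each tag hierarchy access moves 46 times as many bits as a RegionTracker block-info read.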

Methodology
Processor
  Deeply-Pipelined
  128-entry window
  8-way superscalar
  32kB L1 instruction and data caches
SPEC CPU 2000 / Reference Inputs
  10 Billion Committed Instr. Samples after 100B
Used CACTI to estimate energy requirements

Energy Savings w/ 4MB L2
Average reduction of 38%
Up to 82% reduction (gzip)
Robust performance, significant power savings for most programs
[Figure: % L2 Lookup Energy Saved (y-axis, -20% to 100%, higher is better) for applu, lucas, art, twolf, mgrid, fma3d, gzip, vortex, vpr, and Average; one series per CRH/CBV size: 1k/64, 4k/64, 1k/128, 4k/128]

Tag Hierarchy Savings
[Figure: % L2 Lookup Energy Reduction (y-axis, -40% to 10%) vs. L2 Cache Size (2MB, 4MB, 8MB, 16MB); one series per number of Sets: 16, 32, 64, 128]
Only 2 configurations actually save power!
Similar fraction of requests served by RegionTracker
RegionTracker much better!

RegionTracker Summary
Coarse-Grain tracking to capture first misses
Dual-Grain tracking to track blocks
  Service many L2 Requests
  Reduce L2 Lookup Energy
Does not require temporal locality
Can exploit spatial locality much better than a tag hierarchy
Significantly reduces L2 Lookup Power with minimal additional complexity

RegionTracker: Using Dual-Grain Tracking for Energy Efficient Cache Lookup
Jason Zebchuk and Andreas Moshovos
{zebchuk, moshovos}@eecg.toronto.edu
University of Toronto
Department of Electrical and Computer Engineering