Skewed Compressed Cache MICRO 2014. Somayeh Sardashti, David A. Wood Computer Sciences Department...
Skewed Compressed Cache, MICRO 2014
Somayeh Sardashti, David A. Wood
Computer Sciences Department University of Wisconsin-Madison
SCC
• Off-chip accesses are costly in latency, bandwidth, and power.
• The LLC is already large => raise its effective capacity instead.
Cache Compression
• Observation: many cache lines hold data with a low dynamic range.
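The observation can be made concrete with a small sketch. It measures how many bytes each word of a line needs once the smallest word is subtracted, and sizes a hypothetical base-plus-delta encoding. The encoding, the function names, and the 4-byte word size are assumptions for illustration; the slides do not tie SCC to any particular compression algorithm.

```python
def dynamic_range_bytes(line: bytes, word_size: int = 4) -> int:
    """Bytes needed to store each word as an offset from the smallest word."""
    assert len(line) % word_size == 0
    words = [int.from_bytes(line[i:i + word_size], "little")
             for i in range(0, len(line), word_size)]
    span = max(words) - min(words)
    return (span.bit_length() + 7) // 8


def base_delta_size(line: bytes, word_size: int = 4) -> int:
    """Size of a hypothetical encoding: one full base word plus one delta per word."""
    n_words = len(line) // word_size
    return word_size + n_words * dynamic_range_bytes(line, word_size)


# A 64-byte line of sixteen 32-bit values that differ only in their low byte.
line = b"".join((0x1000_0000 + 8 * i).to_bytes(4, "little") for i in range(16))
print(len(line), base_delta_size(line))  # prints: 64 20
```

Here the sixteen values span only 120, so each delta fits in one byte and the line shrinks from 64 to 20 bytes under this assumed encoding.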
SCC: Designing a Compressed Cache
• (1) a compression algorithm to compress blocks
• (2) a compaction mechanism to fit compressed blocks in the cache.
*In general, SCC is independent of the compression algorithm in use.
Motivation
• How can we design a compressed cache?
(Design goals)
• 1. Tightly compact variable-size compressed blocks.
• 2. Keep tag and other metadata overhead low.
• 3. Allow fast lookups.
=> Previous compressed cache designs failed to achieve all of these goals.
Compressed Cache Taxonomy
1) How to provide additional tags
2) How to find the corresponding block given a matching tag
SCC
• Key observations
• 1) Spatial locality (neighboring blocks tend to reside in the cache at the same time)
• 2) Compression locality (neighboring blocks tend to compress similarly)
SCC
[Figure labels: 48-bit PA; tag, data; 64 bytes = 16 words; 32B; 16B; 8B; CF = 2'b00, 2'b11, 2'b10, 2'b01; superblock tag]
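The size labels (64B, 32B, 16B, 8B) and the 2-bit CF can be related with a short sketch. It assumes CF counts how many times the 64-byte block has been halved (CF 0 = 64B uncompressed, up to CF 3 = 8B). The extracted slide text does not pin down which code belongs to which size, so the mapping and the function names here are assumptions.

```python
BLOCK_BYTES = 64  # uncompressed cache block size from the slide


def compression_factor(compressed_bytes: int) -> int:
    """Round a compressed size up to 64/32/16/8 bytes and return a 2-bit CF.

    Assumed encoding: CF = log2(64 / rounded size).
    """
    if not 0 < compressed_bytes <= BLOCK_BYTES:
        raise ValueError("compressed size must be in 1..64 bytes")
    cf = 0
    while cf < 3 and compressed_bytes <= BLOCK_BYTES >> (cf + 1):
        cf += 1
    return cf


def stored_bytes(cf: int) -> int:
    """Space a block with this CF occupies in the data array."""
    return BLOCK_BYTES >> cf


def blocks_per_entry(cf: int) -> int:
    """How many blocks of this CF fit in one 64-byte data entry."""
    return 1 << cf


for size in (64, 33, 20, 16, 8, 5):
    cf = compression_factor(size)
    print(size, cf, stored_bytes(cf), blocks_per_entry(cf))
```

Under this assumption a block that compresses to 20 bytes gets CF 1 and occupies 32 bytes, so two such blocks share one 64-byte entry.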
SCC
SuperBlock Cache
[Figure labels: 16-way set-associative cache; 4-way set-associative; cache block; subblock; 48-bit address (bits 47 to 0)]
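SuperBlock
1 superblock = 8 contiguous blocks = 64 bytes x 8 = 512B
[Figure: 48-bit address, bits 47 to 0, with bit boundaries marked at 5, 6, 8, 9, 10, 11. Byte select: 6 bits -> 64B. Block ID selects one of the 8 blocks.]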
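The address split can be sketched as follows. The field positions are read off the bit markers on the slide and are an assumption: byte select in bits 5-0 (6 bits for a 64-byte block), block ID in bits 8-6 (3 bits for 8 blocks per superblock), and the superblock number in the remaining upper bits. The function name is hypothetical.

```python
BYTE_SELECT_BITS = 6  # 2**6 = 64 bytes per block
BLOCK_ID_BITS = 3     # 2**3 = 8 blocks per superblock


def split_address(pa: int) -> tuple[int, int, int]:
    """Split a 48-bit physical address into (superblock, block_id, byte_select)."""
    assert 0 <= pa < 1 << 48
    byte_select = pa & ((1 << BYTE_SELECT_BITS) - 1)
    block_id = (pa >> BYTE_SELECT_BITS) & ((1 << BLOCK_ID_BITS) - 1)
    superblock = pa >> (BYTE_SELECT_BITS + BLOCK_ID_BITS)
    return superblock, block_id, byte_select


# The 8 blocks of one 512-byte superblock share a superblock number.
base = 0x4000  # 512-byte aligned
for i in range(8):
    print(split_address(base + 64 * i))
```

All eight addresses print the same superblock number (32) with block IDs 0 through 7, which is what lets one superblock tag cover up to eight neighboring blocks.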
[Figure: 48-bit address, bits 47 to 0, with bit boundaries marked at 5, 6, 8, 9, 10, 11. Labels: xor; way group selection; superblock tag]
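A minimal sketch of the way-group selection, assuming the "xor" label means that address bits 10-9 are XORed with the block's 2-bit compression factor (CF). The transcript preserves only the labels and bit markers, so both the choice of bits and the function name are assumptions.

```python
def way_group(pa: int, cf: int) -> int:
    """Assumed mapping: way group = address bits 10-9 XOR the 2-bit CF."""
    assert 0 <= cf <= 3
    return ((pa >> 9) & 0x3) ^ cf


pa = 0x600  # bits 10-9 are 0b11
groups = [way_group(pa, cf) for cf in range(4)]
print(groups)  # prints: [3, 2, 1, 0]
```

With this mapping the four possible CFs of a given address land in four different way groups, so a lookup that does not yet know the block's CF has one candidate way group to check per CF.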
Example: [two figures on the 48-bit address layout, bit boundaries at 5, 6, 8, 9, 10, 11]
2-way Skewed Cache.
SCC
• 16-way cache with 8 cache sets, divided into 4 way groups
• 64-byte cache blocks, 8-block superblocks (1, 2, 4, or 8 subblocks)
• Separate sparse super-block tag
SCC
• 97% of updated blocks still fit in their original place.
Area Overhead
• Baseline: conventional 16-way 8MB LLC
• FixedC: doubles the number of tags; compresses blocks only to half size
• VSC: 0-4 16B subblocks
• DCC4-16: 0-4 16B subblocks
• SCC8-8: 0-8 8B subblocks
Methodology
• GEMS simulator, CACTI 6.5 (area and power at 32nm)
• Runs mixes of multi-programmed workloads built from memory-bound and compute-bound SPEC CPU 2006 benchmarks.
• Baseline: conventional 16-way 8MB LLC
• 2X Baseline: conventional 32-way 16MB LLC
Evaluation: MPKI
• 2X Baseline: 15% average improvement
• SCC: 13% average improvement
Evaluation: Energy
• SCC improves system energy by up to 20%.
• Average: 6%
Conclusion
• SCC achieves performance comparable to that of a conventional cache with twice the capacity and associativity, with lower area overhead than DCC.
= Area overhead: SCC 1.5% vs. DCC 6.8%
• Lower design complexity.
= Replacement mechanism is simpler than DCC's.
FixedC
VSC
DCC
SCC
Sector Cache
2-way Skewed Cache.
Cache Compression
[Goals]
• Fast (low decompression latency)
• Simple (avoid complex hardware changes)
• Effective (good compression ratio)
Motivation
• Off-chip memory latency is high.
• -> A larger cache reduces misses at the cost of more area and power.
• Off-chip memory accesses require high energy.
• -> A larger cache reduces accesses to off-chip memory.
• Off-chip interconnect bandwidth is limited.
• -> larger cache