Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization...
![Page 1: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/1.jpg)
Ioana Burcea*
Stephen Somogyi§, Andreas Moshovos*, Babak Falsafi§#
Predictor Virtualization
*University of Toronto
Canada
§Carnegie Mellon University
#École Polytechnique Fédérale de Lausanne
ASPLOS 13
March 4, 2008
![Page 2: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/2.jpg)
2Ioana Burcea Predictor Virtualization University of Toronto
Why Predictors? History Repeats Itself
CPU
Branch Prediction
Prefetching
Value Prediction
Pointer Caching
Cache Replacement
Predictors
Application footprints grow
Predictors need to scale to remain effective
![Page 3: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/3.jpg)
Extra Resources: CMPs With Large On-Chip Caches
[Figure: four CPUs, each with its own I$ and D$, share an L2 cache of 10s to 100s of MB in front of main memory.]
![Page 4: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/4.jpg)
Predictor Virtualization
[Figure: four CPUs, each with its own I$ and D$, share an L2 cache in front of the physical memory address space.]
![Page 5: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/5.jpg)
Predictor Virtualization (PV)
Emulate large predictor tables
Reduce predictor table dedicated resources
![Page 6: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/6.jpg)
Research Contributions
PV: metadata stored in the conventional cache hierarchy
Benefits
Emulate larger tables → increased accuracy
Fewer dedicated resources
Why now?
Large caches / CMPs / need for larger predictors
Will this work?
Metadata locality → intrinsically exploited by caches
First Step: Virtualized Data Prefetcher
Performance: within 1% on average
Space: 60KB down to < 1KB
Advantages of virtualization
![Page 7: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/7.jpg)
PV architecture
PV in action Virtualized “Spatial Memory Streaming” [ISCA 06]*
Conclusions
*[ISCA 06] S. Somogyi, T. Wenisch, A. Ailamaki, B. Falsafi, and A. Moshovos. “Spatial Memory Streaming”
Talk Road Map
![Page 9: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/9.jpg)
PV Architecture
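[Figure: the optimization engine next to the CPU sends a request to a dedicated predictor table and receives a prediction. The predictor table is marked "Virtualize". The memory hierarchy shown is I$/D$, L2 cache, and main memory.]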
![Page 10: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/10.jpg)
PV Architecture
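[Figure: the optimization engine sends a request to the PVProxy and receives a prediction. The PVProxy holds a small PVCache, accessed by index, and the PVStart address. The full PVTable lives in the physical memory address space, behind the L2 cache.]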
![Page 11: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/11.jpg)
PV: Variable Prediction Latency
[Figure: the PV architecture diagram from the previous slide (optimization engine, PVProxy with PVCache and PVStart, L2 cache, PVTable in the physical memory address space), with the lookup paths annotated "Common Case", "Infrequent", and "Rare".]
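The variable latency on this slide can be illustrated with a minimal sketch. It assumes one reading of the three labels that the slide itself does not spell out: the common case is a PVCache hit, the infrequent case is served by the L2, and the rare case goes to main memory. The class name, the dictionary-based storage, the fully associative LRU PVCache, and the latency values are all hypothetical.

```python
from collections import OrderedDict

# Hypothetical latencies in cycles; the slides give no numbers.
PVCACHE_LATENCY, L2_LATENCY, MEMORY_LATENCY = 1, 20, 400


class PVProxySketch:
    """Toy model: a tiny PVCache in front of a PVTable that lives in memory."""

    def __init__(self, pvcache_sets=8):
        self.pvcache = OrderedDict()   # set index -> set contents (LRU order)
        self.pvcache_sets = pvcache_sets
        self.l2 = {}                   # table sets currently cached in the L2
        self.memory = {}               # backing PVTable in physical memory

    def lookup(self, set_index):
        """Return (set contents, latency) for one predictor table set."""
        if set_index in self.pvcache:              # assumed common case
            self.pvcache.move_to_end(set_index)
            return self.pvcache[set_index], PVCACHE_LATENCY
        if set_index in self.l2:                   # assumed infrequent case
            entry, latency = self.l2[set_index], L2_LATENCY
        else:                                      # assumed rare case
            entry = self.memory.setdefault(set_index, {})
            self.l2[set_index] = entry
            latency = MEMORY_LATENCY
        self._fill(set_index, entry)
        return entry, latency

    def _fill(self, set_index, entry):
        """Install a set in the PVCache, evicting the LRU set if full."""
        if len(self.pvcache) >= self.pvcache_sets:
            self.pvcache.popitem(last=False)
        self.pvcache[set_index] = entry


proxy = PVProxySketch()
latencies = [proxy.lookup(5)[1] for _ in range(3)]
```

Only the first lookup of a set pays the long latency here; repeated lookups of the same set hit in the PVCache, which is the behaviour the metadata locality argument relies on.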
![Page 12: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/12.jpg)
Metadata Locality
Entry reuse
Temporal: one entry used for multiple predictions
Spatial (can be engineered): one miss overcome by several subsequent hits
Metadata access pattern predictability
Predictor metadata prefetching
![Page 14: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/14.jpg)
Spatial Memory Streaming [ISCA 06]*
[Figure: regions of memory and their spatial patterns, shown as bit vectors such as 1100000001101… and 1100001010001…]
Spatial patterns stored in a pattern history table (PHT)
*[ISCA 06] S. Somogyi, T. Wenisch, A. Ailamaki, B. Falsafi, and A. Moshovos.
“Spatial Memory Streaming”
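As a rough illustration of what a spatial pattern encodes, the sketch below expands one bit vector into the block addresses a prefetcher could request. It assumes 64-byte blocks and 32 blocks per spatial group (both figures appear on the packing slide later in the deck), and it assumes that bit i of the pattern, counted from the least significant bit, stands for block i of the region. The function name and the bit ordering are hypothetical.

```python
BLOCK_BYTES = 64        # cache block size used elsewhere in the talk
BLOCKS_PER_REGION = 32  # blocks in a spatial group, per the packing slide


def prefetch_addresses(region_base, pattern):
    """Expand a spatial pattern into block addresses to prefetch.

    Assumption: bit i of `pattern` (least significant bit first) marks
    block i of the region as accessed.
    """
    return [
        region_base + i * BLOCK_BYTES
        for i in range(BLOCKS_PER_REGION)
        if (pattern >> i) & 1
    ]


# Blocks 0, 1 and 3 of the region starting at 0x10000.
addrs = prefetch_addresses(0x10000, 0b1011)
```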
![Page 15: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/15.jpg)
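Virtualizing “Spatial Memory Streaming” (SMS)
[Figure: the data access stream feeds a detector (~1KB), which passes patterns to the predictor (~60 KB). On a trigger access the predictor supplies patterns that drive prefetches. The diagram carries the label "Virtualize".]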
![Page 16: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/16.jpg)
Virtualizing SMS
Virtual table: 1K sets, 11 ways
PVCache: 8 sets, 11 ways
Entry: tag (11 bits) + pattern (32 bits)
Set entries → cache block – 64 bytes: 11 tag/pattern entries, 39 bits unused
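The numbers on this slide can be cross-checked with a few lines of arithmetic. This is a consistency check under the stated field widths, not a cost breakdown taken from the talk; the variable names are illustrative.

```python
TAG_BITS, PATTERN_BITS = 11, 32
WAYS = 11
BLOCK_BYTES = 64

entry_bits = TAG_BITS + PATTERN_BITS                # bits per tag/pattern entry
unused_bits = BLOCK_BYTES * 8 - WAYS * entry_bits   # padding left in a 64-byte block

# Dedicated storage of a non-virtualized 1K-set, 11-way table, in bytes.
dedicated_original = 1024 * WAYS * entry_bits // 8

# Storage of an 8-set PVCache when each set is held as one 64-byte block.
pvcache_storage = 8 * BLOCK_BYTES

# Memory footprint of the 1K-set virtual table, one block per set.
virtual_table_footprint = 1024 * BLOCK_BYTES
```

Under these assumptions the entry is 43 bits wide, 39 bits of each block stay unused, the dedicated table comes to 60,544 bytes (in line with the ~60KB quoted in the talk), and the PVCache data array takes 512 bytes (in line with the < 1KB figure).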
![Page 17: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/17.jpg)
Current Implementation
Non-intrusive
Virtual table stored in reserved physical address space
One table per core
Caches oblivious to metadata
Options
Predictor tables stored in virtual memory
Single, shared table per application
Caches aware of metadata
![Page 18: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/18.jpg)
Simulation Infrastructure
SimFlex
Full-system simulator based on Simics
Base processor configuration
4-core CMP
8-wide OoO
256-entry ROB
L1D/L1I 64KB 4-way set-associative
UL2 8MB 16-way set-associative
Commercial workloads
TPC-C: DB2 and Oracle
TPC-H: Query 1, Query 2, Query 16, Query 17
SpecWeb: Apache and Zeus
![Page 22: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/22.jpg)
Original Prefetcher – Accuracy vs. Predictor Size
Small Tables Diminish Prefetching Accuracy
[Figure: stacked bars of L1 read misses (axis 0% to 140%), split into Covered, Uncovered, and Overpredictions, for Apache, Oracle, and Qry 17. Each workload is shown for predictor configurations Infinite, 1K-16a, 1K-11a, 512-11a, 256-11a, 128-11a, 64-11a, 32-11a, 16-11a, and 8-11a.]
![Page 23: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/23.jpg)
Virtualized Prefetcher - Performance
Hardware cost: Original Prefetcher ~60KB, Virtualized Prefetcher < 1KB
[Figure: speedup (axis 0% to 70%) for Apache, Zeus, DB2, Oracle, Qry 1, Qry 2, Qry 16, and Qry 17. Series: Original - 1K sets, Original - 16 sets, Original - 8 sets, Virtualized - 8 sets.]
![Page 24: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/24.jpg)
Impact on L2 Memory Requests
Dark Side: Increased L2 Memory Requests
[Figure: increase in L2 memory requests (axis 0% to 40%) for Apache, Oracle, and Qry 17. Series: PV - 8 sets.]
![Page 25: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/25.jpg)
Impact of Virtualization on Off-Chip Bandwidth
Minimal Impact on Off-Chip Bandwidth
[Figure: off-chip bandwidth increase (axis 0% to 5%) for Apache, Qry 17, and Oracle, broken down into App L2 Misses, App L2 Write-backs, PV L2 Misses, and PV L2 Write-backs. Chart annotations: "Indirect impact on performance" and "Direct impact on performance".]
![Page 26: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/26.jpg)
Conclusions
Predictor Virtualization
Metadata stored in the conventional cache hierarchy
Benefits
Emulate larger tables → increased accuracy
Fewer dedicated resources
First Step: Virtualized Data Prefetcher
Performance: within 1% on average
Space: 60KB down to < 1KB
Opportunities
Metadata sharing and persistence
Application-directed prediction
Predictor adaptation
![Page 30: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/30.jpg)
PV – Motivating Trends
Dedicating resources to predictors hard to justify
Larger predictor tables
Increased performance
Chip multiprocessors
Space dedicated to predictors ↔ # processors
Memory hierarchies offer the opportunity
Increased capacity
Diminishing returns
Use conventional memory hierarchies to store predictor metadata
![Page 31: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/31.jpg)
Virtualizing the Predictor Table
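[Figure: the PC and address of a trigger access supply the tag and index used to look up the pattern history table, whose sets hold tag/pattern entries. The matching pattern, a bit vector such as 1 0 1 0 1 1 1 0, drives the prefetches. The pattern history table is marked "Virtualize".]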
PHT stored in physical address space
Multiple PHT entries packed in one memory block
One memory request brings in an entire table set
![Page 32: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/32.jpg)
Packing Entries in One Cache Block
Index: PC + offset within spatial group (21-bit index)
PC → 16 bits
32 blocks in a spatial group → 5-bit offset → 32-bit spatial pattern
Pattern table: 1K sets
10 bits to index the table → 11-bit tag
Cache block: 64 bytes
11 entries per cache block → pattern table is 1K sets, 11-way set associative
Block layout: tag, pattern, tag, pattern, …, unused; bit positions marked on the diagram: 0, 11, 43, 54, 85
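The packing described above can be sketched as follows. The field widths (11-bit tag, 32-bit pattern, 11 entries, 64-byte block) come from the slide; the exact placement of entry k at bits 43k to 43k + 42, the tag-first ordering inside an entry, the little-endian serialization, and the function names are assumptions made for illustration.

```python
TAG_BITS, PATTERN_BITS, WAYS, BLOCK_BYTES = 11, 32, 11, 64
ENTRY_BITS = TAG_BITS + PATTERN_BITS


def pack_set(entries):
    """Pack up to 11 (tag, pattern) pairs into one 64-byte block.

    Assumption: entry k occupies bits [43k, 43k + 43), tag first, and the
    block is serialized little-endian; the slide fixes neither detail.
    """
    if len(entries) > WAYS:
        raise ValueError("a set holds at most 11 entries")
    word = 0
    for k, (tag, pattern) in enumerate(entries):
        if tag >> TAG_BITS or pattern >> PATTERN_BITS:
            raise ValueError("tag or pattern does not fit its field")
        word |= (tag | (pattern << TAG_BITS)) << (k * ENTRY_BITS)
    return word.to_bytes(BLOCK_BYTES, "little")


def unpack_set(block):
    """Inverse of pack_set: return all 11 (tag, pattern) slots."""
    word = int.from_bytes(block, "little")
    slots = []
    for k in range(WAYS):
        entry = (word >> (k * ENTRY_BITS)) & ((1 << ENTRY_BITS) - 1)
        slots.append((entry & ((1 << TAG_BITS) - 1), entry >> TAG_BITS))
    return slots


block = pack_set([(0x5A3, 0xC0000D01), (0x001, 0xFFFFFFFF)])
```

Because a whole set fits in one block, a single memory request is enough to bring every way of that set into the PVCache.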
![Page 33: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/33.jpg)
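Memory Address Calculation
[Figure: the 16 PC bits and the 5 offset bits form the index. 10 bits of it select the table set and the remaining bits are the tag. The 10-bit set number, followed by the block offset 000000, is added to the PV Start Address to give the memory address.]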
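A minimal sketch of this address calculation is shown below. The widths (16 PC bits, 5 offset bits, 10 set-index bits, six zero block-offset bits) are taken from the slides; which bits of the 21-bit index select the set is not stated, so the choice of the low 10 bits, the concatenation order, and the function name are assumptions.

```python
PC_BITS, OFFSET_BITS = 16, 5
SET_INDEX_BITS = 10
BLOCK_OFFSET_BITS = 6   # 64-byte blocks -> six zero bits ("000000")


def pv_address(pv_start, pc, offset):
    """Return (memory address of the table set, 11-bit tag).

    Assumptions: the 21-bit index is the 16 PC bits followed by the 5-bit
    offset, and its low 10 bits select the set. The slide does not say
    which bits are used for what.
    """
    index = ((pc & ((1 << PC_BITS) - 1)) << OFFSET_BITS) | (
        offset & ((1 << OFFSET_BITS) - 1)
    )
    set_index = index & ((1 << SET_INDEX_BITS) - 1)
    tag = index >> SET_INDEX_BITS
    return pv_start + (set_index << BLOCK_OFFSET_BITS), tag


# Hypothetical PV start address and trigger access.
addr, tag = pv_address(pv_start=0x4000_0000, pc=0x1234, offset=3)
```

The returned address is always block aligned, so the request fetches exactly one packed table set; the tag is then compared against the 11 tags stored in that block.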
![Page 34: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/34.jpg)
Increase in Off-Chip Bandwidth – different L2 sizes
[Figure: off-chip bandwidth increase (axis 0% to 45%) with 2MB, 4MB, and 8MB L2 caches for Apache, Zeus, DB2, Oracle, Qry1, Qry2, Qry16, and Qry17, stacked as L2 Misses and Write-backs.]
![Page 35: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/35.jpg)
Increased L2 Latency
[Figure: speedup (axis 0% to 60%). Series: SMS - 1K and SMS - PV8.]
![Page 36: Ioana Burcea * Stephen Somogyi §, Andreas Moshovos*, Babak Falsafi § # Predictor Virtualization *University of Toronto Canada § Carnegie Mellon University.](https://reader030.fdocuments.net/reader030/viewer/2022032607/56649ec65503460f94bd1046/html5/thumbnails/36.jpg)
Conclusions
PV: metadata stored in the conventional cache hierarchy
Benefits
Fewer dedicated resources
Emulate larger tables → increased accuracy
Example: Virtualized Data Prefetcher
Performance: within 1% on average
Space: 60KB down to < 1KB
Why now?
Large caches / CMPs / need for larger predictors
Will this work?
Metadata locality → intrinsically exploited by caches
Metadata access pattern predictability
Opportunities
Metadata sharing and persistence
Application-directed prediction
Predictor adaptation