Towards Thousand-Core RISC-V Shared Memory Systems
Quan Nguyen (qmn@mit.edu), Massachusetts Institute of Technology
30 November 2016
Outline
• The Tardis cache coherence protocol
  – Example
  – Scalability advantages
• Thousand-core prototype
• RISC-V and Tardis
Tardis
• Scalable cache coherence protocol
  – N-core system: O(log N) storage
• Enforces consistency through timestamps
• Key idea: logical leases
  – Can read if have valid lease
  – Can write if lease expires

Xiangyao Yu and Srinivas Devadas, “Tardis: Time Traveling Coherence Algorithm for Distributed Shared Memory”, PACT 2015.
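The two lease rules above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names `can_read` and `can_write_at` are hypothetical, timestamps are modeled as plain integers, and the lease end of 11 is an assumed value.

```python
def can_read(pts: int, rts: int) -> bool:
    """A read lease is valid while the core's logical time has not passed rts."""
    return pts <= rts


def can_write_at(ts: int, rts: int) -> bool:
    """A write may take effect only at a logical time after the lease ends."""
    return ts > rts


# A lease that ends at logical time 11 (assumed value for illustration).
print(can_read(pts=1, rts=11))      # True: the lease is still valid
print(can_write_at(ts=11, rts=11))  # False: the lease has not expired yet
print(can_write_at(ts=12, rts=11))  # True: logical time 12 is past the lease
```

Note that "expires" here refers to logical time, so a writer does not have to wait in physical time; it only has to pick a timestamp beyond the lease.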
Block diagram
[Figure: two cores, each with a data cache (D$); network-on-chip; last-level cache; main memory]
New state
• Per cache line:
  – Client: tag, state, wts, rts
  – Manager: tag, state, owner, wts, rts
• Per core: pts
(wts = write timestamp, rts = read timestamp, pts = program timestamp)
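As a rough picture of this state, the fields can be written down as three small records. The class names, the use of `None` for unset fields, and the string encoding of the I/S/M states are assumptions made for this sketch; the slides only name the fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientLine:
    """Per-line state in a core's private cache (client)."""
    tag: int
    state: str = "I"           # "I", "S", or "M", as used in the example slides
    wts: Optional[int] = None  # write timestamp; unset while the line is invalid
    rts: Optional[int] = None  # read timestamp (end of the lease)


@dataclass
class ManagerLine:
    """Per-line state at the manager."""
    tag: int
    state: str = "I"
    owner: Optional[int] = None  # a single core id, not a list of sharers
    wts: int = 0
    rts: int = 0


@dataclass
class Core:
    """Per-core state: one program timestamp."""
    pts: int = 0


# Initial state for line A, matching the start of the example.
core0 = Core()
a_at_manager = ManagerLine(tag=0xA)
print(core0, a_at_manager)
```

The manager record holds one `owner` field rather than a per-core sharer list, which is where the O(log N) storage figure comes from.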
Example
Core 0: store A; load B; load B
Core 1: store B; load A
• Core 0 stores A
• Manager leases A
  – Sets owner, wts, rts
• Core 0:
  – Gets A with [wts, rts] = [0, 0]
  – Writes A’, creates new version at [1, 1]
  – Updates its pts to 1
State (before → after):
Core 0 pts: 0 → 1
  A state: I → A state: M wts: 0 rts: 0 → A’ state: M wts: 1 rts: 1
  B state: I
Core 1 pts: 0
  A state: I
  B state: I
Manager
  A state: I wts: 0 rts: 0 → A state: M owner: 0 wts: 0 rts: 0
  B state: I wts: 0 rts: 0
Example
Core 0: store A; load B; load B
Core 1: store B; load A
• Core 0 loads B
  – Sends pts to manager
• Manager leases B
  – Sets lease based on pts
  – [wts, rts] = [1, 11]
• Core 0 reads B at pts 1
State (before → after):
Core 0 pts: 1
  A’ state: M wts: 1 rts: 1
  B state: I → B state: S wts: 1 rts: 11
Core 1 pts: 0
  A state: I
  B state: I
Manager
  A state: M owner: 0 wts: 0 rts: 0
  B state: I wts: 0 rts: 0 → B state: S wts: 1 rts: 11
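The lease numbers in this step can be reproduced with a short sketch. It assumes a fixed lease length of 10 (inferred from the interval [1, 11]; the slides do not state it) and that the lease starts at the requester's pts, as the slide's numbers suggest. `LEASE` and `grant_read_lease` are hypothetical names, and keeping the lease end from moving backwards is an added assumption.

```python
from typing import Tuple

LEASE = 10  # assumed lease length, inferred from the example


def grant_read_lease(pts: int, rts: int, lease: int = LEASE) -> Tuple[int, int]:
    """Return the [wts, rts] interval handed to a reader whose timestamp is pts.

    Assumption: the end of the lease never moves backwards, so it is at
    least the line's current rts.
    """
    new_rts = max(rts, pts + lease)
    return pts, new_rts


# Core 0 (pts 1) asks for B, whose rts at the manager is 0.
print(grant_read_lease(pts=1, rts=0))  # (1, 11)
```

The same rule with pts 12 yields (12, 22), the interval that appears later in the example when Core 1 loads A.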
Example
Core 0: store A; load B; load B
Core 1: store B; load A
• Core 1 stores B
  – Sends pts to manager
• Instantly grant Core 1 exclusive ownership
• Core 1 writes B’ at pts 12
• Different versions of B coexist!
State (before → after):
Core 0 pts: 1
  A’ state: M wts: 1 rts: 1
  B state: S wts: 1 rts: 11
Core 1 pts: 0 → 12
  A state: I
  B state: I → B’ state: M wts: 12 rts: 12
Manager
  A state: M owner: 0 wts: 0 rts: 0
  B state: S wts: 1 rts: 11 → B state: M owner: 1 wts: 1 rts: 11
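The jump of Core 1's pts from 0 to 12 follows from writing after the outstanding lease ends. A minimal sketch, assuming the new version is placed one logical tick past the lease's rts (consistent with the 11 to 12 step here and the 0 to 1 step in the first store); `store_timestamp` is a hypothetical helper, not a name from the slides.

```python
def store_timestamp(pts: int, rts: int) -> int:
    """Logical time for a new version: not before the writer's pts, and past rts."""
    return max(pts, rts + 1)


# Core 1 (pts 0) writes B while a lease on B runs until rts 11.
ts = store_timestamp(pts=0, rts=11)
print(ts)  # 12: B' gets [wts, rts] = [12, 12] and Core 1's pts becomes 12
```

Because the write lands at logical time 12, Core 0's copy of B under lease [1, 11] remains readable at pts 1, which is how the two versions coexist without an invalidation.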
Example
Core 0: store A; load B; load B
Core 1: store B; load A
• Core 1 loads A
• Manager sends Core 0 writeback request
• Core 0 downgrades
• Core 1 receives new lease based on its pts
  – [wts, rts] = [12, 22]
• Core 1 reads A’ at pts 12
State (before → after):
Core 0 pts: 1
  A’ state: M → S wts: 1 rts: 1
  B state: S wts: 1 rts: 11
Core 1 pts: 12
  A state: I → A’ state: S wts: 12 rts: 22
  B’ state: M wts: 12 rts: 12
Manager
  A state: M owner: 0 wts: 0 rts: 0 → A’ state: S wts: 1 rts: 1 → A’ state: S wts: 12 rts: 22
  B state: M owner: 1 wts: 12 rts: 12
Example
• Core 0 loads B
• Cache hit; simply loads B from data cache
• Sequential order ≠ physical order
Core 0: Core 1: store A load B store B load A load B
Core 0 pts: 1
  A’ state: S wts: 1 rts: 1
  B state: S wts: 1 rts: 11
Core 1 pts: 12
  A’ state: S wts: 12 rts: 22
  B’ state: M wts: 12 rts: 12
Manager
  A’ state: S owner: wts: 12 rts: 22
  B state: M owner: 1 wts: 12 rts: 12
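The cache hit on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the usual Tardis lease rule (a shared line may be read while the core's pts is no later than the line's rts, and a load moves pts up to at least wts). The names `Line` and `try_load` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Line:
    state: str  # "S" (shared) or "M" (modified/exclusive)
    wts: int    # logical time at which this version was written
    rts: int    # end of the read lease for this version


def try_load(pts: int, line: Line):
    """Return (hit, new_pts) for a load against a locally cached line.

    Assumed rule: an exclusive line always hits; a shared line hits
    while pts <= rts. On a hit, pts advances to at least wts.
    """
    if line.state == "M" or (line.state == "S" and pts <= line.rts):
        return True, max(pts, line.wts)
    # Lease expired: a real cache would ask the manager to renew it.
    return False, pts


# Values taken from the slide: Core 0 has pts 1 and holds B with wts 1, rts 11.
core0_b = Line(state="S", wts=1, rts=11)
hit, core0_pts = try_load(1, core0_b)
print(hit, core0_pts)

# A core whose pts had already reached 12 could not use this copy.
late_hit, _ = try_load(12, core0_b)
print(late_hit)
```

Core 1 has already written B’ at timestamp 12 in physical time, yet Core 0's load is ordered at logical time 1, before that store. That is the sense in which sequential order differs from physical order here.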
![Page 64: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/64.jpg)
A case for scalability
• Track only one node: O(log N) storage
• No broadcast invalidations
• Timestamps not tied to core count
– Can be compressed
– No need for synchronized real-time clocks
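To make the O(log N) storage claim concrete, the sketch below compares the per-line sharer-tracking bits of a full-map directory (one bit per core, a common baseline that the slides do not name) with the bits needed to record a single owner ID. The function names are hypothetical, and timestamp storage is left out because the slide states it is not tied to core count.

```python
def full_map_bits(n_cores: int) -> int:
    """Full-map directory: one presence bit per core, per cache line."""
    return n_cores


def owner_only_bits(n_cores: int) -> int:
    """Single owner ID: ceil(log2(N)) bits, computed without floats."""
    return max(1, (n_cores - 1).bit_length())


for n in (16, 64, 256, 1024):
    print(f"{n:5d} cores: full map {full_map_bits(n):5d} bits, "
          f"owner only {owner_only_bits(n):2d} bits")
```

At 1024 cores the owner field needs 10 bits per line against 1024 for the full map.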
![Page 65: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/65.jpg)
Outline
• The Tardis cache coherence protocol
• Thousand-core prototype
• RISC-V and Tardis
![Page 73: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/73.jpg)
Thousand-core shared memory systems
• Fit as many cores as will fit on a ZC706
• Connect in a 3D mesh
– Aurora links, six connectors per board
• Demonstrate shared memory at scale
• Name: T-1000
Image credit: Terminator 2: Judgment Day (1991), Carolco Pictures, Lightstorm Entertainment, Le Studio Canal+, and TriStar Pictures
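Six connectors per board matches the six directions of a 3D mesh (±x, ±y, ±z). As a small illustration, the sketch below lists a board's neighbours in a mesh without wraparound. The mesh dimensions and the function name are assumptions made for the example, not figures from the talk.

```python
def mesh_neighbors(coord, dims):
    """Return the coordinates directly linked to `coord` in a 3D mesh.

    `dims` is the (hypothetical) mesh size along x, y and z. Boards on a
    face, edge or corner have fewer than six neighbours because the mesh
    does not wrap around.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            candidate = list(coord)
            candidate[axis] += step
            if 0 <= candidate[axis] < dims[axis]:
                neighbors.append(tuple(candidate))
    return neighbors


dims = (3, 3, 3)  # assumed example size
print(len(mesh_neighbors((1, 1, 1), dims)))  # interior board
print(len(mesh_neighbors((0, 0, 0), dims)))  # corner board
```

An interior board uses all six links, while a corner board uses three.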
![Page 74: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/74.jpg)
Outline
• The Tardis cache coherence protocol
• Thousand-core prototype
• RISC-V and Tardis
![Page 78: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/78.jpg)
Tardis and RISC-V
• RISC-V: clean, extensible, orthogonal, free
• Chisel simplifies extending hardware
• Things to consider:
– Release consistency
– Atomic instructions
– Synchronization (see S.M. thesis)
Quan Nguyen, “Synchronization in Timestamp-Based Cache Coherence Protocols”, S.M. thesis, MIT, 2016.
![Page 85: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/85.jpg)
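Consistency model comparison

| Type | Ordering rule | Tardis rule |
| --- | --- | --- |
| SC | $X <_p Y \Rightarrow X <_s Y$ | $X <_p Y \Rightarrow X <_{ts} Y$ |
| RC: ordinary memory ops | respect dependencies | respect dependencies |
| RC: acquires | $acq <_p X \Rightarrow acq <_s X$ | $acq <_p X \Rightarrow acq <_{ts} X$ |
| RC: releases | $X <_p rel \Rightarrow X <_s rel$ | $X <_p rel \Rightarrow X <_{ts} rel$ |
| RC: sync, $S \in \{acq, rel\}$ | $S_X <_p S_Y \Rightarrow S_X <_s S_Y$ | $S_X <_p S_Y \Rightarrow S_X <_{ts} S_Y$ |

$<_p$: program order; $<_s$: global memory order; $<_{ts}$: timestamp order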
![Page 92: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/92.jpg)
Release consistency and Tardis
• $ts_{min}$: minimum timestamp for future ops
• $ts_{max}$: maximal timestamp of preceding ops (in timestamp order)
• Fences: $ts_{min} \leftarrow ts_{max}$
• Track acquires/releases with $ts_{rel}$
– Release: $ts_{rel} \leftarrow ts_{max}$
– Acquire: $ts_{min} \leftarrow ts_{rel}$
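One way to read these update rules is as three per-core registers. The sketch below is a simplified model under stated assumptions, not the thesis implementation: the class name, the `memory_op` helper, and the way an ordinary operation picks its timestamp (the larger of ts_min and a timestamp demanded by the cache line) are all hypothetical.

```python
class RCTimestamps:
    """Toy model of ts_min / ts_max / ts_rel for one core."""

    def __init__(self):
        self.ts_min = 0  # minimum timestamp for future ops
        self.ts_max = 0  # maximal timestamp of preceding ops
        self.ts_rel = 0  # timestamp recorded by the last release

    def memory_op(self, line_ts: int = 0) -> int:
        # Assumption: an ordinary op is stamped no earlier than ts_min
        # and no earlier than whatever the cache line requires.
        ts = max(self.ts_min, line_ts)
        self.ts_max = max(self.ts_max, ts)
        return ts

    def fence(self) -> None:
        self.ts_min = self.ts_max            # Fences: ts_min <- ts_max

    def release(self) -> None:
        self.ts_rel = self.ts_max            # Release: ts_rel <- ts_max

    def acquire(self) -> None:
        # Acquire: ts_min <- ts_rel. The max() guard keeping ts_min
        # monotonic is an assumption; the slide shows a plain assignment.
        self.ts_min = max(self.ts_min, self.ts_rel)


core = RCTimestamps()
first = core.memory_op(line_ts=5)    # stamped 5
second = core.memory_op(line_ts=3)   # stamped 3: later in program order, earlier in timestamp order
core.fence()
third = core.memory_op(line_ts=3)    # the fence forces this op to 5 or later
print(first, second, third)
```

Before the fence, the second operation is free to take an earlier timestamp than the first, which is the relaxation release consistency permits for ordinary memory operations. After the fence, no later operation can be stamped before anything that preceded it.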
![Page 96: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/96.jpg)
Load-reserved and store-conditional
• Tardis gives neat solution to LR/SC livelock
• wts tracks cache line version
• SC success condition: $wts_{lr} = wts_{\text{before sc}}$
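The success condition can be sketched as a version check: remember the line's wts at the load-reserved, and let the store-conditional succeed only if the line still carries that wts. The class and method names below are hypothetical, and the model ignores ownership transfers, which is the point of the technique since they do not change wts.

```python
class Reservation:
    """Toy per-core LR/SC reservation keyed on the line's write timestamp."""

    def __init__(self):
        self.wts_lr = None  # wts observed by the most recent lr, if any

    def load_reserved(self, line_wts: int) -> None:
        self.wts_lr = line_wts

    def store_conditional(self, line_wts_now: int) -> bool:
        # Succeeds only if nobody wrote the line since the lr.
        ok = self.wts_lr is not None and self.wts_lr == line_wts_now
        self.wts_lr = None  # the reservation is consumed either way
        return ok


line_wts = 0
core0, core1 = Reservation(), Reservation()

core0.load_reserved(line_wts)   # core 0: lr sees wts 0
core1.load_reserved(line_wts)   # core 1: lr takes ownership, wts still 0

core0_ok = core0.store_conditional(line_wts)  # wts unchanged, so sc succeeds
if core0_ok:
    line_wts = 1                # core 0 writes the new version at timestamp 1

core1_ok = core1.store_conditional(line_wts)  # wts is now 1, so core 1 must retry
print(core0_ok, core1_ok)
```

Losing ownership between lr and sc does not by itself fail the sc, because only a write changes wts. The final step, where core 1's sc fails, extrapolates from the stated condition rather than from anything shown in the talk.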
![Page 115: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/115.jpg)
LR/SC example
• Core 0 performs lr on C
– exclusive ownership
– $wts_{lr} = 0$
• Core 1 performs lr on C
– core 0 downgraded
• Core 0 performs sc
– core 1 downgraded
– succeeds; $wts_{lr} = wts_{C}$
– writes C’ at pts 1

    loop: lr.d x1, 0(C)
          <do stuff to x1>
          sc.d x2, x1, 0(C)
          bnez x2, loop

State as the steps proceed (initial → final):
Core 0 pts: 0 → 1
  C state: I wts: 0 rts: 0 → C state: M wts: 0 rts: 0 → C state: S wts: 0 rts: 0 → C’ state: M wts: 1 rts: 1
Core 1 pts: 0
  C state: I wts: 0 rts: 0 → C state: M wts: 0 rts: 0 → C state: S wts: 0 rts: 0
Manager
  C state: I owner: wts: 0 rts: 0 → C state: M owner: 0 wts: 0 rts: 0 → C state: M owner: 1 wts: 0 rts: 0 → C state: M owner: 0 wts: 0 rts: 0
![Page 120: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/120.jpg)
Block diagram
[Figure: block diagram showing two Rocket Cores, each with a HellaCache D$, a TileLink NoC, a last-level cache, and main memory. Annotations on the diagram: timestamps; metadata, hit/miss logic; message timestamps; new coherence logic.]
![Page 121: Towards Thousand-Core RISC-V Shared Memory Systems€¦ · Quan Nguyen (qmn@mit.edu) Massachusetts Institute of Technology 30 November 2016. Outline • The Tardis cache coherence](https://reader035.fdocuments.net/reader035/viewer/2022081409/607b45973cfae208cc299a17/html5/thumbnails/121.jpg)
Thanks!
• Special thanks to Xiangyao Yu and Srini Devadas for their extensive advice and input