![Page 1: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/1.jpg)
Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems
P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel
The Winter Usenix Conference, 1994
2008-22952 Jun Lee
![Page 2: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/2.jpg)
DSM (distributed shared memory)
A software system for parallel computation
• Shares distributed memories
• Easier programming
  − Provide a single global address space
![Page 3: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/3.jpg)
DSM (distributed shared memory)
No widely available DSM implementations
• In-house research platforms
• Kernel modifications
• Poor performance
  − Imitating consistency protocols of hardware
  − False sharing
![Page 4: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/4.jpg)
Treadmarks
Objectives
• Commercially available workstations and OS
  − Standard Unix system on DECstation
• Efficient user-level DSM implementation
  − Reduce communication overhead
Design
• LRC (lazy release consistency)
• Multiple writer protocol
• Lazy diff creation
![Page 5: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/5.jpg)
Consistency protocol (SC)
Sequential Consistency
• Every write visible “immediately”
• Single writer
[Figure: timeline of P0 and P1 with accesses R(a):0, W(a):1, R(a):1]
![Page 6: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/6.jpg)
Consistency protocol (SC)
Sequential Consistency
• Every write visible “immediately”
• Single writer
[Figure: timeline of P0 and P1 with accesses R(a):0, W(a):1, R(a):?, R(a):1]
Big problem with page size granularity
![Page 7: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/7.jpg)
Consistency protocol (SC)
Sequential Consistency
• Every write visible “immediately”
• Single writer
[Figure: P0 and P1 with Page X; the writes W(x0):a, W(x1):b, W(x2):c, W(x3):d all fall on Page X]
False sharing
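The cost of false sharing under a single-writer protocol can be illustrated with a minimal Python sketch. This is not TreadMarks code: the function name, the assignment of writes to P0 and P1, and the cost model (one whole-page transfer per change of owner) are assumptions made for illustration.

```python
def count_page_transfers(writes, initial_owner):
    """Count ownership transfers of one page under a single-writer protocol.

    writes: list of (processor, word_index) in execution order.
    Assumption: a processor must own the page exclusively before writing,
    and every change of owner costs one whole-page transfer.
    """
    owner = initial_owner
    transfers = 0
    for proc, _word in writes:
        if proc != owner:
            transfers += 1  # page is invalidated at the old owner and moved
            owner = proc
    return transfers


# Assumed assignment: P0 writes x0 and x2, P1 writes x1 and x3.
# No word is shared, yet the interleaving moves the page back and forth.
interleaved = [("P0", 0), ("P1", 1), ("P0", 2), ("P1", 3)]
print(count_page_transfers(interleaved, "P0"))  # 3
```

With the same four writes issued by a single processor the count is zero, which is why the problem is attributed to page granularity rather than to the data itself.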
![Page 8: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/8.jpg)
Consistency protocol (RC)
Release Consistency
• Relaxed memory consistency model
  − A processor delays making its changes visible to other processors until certain synchronization accesses occur
• Synchronization points
  − Acquire(), Release() (similar to locks, barriers)
• Two types
  − ERC (eager), LRC (lazy)
![Page 9: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/9.jpg)
Consistency protocol (RC)
Release Consistency
• Acquire() and release() are sequentially consistent
  − Release() is performed after all previous operations have completed
  − Operations are performed after all previous acquire() operations have been performed
• If an acquire() and release() pair separates every pair of conflicting accesses
  − SC and RC produce the same results.
![Page 10: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/10.jpg)
Consistency protocol (RC)
ERC
• Write information is delivered at the release point
[Figure: timeline of P0 and P1 with Acquire(L), R(a):0, W(a):1, Release(L), Acquire(L), R(a):?, Release(L), R(a):1; an arrow is labeled Write Notice]
![Page 11: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/11.jpg)
Consistency protocol (RC)
ERC
• Write information is delivered at the release point
[Figure: timeline of P0 and P1 with Acquire(L), R(a):0, W(a):1, Release(L), Acquire(L), R(a):1, Release(L), Acquire(K), Release(K)]
![Page 12: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/12.jpg)
Consistency protocol (RC)
LRC
• The delivery is postponed until the acquire
• Fewer messages than ERC
[Figure: timeline of P0 and P1 with Acquire(L), R(a):0, W(a):1, Release(L), Acquire(L), R(a):?, Release(L), R(a):1]
![Page 13: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/13.jpg)
Consistency protocol (RC)
ERC vs. LRC
[Figure, ERC: timelines of P0, P1, P2, and P3 with Acquire(L), R(a):0, W(a):1, Release(L), R(a):0 on P2 and P3, Acquire(L), Release(L), R(a):1]
![Page 14: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/14.jpg)
Consistency protocol (RC)
ERC vs. LRC
[Figure, ERC: timelines of P0, P1, P2, and P3 with Acquire(L), R(a):0, W(a):1, Release(L), R(a):0 on P2 and P3, Acquire(L), Release(L), R(a):1]
[Figure, LRC: the same timelines and accesses under LRC]
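The difference in message traffic between the two variants can be sketched with a small Python simulation. It is a simplified model, not the protocol from the paper: the event format, the function name `simulate`, and the counting rules (one message per processor notified at an eager release, one lock-grant message per lock handoff, write notices piggybacked on the grant in the lazy case) are assumptions for illustration.

```python
def simulate(protocol, n_procs, events):
    """Return (message_count, write notices known to each processor)."""
    known = [set() for _ in range(n_procs)]  # notices each processor has seen
    dirty = [set() for _ in range(n_procs)]  # writes since its last release
    last_release = {}                        # lock -> (releaser, notices at release)
    messages = 0
    for op, proc, arg in events:
        if op == "write":
            known[proc].add(arg)
            dirty[proc].add(arg)
        elif op == "release":
            if protocol == "ERC" and dirty[proc]:
                # Eager: push the new write notices to every other processor.
                for other in range(n_procs):
                    if other != proc:
                        known[other] |= dirty[proc]
                        messages += 1
            dirty[proc].clear()
            last_release[arg] = (proc, set(known[proc]))
        elif op == "acquire" and arg in last_release:
            prev, snapshot = last_release[arg]
            if prev != proc:
                messages += 1                # lock grant from the last releaser
                if protocol == "LRC":
                    known[proc] |= snapshot  # lazy: notices ride on the grant
    return messages, known


# Four processors; P0 writes a under lock L, then only P1 acquires L.
events = [
    ("acquire", 0, "L"), ("write", 0, "a"), ("release", 0, "L"),
    ("acquire", 1, "L"), ("release", 1, "L"),
]
print(simulate("ERC", 4, events)[0])  # 4
print(simulate("LRC", 4, events)[0])  # 1
```

In this model P2 and P3 are notified under ERC even though they never acquire L, while under LRC only P1 learns about the write.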
![Page 15: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/15.jpg)
Multiple writer protocol
[Figure: P0 and P1 both write to Page X (W(x0):a, W(x1):b, W(x2):c, W(x3):?), synchronizing with Acquire(L) and Release(L); the values a and c are passed at the acquire]
[Inset: the false sharing example, with W(x0):a, W(x1):b, W(x2):c, W(x3):d and the page contents a b c d]
![Page 16: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/16.jpg)
Multiple writer protocol
[Figure: P0 and P1 both write to Page X (W(x0):a, W(x1):b, W(x2):c), synchronizing with Acquire(L) and Release(L); the values a and c are passed at the acquire, after which W(x3):d completes the page contents a b c d]
[Inset: the false sharing example, with W(x0):a, W(x1):b, W(x2):c, W(x3):d and the page contents a b c d]
![Page 17: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/17.jpg)
Twin and Diff
[Figure: the same Page X example with Acquire(L), Release(L), Acquire(L), Release(L), and W(x3):d; Twin X is compared with Page X to produce a diff containing a and c]
![Page 18: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/18.jpg)
Twin and Diff
[Figure: the same Page X example with Acquire(L), Release(L), Acquire(L), Release(L), and W(x3):d; Twin X is compared with Page X to produce a diff containing a and c and a diff containing b; the diffs are associated with an interval]
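The twin and diff mechanism can be sketched in a few lines of Python. This is an illustrative model, not TreadMarks code: a page is represented as a list of four words, the helper names are hypothetical, and the assignment of writes to P0 and P1 is an assumed reading of the figure.

```python
def make_twin(page):
    """Copy the page before its first write so changes can be found later."""
    return list(page)


def make_diff(twin, page):
    """Record (offset, new_value) for every word that differs from the twin."""
    return [(i, new) for i, (old, new) in enumerate(zip(twin, page)) if old != new]


def apply_diff(page, diff):
    """Patch only the words named in the diff, leaving other words untouched."""
    for offset, value in diff:
        page[offset] = value


# Page X has four words, x0..x3, initially empty on both processors.
p0_page = [None] * 4
p1_page = [None] * 4

p0_twin = make_twin(p0_page)
p0_page[0], p0_page[2] = "a", "c"      # P0: W(x0):a, W(x2):c

p1_twin = make_twin(p1_page)
p1_page[1] = "b"                       # P1: W(x1):b, concurrently with P0

p0_diff = make_diff(p0_twin, p0_page)  # [(0, 'a'), (2, 'c')]
apply_diff(p1_page, p0_diff)           # delivered when P1 acquires the lock
p1_page[3] = "d"                       # P1: W(x3):d
print(p1_page)                         # ['a', 'b', 'c', 'd']
```

Because a diff names only the modified words, P1's own write to x1 survives the merge, which is what allows two writers to modify the same page at once.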
![Page 19: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/19.jpg)
Implementation
![Page 20: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/20.jpg)
Etc.
Lock & barrier
• Statically assigned manager
Garbage collection
• Reclaim the space used by write notice records, interval records, and diffs
• Triggered when the free space drops below a threshold
![Page 21: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/21.jpg)
Evaluation
Experimental Environment
• 8 DECstation-5000/240
• Connected to a 100-Mbps ATM LAN and a 10-Mbps Ethernet
Applications
• Water – molecular dynamics simulation
• Jacobi – Successive Over-Relaxation
• TSP – branch & bound algorithm to solve the traveling salesman problem
• Quicksort – using bubblesort to sort subarrays of fewer than 1K elements
• ILINK – genetic linkage analysis
![Page 22: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/22.jpg)
Evaluation
[Figures: Speedup; Execution statistics]
![Page 23: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/23.jpg)
Evaluation
[Figure: Execution time breakdown]
![Page 24: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/24.jpg)
Evaluation
[Figures: Unix overhead breakdown; TreadMarks overhead breakdown]
![Page 25: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/25.jpg)
Evaluation
[Figure: Execution time breakdown for Water]
![Page 26: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/26.jpg)
Evaluation
ERC vs. LRC
[Figures: Speedup; Message rate]
![Page 27: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/27.jpg)
Evaluation
ERC vs. LRC
[Figures: Data rate; Diff creation rate]
![Page 28: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/28.jpg)
Implementation
[Figure: P0 side and P1 side, each with its pages and time stamp 0 0 0; event shown: Acq(L)]
![Page 29: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/29.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 1 0 0 and 0 0 0; events shown: Acq(L), W(a), Rel(L), W(b); twins of the pages holding a and b, and write notices (W.N) for P0 and P1]
![Page 30: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/30.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 2 0 0 and 0 0 0; events shown: Acq(L), W(a), Rel(L), Acq(L), W(b); twins of the pages holding a and b, and write notices (W.N) for P0 and P1]
![Page 31: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/31.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 2 0 0 and 0 0 0; events shown: Acq(L), W(a), Rel(L), Acq(L), W(b); twins, write notices (W.N) for P0 and P1, and a time stamp 1 0 0 labeled P0]
![Page 32: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/32.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 2 0 0 and 1 1 0; events shown: Acq(L), W(a), Rel(L), Acq(L), W(c), W(b); twins, write notices (W.N) for P0 and P1, diffs for a and b, and a time stamp 1 0 0]
![Page 33: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/33.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 2 0 0 and 1 1 0; events shown: Acq(L), W(a), Rel(L), Acq(L), W(c), Rel(L), W(b); a twin, write notices (W.N) for P0 and P1, diffs for a and b, and a time stamp 1 0 0]
![Page 34: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/34.jpg)
Implementation
[Figure: P0 side and P1 side with pages and time stamps 2 0 0 and 1 2 0; events shown: Acq(L), W(a), Rel(L), Acq(L), W(c), Rel(L), W(b); a twin, write notices (W.N) for P0 and P1, diffs for a and b, and a time stamp 1 0 0]
![Page 35: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/35.jpg)
Implementation
[Figure: P1 side and P2 side with pages and time stamps 1 2 0 and 0 0 0; event shown: Acq(L); P1 holds a twin, write notices (W.N) for P0 and P1, and diffs for a and b; the time stamps 0 0 0 and 1 1 0 appear on the exchange between the two sides]
![Page 36: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/36.jpg)
Implementation
[Figure: P1 side and P2 side with pages and time stamps 1 2 0 and 1 1 1; event shown: Acq(L); P1 holds a twin, write notices (W.N) for P0 and P1, and diffs for a and b; the time stamps 0 0 0 and 1 1 0 appear on the exchange, and P2 now holds write notices for P0, P1, and P1]
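The time stamps in these figures can be read as vector timestamps with one entry per processor. The Python sketch below shows one plausible reading of the last acquire: P2 presents 0 0 0, receives 1 1 0 from the releaser, and ends with 1 1 1. The function names and the exact rules (element-wise maximum followed by incrementing the acquirer's own entry to start a new interval) are assumptions for illustration, not details confirmed by the slides.

```python
def missing_intervals(acquirer_vt, releaser_vt):
    """Intervals (processor, index) covered by the releaser but not the acquirer."""
    return [
        (proc, idx)
        for proc, (have, seen) in enumerate(zip(acquirer_vt, releaser_vt))
        for idx in range(have + 1, seen + 1)
    ]


def acquire(acquirer_id, acquirer_vt, releaser_vt):
    """Return (intervals whose write notices must be sent, acquirer's new vector)."""
    needed = missing_intervals(acquirer_vt, releaser_vt)
    new_vt = [max(a, b) for a, b in zip(acquirer_vt, releaser_vt)]
    new_vt[acquirer_id] += 1  # assumed: the acquire starts a new local interval
    return needed, new_vt


# P2 (index 2) acquires the lock; the releaser's vector at release time is 1 1 0.
needed, p2_vt = acquire(2, [0, 0, 0], [1, 1, 0])
print(needed)  # [(0, 1), (1, 1)]
print(p2_vt)   # [1, 1, 1]
```

Under this reading, the acquirer receives write notices only for intervals it has not yet seen, so a processor that is already up to date receives none.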
![Page 37: Treadmarks: Distributed Shared Memory on Standard Workstations and Operating Systems P. Keleher, A. Cox, S. Dwarkadas, and W. Zwaenepoel The Winter Usenix.](https://reader030.fdocuments.net/reader030/viewer/2022032722/56649f3e5503460f94c5f430/html5/thumbnails/37.jpg)
Thank you!
Any questions?