Work–Stealing by Stealing State from Live Stack Frames of a Running Application
Generously supported by IBM & the Australian Research Council
Stephen M. Blackburn, Australian National University
Vivek Kumar, Australian National University
Daniel Frampton, Australian National University
Olivier Tardieu, IBM T.J. Watson Research Center
David Grove, IBM T.J. Watson Research Center
Motivation
• Multicore era
  – Dynamic task parallelism
• Load balancing
  – Work–sharing
    • Central task queue
    • Scalability bottleneck with increase in threads
  – Work–stealing
    • Fixed number of threads
    • One task queue per thread
Kumar et al.
Work–Stealing
[Figure, animated over four slides: task deques with IN and OUT ends. Task A is added first, then task B; in the last step a second deque appears and one deque is labelled VICTIM.]
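The deque in the figure can be sketched in Java as follows. This is a minimal illustration, not the X10 runtime's data structure: the names `DequeSketch`, `WorkerDeque`, `push`, `pop` and `steal` are hypothetical, and which end the slide calls IN and which OUT is an assumption. The owner works at one end while a thief takes from the other.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

public class DequeSketch {
    /** One deque per worker thread (hypothetical helper, not an X10 class). */
    static final class WorkerDeque<T> {
        private final ConcurrentLinkedDeque<T> tasks = new ConcurrentLinkedDeque<>();

        /** The owner adds new work at its own end. */
        void push(T task) { tasks.addLast(task); }

        /** The owner takes back the most recently pushed task; null if empty. */
        T pop() { return tasks.pollLast(); }

        /** A thief takes the oldest task from the opposite end; null if empty. */
        T steal() { return tasks.pollFirst(); }
    }

    public static void main(String[] args) {
        WorkerDeque<String> victim = new WorkerDeque<>();
        victim.push("A");
        victim.push("B");
        String stolen = victim.steal(); // thief side: oldest task
        String local = victim.pop();    // owner side: newest task
        System.out.println("stolen=" + stolen + " local=" + local);
    }
}
```

Taking from opposite ends keeps the owner and the thief apart most of the time, which is one reason per-thread deques scale better than a central queue.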
Overheads
[Figure: tasks A to O in a deque with IN and OUT ends. After a steal, the VICTIM holds A to G and the THIEF holds H to O.]
• Coordination
• Enough context provided?
• Control Flow
Overheads
[Figure: X10 (2.0) Fib (40), single-thread work-stealing execution time normalized with sequential execution time. Y-axis: Time.]
• Control Flow
Overhead – Control Flow

    // X10 source
    public static def main() {
        val n = 20;
        val a:int;
        val b:int;
        finish {
            async { a = A(n); }
            b = B(n-1);
        }
        val result = a + b;
    }

    // Generated frame classes as shown on the slide (ff is not defined in the excerpt)
    final static class _$main extends MainFrame {
        public def fast(worker:Worker):void {
            this.n = 20;
            this._pc = 1;
            val tmp:_$mainF0 = new _$mainF0(this);
            tmp.fast(worker);
            this.result = this.a + this.b;
        }
        // resume() re-enters the method at the point recorded in _pc
        public def resume(worker:Worker):void {
            switch (this._pc) {
                case 1: this.result = this.a + this.b;
            }
        }
        public def back(worker:Worker, frame:Frame):void {}
    }

    final static class _$mainF0 extends FinishFrame {
        public def fast(worker:Worker):void {
            val tmp = new _$mainF0A0(ff,ff);
            tmp.fast(worker);
        }
        public def resume(worker:Worker):void {}
        public def back(worker:Worker, frame:Frame):void {}
    }

    final static class _$mainF0A0 extends RegularFrame {
        public def fast(worker:Worker):void {
            this._pc = 1;
            push(worker);
            val tmp:_$mainF0A0B0 = new _$mainF0A0B0(ff);
            tmp.fast(worker);
            _$main.b = B(_$main.n - 1);
        }
        public def resume(worker:Worker):void {
            switch (this._pc) {
                case 1: _$main.b = B(_$main.n - 1);
            }
        }
        public def back(worker:Worker, frame:Frame):void {}
    }

    final static class _$mainF0A0B0 extends AsyncFrame {
        public def fast(worker:Worker):void {
            _$main.a = A(_$main.n);
            poll(worker);
        }
        public def resume(worker:Worker):void {}
        public def back(worker:Worker, frame:Frame):void {}
    }

Steal Point
Overheads
• Control Flow
• Providing Contexts
Overhead – Providing Contexts
[Same X10 source and generated frame classes as on the "Overhead – Control Flow" slide. This slide appears to set apart the statements that store state in frame objects, such as this.n = 20; this._pc = 1; and the frame allocations new _$mainF0(this), new _$mainF0A0(ff,ff) and new _$mainF0A0B0(ff).]
Steal Point
Overheads
• Control Flow
• Providing Contexts
• Coordination Effort
Overhead – Coordination Effort
[Same X10 source and generated frame classes as on the "Overhead – Control Flow" slide. This slide appears to set apart the calls push(worker) and poll(worker).]
Steal Point
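The three overheads can be seen together in a small single-threaded Java sketch. It is an illustration under stated assumptions, not the X10 compiler's output: it collapses the slide's four frame classes into one hypothetical `MainFrame`, uses made-up bodies for `A` and `B`, and omits the finish join, so it is not safe to run with a real concurrent thief.

```java
import java.util.ArrayDeque;

public class FrameSketch {
    // Hypothetical stand-ins for the slide's A and B; their bodies are invented.
    static int A(int n) { return n + 1; }
    static int B(int n) { return n * 2; }

    /** Hypothetical worker holding a deque of pending continuations. */
    static final class Worker {
        final ArrayDeque<MainFrame> deque = new ArrayDeque<>();
    }

    static final class MainFrame {
        int n, a, b, result; // providing contexts: locals live in a heap object
        int pc;              // control flow: which continuation resume() should run

        /** Fast path, run by the worker that started the computation. */
        void fast(Worker w) {
            n = 20;
            pc = 1;
            w.deque.addLast(this);          // coordination: publish the continuation
            a = A(n);                       // the async body runs inline
            if (w.deque.pollLast() != this) // coordination: was the continuation taken?
                return;                     // yes: whoever took it calls resume()
            b = B(n - 1);
            result = a + b;
        }

        /** Re-entry point for whoever holds the frame after a steal. */
        void resume() {
            switch (pc) {
                case 1:
                    b = B(n - 1);
                    result = a + b;
                    break;
                default:
                    throw new IllegalStateException("unknown pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        MainFrame frame = new MainFrame();
        frame.fast(new Worker());
        System.out.println("result=" + frame.result);
    }
}
```

Every call pays for the heap fields, the `pc` bookkeeping and the deque traffic even when nothing is stolen, which is the cost the "Overheads" slides attribute to the default scheme.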
Steal Ratio
[Figure: ratio of total continuations stolen to total continuations produced in the X10 (2.0) Fibonacci benchmark. Y-axis: Steal Ratio, shown first on a linear scale and then on a log scale. Tick labels 0 to 0.9 and 1 to 8 and the labels 30, 35 and 40 survive from the plot.]
VM Supported Work–Stealing
• Our philosophy:
  – Small steal ratio
  – Saving contexts for every continuation is inefficient
  – Provide contexts only when a steal occurs
• Our approach:
  – Thief steals victim's Java stack frame
  – Thief forces the victim to yield to start the steal
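The "provide contexts only when a steal occurs" idea can be approximated in plain Java. Java code cannot read another thread's stack frames, so this sketch is only an analogy for the VM-level mechanism: the victim keeps its state in ordinary locals and copies it to the heap at a yield point only when a thief has asked. The names `LazyContextSketch`, `Context`, `stealRequested`, `yieldPoint` and `handedOver` are all hypothetical, and the victim still finishes the work itself; only the timing of the context creation is illustrated.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyContextSketch {
    /** Heap copy of a continuation's state, built only on request. */
    static final class Context {
        final int n, partial;
        Context(int n, int partial) { this.n = n; this.partial = partial; }
    }

    static final AtomicBoolean stealRequested = new AtomicBoolean(false);
    static final AtomicInteger contextsBuilt = new AtomicInteger();
    static volatile Context handedOver;

    /** Yield point: the state is copied to the heap only if a thief asked for it. */
    static void yieldPoint(int n, int partial) {
        if (stealRequested.compareAndSet(true, false)) {
            handedOver = new Context(n, partial);
            contextsBuilt.incrementAndGet();
        }
    }

    static int fib(int n) {
        if (n < 2) return n;
        int x = fib(n - 1); // state stays in ordinary locals
        yieldPoint(n, x);   // cheap flag check in the common case
        return x + fib(n - 2);
    }

    public static void main(String[] args) {
        fib(20);
        System.out.println("contexts built without a request: " + contextsBuilt.get());
        stealRequested.set(true); // a thief asks the victim to yield
        fib(20);
        System.out.println("contexts built after one request: " + contextsBuilt.get());
    }
}
```

With a small steal ratio the flag check almost always fails, so almost no contexts are allocated, in contrast to saving one for every continuation.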
VM Assistance
• Stealing the Stack Frames
[Figure, animated over 16 slides: the Java stacks of Thread-0 and Thread-1 (axis: Java stack growth) beside a deque of frame pointers FP_D, FP_G and FP_I. The stacks initially show run() with Fast_A() to Fast_E() and Fast_G(), and run() with Fast_H() and Fast_I(). The steal then adds Initiate(), Steal_FP() and Steal(); FP_D leaves the deque; a "New Stack" appears holding Fast_A to Fast_D; and States_A to States_D are shown. In later slides Fast_D is followed by Fast_J to Fast_M, the deque and the original Fast frames disappear, and Slow_D, Slow_C, Slow_B and Slow_A appear one at a time as States_D to States_A are removed. The last slide is labelled "Calculation Finished".]
Experimental Results – Execution Time
[Figure: execution time normalized with single-thread default work-stealing time, plotted against total threads for Default WS and VM Supported WS. Panels: a) Integrate (1000), b) Fib (40). Y-axis: Time.]
Experimental Results – Speedup
[Figure: speedup relative to single-thread default work-stealing time, plotted against total threads for Default WS and VM Supported WS. Panels: a) Integrate (1000), b) Fib (40). Y-axis: Speedup.]
Summary
• Multicore era – Dynamic task parallelism.
• Load balancing – Work–stealing schedulers
• Overheads = Control Flow + Providing Contexts + Coordination Effort
Future Work
• Test with high steal ratio benchmarks
• Research new VM extensions to make X10 run faster.
Questions?