
Lecture 6: Bulk Synchronous Parallel Homomorphisms

NII Lectures Series

Frédéric Loulergue¹

Université d'Orléans, LIFO, PaMDA Team

October-November 2013

¹ and slides by Julien Tesson and Joeffrey Legaux

F. Loulergue, SyDPaCC, Lecture 6, October-November 2013


Outline

1 An Introduction to Program Calculation

2 Program Calculation and Parallel Programming

3 Calculating BSP Programs

4 BH Skeletons

5 Summary


References

This lecture is based on the following papers:

[1] L. Gesbert, Z. Hu, F. Loulergue, K. Matsuzaki, and J. Tesson. Systematic Development of Correct Bulk Synchronous Parallel Programs. In International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), pages 334–340. IEEE, 2010. doi:10.1109/PDCAT.2010.86.

[2] J. Legaux, Z. Hu, F. Loulergue, K. Matsuzaki, and J. Tesson. Programming with BSP Homomorphisms. In F. Wolf, B. Mohr, and D. Ney, editors, Euro-Par 2013 Parallel Processing, number 8097 in LNCS, pages 446–457. Springer, 2013. doi:10.1007/978-3-642-40047-6_46.

[3] F. Loulergue, S. Robillard, J. Tesson, J. Legaux, and Z. Hu. Formal Derivation and Extraction of a Parallel Program for the All Nearest Smaller Values Problem. In ACM Symposium on Applied Computing (SAC), Gyeongju, Korea, 2014. ACM Press. To appear.

1 An Introduction to Program Calculation

Program calculation

Specification or naive implementation

  ↓ program transformation based on an equational theory

Efficient implementation


A very simple example

Specification (sort sorts in decreasing order):

maximum : [a] → a
maximum = hd ∘ sort

if law: ∀f, f (if a then b else c) = if a then f b else f c

Calculation (for a non-empty list x):

maximum (a :: x)
= { def. of maximum }
hd (sort (a :: x))
= { property of sort }
hd (if a > hd (sort x) then a :: sort x
    else hd (sort x) :: insert a (tail (sort x)))
= { by if law }
if a > hd (sort x) then hd (a :: sort x)
else hd (hd (sort x) :: insert a (tail (sort x)))
= { def. of hd }
if a > hd (sort x) then a else hd (sort x)
= { def. of maximum }
if a > maximum x then a else maximum x
= { define x ↑ y = if x > y then x else y }
a ↑ (maximum x)
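
The calculation can be replayed as a small OCaml sketch. The names maximum_spec, maximum_calc and up are illustrative, and the singleton base case is an assumption, since the slide only treats the case a :: x.

```ocaml
(* Specification from the slide: head of the list sorted in decreasing order. *)
let maximum_spec (xs : int list) : int =
  List.hd (List.sort (fun a b -> compare b a) xs)

(* The operator written x ↑ y on the slide. *)
let up x y = if x > y then x else y

(* Calculated form: maximum (a :: x) = a ↑ maximum x.
   The singleton base case is an assumption; the slide only treats a :: x. *)
let rec maximum_calc = function
  | [] -> invalid_arg "maximum_calc: empty list"
  | [a] -> a
  | a :: x -> up a (maximum_calc x)
```

Both functions agree on non-empty lists; the calculated one makes a single pass and never sorts.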


2 Program Calculation and Parallel Programming

Theory of Lists for Parallel Programming

Homomorphisms

• Join lists can be easily distributed: on p processors,

  $[x_0; \ldots; x_{n-1}]$

  can be seen as:

  $[x_0; \ldots; x_{\frac{n}{p}-1}] \mathbin{+\!+} \cdots \mathbin{+\!+} [x_{(p-1)\frac{n}{p}}; \ldots; x_{n-1}]$

• First Homomorphism Theorem:

  Every homomorphism can be written as the composition of a reduce and a map
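
A minimal sequential sketch of this idea, in OCaml. The names hom, chunks and hom_chunked are illustrative; the chunks stand in for the sublists held by the processors, and op is assumed associative with unit e.

```ocaml
(* hom op e f = reduce op after map f, with e the unit of the associative op. *)
let hom op e f xs = List.fold_left op e (List.map f xs)

(* Cut a list into chunks of at most k elements (k > 0); each chunk stands in
   for the sublist held by one processor. *)
let rec chunks k xs =
  let rec take n acc = function
    | x :: tl when n > 0 -> take (n - 1) (x :: acc) tl
    | rest -> (List.rev acc, rest)
  in
  match xs with
  | [] -> []
  | _ -> let (c, rest) = take k [] xs in c :: chunks k rest

(* Distributed evaluation: apply the homomorphism to each chunk, then combine
   the partial results with the same operator. *)
let hom_chunked op e f k xs =
  List.fold_left op e (List.map (hom op e f) (chunks k xs))
```

For an associative operator the chunked and the direct evaluation give the same result, whatever the chunk size.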


Theory of Lists for Parallel Programming

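$h = \mathrm{hom}\ (\oplus)\ f\ \mathrm{id}_\oplus$, with one sublist per processor $P_0, \ldots, P_i, \ldots, P_{p-1}$:

$$
\begin{aligned}
 & h\,\bigl([x_0; \ldots; x_{n_0-1}] \mathbin{+\!+} \cdots \mathbin{+\!+} [x_{n_{i-1}}; \ldots; x_{n_i-1}] \mathbin{+\!+} \cdots \mathbin{+\!+} [x_{n_{p-2}}; \ldots; x_{n_{p-1}-1}]\bigr) \\
=\ & \{\ \text{map phase}\ \} \\
 & \oplus / \,\bigl([f\,x_0; \ldots; f\,x_{n_0-1}] \mathbin{+\!+} \cdots \mathbin{+\!+} [f\,x_{n_{i-1}}; \ldots; f\,x_{n_i-1}] \mathbin{+\!+} \cdots \mathbin{+\!+} [f\,x_{n_{p-2}}; \ldots; f\,x_{n_{p-1}-1}]\bigr) \\
=\ & \{\ \text{reduce phase}\ \} \\
 & \bigoplus_{k=0}^{n_0-1} f\,x_k \ \oplus \cdots \oplus\ \bigoplus_{k=n_{i-1}}^{n_i-1} f\,x_k \ \oplus \cdots \oplus\ \bigoplus_{k=n_{p-2}}^{n_{p-1}-1} f\,x_k \\
=\ & \bigoplus_{k=0}^{n_{p-1}-1} f\,x_k
\end{aligned}
$$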


With Coq

• Lectures 2 and 3: homomorphism theory in Coq
• Lectures 3 and 4:
    • parallel implementations of map and reduce in BSML
    • verification of BSML programs in Coq
• Lecture 5:
    • support for program calculation in Coq
    • support for parallel program calculation in Coq (correspondence with algorithmic skeletons)

3 Calculating BSP Programs

Calculation of BSP Programs

How to apply program calculation to BSP programs?

• What is the relationship between homomorphisms and BSP algorithms?
    • In skeletal programming: we use homomorphisms to hide data communication
    • In BSP programming: we want to use homomorphisms to expose data communication
• How to systematically derive homomorphisms that are suitable for the BSP model?

Solution: BH, a specific homomorphism-like form for BSP computation, due to Prof. Zhenjiang Hu


The BSP Homomorphism: Informally

BSP Homomorphism

• inspired by homomorphisms

• adapted to the BSP model: compute, gather the needed information, compute

Sequential semantics

[Figure: a list lst with a left value l and a right value r; labels gl, gr, ⊕l, ⊗r and k]


The BSP Homomorphism: Formally

Definition (BH)

h is a BSP Homomorphism, or BH, if it can be written as:

h [a] l r = [k a l r]

h (x ++ y) l r = h x l (gr y ⊗r r) ++ h y (l ⊕l gl x) r

where gl and gr are homomorphisms with associated associative operators ⊕l and ⊗r

Conditions

• the homomorphism condition can be weakened

• see the discussion in Julien Tesson's PhD thesis
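
The definition can be transcribed almost literally into a sequential OCaml sketch. The record type hom, the helpers apply_hom and split_at, and the choice to split in the middle are assumptions made for illustration; the definition itself allows any split x ++ y.

```ocaml
(* A homomorphism on non-empty lists: element function f and associative operator op. *)
type ('a, 'b) hom = { f : 'a -> 'b; op : 'b -> 'b -> 'b }

let apply_hom h = function
  | [] -> invalid_arg "apply_hom: empty list"
  | x :: xs -> List.fold_left (fun acc y -> h.op acc (h.f y)) (h.f x) xs

(* Split xs after its first n elements. *)
let rec split_at n xs =
  match xs with
  | x :: tl when n > 0 -> let (a, b) = split_at (n - 1) tl in (x :: a, b)
  | _ -> ([], xs)

(* h [a] l r = [k a l r]
   h (x ++ y) l r = h x l (gr y ⊗r r) ++ h y (l ⊕l gl x) r
   Here gl.op plays the role of ⊕l and gr.op the role of ⊗r. *)
let rec bh k gl gr xs l r =
  match xs with
  | [] -> []
  | [a] -> [k a l r]
  | _ ->
    let (x, y) = split_at (List.length xs / 2) xs in
    bh k gl gr x l (gr.op (apply_hom gr y) r)
    @ bh k gl gr y (gl.op l (apply_hom gl x)) r
```

For instance, with both homomorphisms set to the sum and k a l r = (l, a, r), each element is paired with the sum of the elements on its left and the sum of those on its right.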


Writing Specifications

For writing specifications:

• recursive definitions

• well-known collective operators: map, fold, scan, ...

• communication operators: shift, permute, ...

• a new collective operator: mapAround

mapAround

• maps a function over each element of a list

• the function may use the sublists to the left and to the right of the element

mapAround f [x1, x2, ..., xn] =
  [ f ([], x1, [x2, ..., xn]),
    f ([x1], x2, [x3, ..., xn]),
    ...,
    f ([x1, x2, ..., xn−1], xn, []) ]
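
A direct sequential reading of this definition, as an OCaml sketch (the name map_around and the helper go are illustrative):

```ocaml
(* mapAround: apply f to every element together with the sublist on its left
   and the sublist on its right. The left sublist is accumulated in reverse
   and restored with List.rev before each call. *)
let map_around f xs =
  let rec go left_rev = function
    | [] -> []
    | x :: right -> f (List.rev left_rev, x, right) :: go (x :: left_rev) right
  in
  go [] xs
```

This version is quadratic in the worst case; it is meant as an executable specification, not as an efficient implementation.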


mapAround is a BSP Homomorphism

Theorem (Parallelisation of mapAround with BH)

For a function

h = mapAround f

if we can decompose f as f (ls, x, rs) = k (g1 ls, x, g2 rs), where:

• k is any function,

• each gi is a composition of a projection with a homomorphism,

then h is a BSP Homomorphism.
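
To see why this decomposition helps, the following OCaml sketch compares the specification with a two-pass evaluation in which the summaries are accumulated instead of recomputed. It covers only the case where g1 and g2 are plain homomorphisms given as triples (element function, associative operator, unit); the projection part of the theorem is left out, and the names spec and fast are illustrative.

```ocaml
let map_around f xs =
  let rec go left_rev = function
    | [] -> []
    | x :: right -> f (List.rev left_rev, x, right) :: go (x :: left_rev) right
  in
  go [] xs

(* Direct use of the decomposition f (ls, x, rs) = k (g1 ls, x, g2 rs). *)
let spec k g1 g2 xs = map_around (fun (ls, x, rs) -> k (g1 ls, x, g2 rs)) xs

(* Same result when g1 and g2 are homomorphisms (f1, op1, e1) and (f2, op2, e2):
   left summaries are accumulated from the left, right summaries from the right. *)
let fast k (f1, op1, e1) (f2, op2, e2) xs =
  let lefts =
    List.rev (snd (List.fold_left
      (fun (acc, out) x -> (op1 acc (f1 x), acc :: out)) (e1, []) xs)) in
  let rights =
    snd (List.fold_right
      (fun x (acc, out) -> (op2 (f2 x) acc, acc :: out)) xs (e2, [])) in
  List.map2 (fun x (l, r) -> k (l, x, r)) xs (List.combine lefts rights)
```

The same accumulation pattern is what lets a BH exchange only summaries between processors.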


Example 1: The Tower Building Problem

[Figure: points xL, x1, x2, x3, ..., xi, ..., xn−1, xn, xR along a line, with heights hL, h1, h2, h3, ..., hi, ..., hn−1, hn, hR]

Specification

tower (xL, hL) (xR, hR) xs = mapAround visibleLR xs
  where
    visibleLR (ls, (xi, hi), rs) = visibleL ls (xi, hi) ∧ visibleR rs (xi, hi)
    visibleL ls (xi, hi) = maxAngleL ls < (h + hi − hL) / (xi − xL)
    visibleR rs (xi, hi) = maxAngleR rs < (h + hi − hR) / (xR − xi)
    maxAngleL [ ] = −∞
    maxAngleL ([(x, h)] ++ xs) = (h − hL) / (x − xL) ↑ maxAngleL xs

and the function maxAngleR can be similarly defined.
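
The specification can be made executable as the OCaml sketch below. Two points are assumptions: h, which is free in the specification, is passed as an explicit first parameter, and maxAngleR is written by symmetry with maxAngleL, since the slide does not spell it out.

```ocaml
let map_around f xs =
  let rec go left_rev = function
    | [] -> []
    | x :: right -> f (List.rev left_rev, x, right) :: go (x :: left_rev) right
  in
  go [] xs

(* The operator written ↑ on the slide, on floats. *)
let up (a : float) b = if a > b then a else b

(* h is the free variable of the specification, passed explicitly here. *)
let tower h (xl, hl) (xr, hr) xs =
  let max_angle_l ls =
    List.fold_left (fun acc (x, hx) -> up ((hx -. hl) /. (x -. xl)) acc)
      neg_infinity ls in
  (* Assumed symmetric definition of maxAngleR. *)
  let max_angle_r rs =
    List.fold_left (fun acc (x, hx) -> up ((hx -. hr) /. (xr -. x)) acc)
      neg_infinity rs in
  let visible_lr (ls, (xi, hi), rs) =
    max_angle_l ls < (h +. hi -. hl) /. (xi -. xl)
    && max_angle_r rs < (h +. hi -. hr) /. (xr -. xi) in
  map_around visible_lr xs
```

With end points (0, 10) and (10, 10) and three positions where only the middle one is high, only the middle position satisfies both visibility conditions.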


Example 2: All Nearest Smaller Values

[Figure: example sequence 3 4 3 5 4 2 1 4]

For each element, find the nearest smaller value to its left and to its right.


ANSV derivation

ansv as = mapAround nsv as
  where
    nsv (ls, x, rs) = (nsvL x ls, nsvR x rs)

    nsvL x [ ] = −∞
    nsvL x (ls ++ [l]) = if l < x then l else nsvL x ls

    nsvR x [ ] = −∞
    nsvR x ([r] ++ rs) = if r < x then r else nsvR x rs

MapAround parallelisation requirements

As written, nsv is not in the form k (gl ls, x, gr rs)
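
As an executable reading of this specification, here is an OCaml sketch on integer lists. Using min_int to stand for −∞ is an assumption made for the example.

```ocaml
let map_around f xs =
  let rec go left_rev = function
    | [] -> []
    | x :: right -> f (List.rev left_rev, x, right) :: go (x :: left_rev) right
  in
  go [] xs

(* Nearest smaller value to the right; min_int stands for minus infinity. *)
let rec nsv_r x = function
  | [] -> min_int
  | r :: rs -> if r < x then r else nsv_r x rs

(* nsvL inspects the left sublist from its last element, the nearest neighbour. *)
let nsv_l x ls = nsv_r x (List.rev ls)

let ansv xs = map_around (fun (ls, x, rs) -> (nsv_l x ls, nsv_r x rs)) xs
```

Each call to nsv depends on the element x itself, which is why the left and right arguments are not yet summaries computed independently of x.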


ANSV derivation

3 [ 4 3 5 4 2 1 4 ]

nsvR v rs = pickupR v (candidatesR rs)

• candidatesR selects the potential values before the exchange, independently of the searched value

• pickupR selects the final value of interest

MapAround parallelisation requirements

nsv (ls, x, rs) = k (candidatesL ls, x, candidatesR rs)

• candidatesL and candidatesR are homomorphisms

• k just applies pickupL and pickupR
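
The slides do not give the definitions of candidatesR and pickupR. One possible choice, shown here as an OCaml sketch under that assumption, keeps as candidates the strictly decreasing prefix minima of the right sublist; combine_r is a hypothetical name for the associated operator.

```ocaml
(* Assumed operator: keep the candidates of the left part, then those
   candidates of the right part that are smaller than all of them. *)
let combine_r cx cy =
  match List.rev cx with
  | [] -> cy
  | m :: _ -> cx @ List.filter (fun c -> c < m) cy

(* Candidates to the right: the strictly decreasing prefix minima. *)
let candidates_r rs = List.fold_left (fun acc r -> combine_r acc [r]) [] rs

(* First candidate smaller than v; min_int stands for minus infinity. *)
let rec pickup_r v = function
  | [] -> min_int
  | c :: cs -> if c < v then c else pickup_r v cs
```

With this choice the candidates of a concatenation are obtained by combining the candidates of the two parts, which is the homomorphism property required by the theorem.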


Example 3: One Dimensional Heat Diffusion Simulation

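• Heat equation:

  $$\frac{\partial u}{\partial t} - \kappa \frac{\partial^2 u}{\partial x^2} = 0 \qquad \forall t,\ u(0, t) = l \qquad \forall t,\ u(1, t) = r$$

    • κ is the heat diffusivity of the metal,
    • l and r are constants (the temperature outside the metal)

• A discretised version:

  $$u(x, t + dt) = \frac{\kappa\, dt}{dx^2} \times \bigl( u(x + dx, t) + u(x - dx, t) - 2 \times u(x, t) \bigr) + u(x, t)$$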


Example 3: One Dimensional Heat Diffusion Simulation

• Usual imperative implementation:

    for (int i = 0; i < n; i++)
      uu[i] += kappa*dt/(dx*dx) * ( (i == (n-1) ? r : u[i+1])
                                  + (i == 0 ? l : u[i-1])
                                  - 2*u[i] );

• A recursive, pure functional implementation:

    (* l is the value to the left of the list, r the value to its right *)
    let rec heatSeq l r dt dx kappa (u : float list) : float list =
      match u with
      | [] -> []
      | ui :: u' ->
        match u' with
        | [] -> [ kappa *. dt /. (dx *. dx) *. (r +. l -. ui -. ui) +. ui ]
        | ui_plus1 :: _ ->
          (kappa *. dt /. (dx *. dx) *. (ui_plus1 +. l -. ui -. ui) +. ui)
          :: (heatSeq ui r dt dx kappa u')


Example 3: One Dimensional Heat Diffusion Simulation

(* heat: float → float → float → float → float → (float list) par → (float list) par *)
let heat l r dt dx gamma u =
  let leftBounds, rightBounds = getBounds l r u in
  << heatSeq $leftBounds$ $rightBounds$ dt dx gamma $u$ >>

getBounds: float → float → (float list) par → (float par) * (float par)

[Figure: a parallel vector ⟨ ..., ..., ..., [aj1; ...; ajn], ..., ..., ... ⟩; from the local list aj1 ... ajn, the first element aj1 is passed towards the left (↙) and the last element ajn towards the right (↘)]


Example 3: One Dimensional Heat Diffusion Simulation

Property of heatSeq

heat [] l r = []
bh [] l r = []

heat [a] l r = [formula a l r]
bh [a] l r = [k a l r]

heat (x ++ y) l r = heat x l (hd_option y << r) ++ heat y (l >> last_option x) r
bh (x ++ y) l r = bh x l (gr y ⊗r r) ++ bh y (l ⊕l gl x) r

with l << r = l if l ≠ None, r otherwise
and  l >> r = r if r ≠ None, l otherwise

BH conditions

• last_option (resp. hd_option) is a homomorphism with operator (>>) (resp. (<<))
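
The split property can be checked on concrete data with the OCaml sketch below. It restates heatSeq as heat_seq, and assumes that (>>) keeps its right operand unless that operand is None, by symmetry with (<<), whose definition is the only one given on the slide. The function heat_split is a hypothetical helper that evaluates the right-hand side of the property.

```ocaml
let rec heat_seq l r dt dx kappa = function
  | [] -> []
  | [ui] -> [kappa *. dt /. (dx *. dx) *. (r +. l -. ui -. ui) +. ui]
  | ui :: (ui1 :: _ as u') ->
    (kappa *. dt /. (dx *. dx) *. (ui1 +. l -. ui -. ui) +. ui)
    :: heat_seq ui r dt dx kappa u'

let hd_option = function [] -> None | x :: _ -> Some x
let rec last_option = function [] -> None | [x] -> Some x | _ :: tl -> last_option tl

(* l << r keeps l unless it is None; l >> r keeps r unless it is None (assumed). *)
let ( << ) l r = match l with None -> r | Some _ -> l
let ( >> ) l r = match r with None -> l | Some _ -> r

(* Right-hand side of the property: x and y are processed separately, each
   with the boundary value it needs from the other part. *)
let heat_split l r dt dx kappa x y =
  let r_x = Option.get (hd_option y << Some r) in
  let l_y = Option.get (Some l >> last_option x) in
  heat_seq l r_x dt dx kappa x @ heat_seq l_y r dt dx kappa y
```

Only one value crosses the boundary in each direction, which matches the role of the bounds exchanged in the parallel version.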


Example 4: sparse-matrix vector multiplication I

Sparse matrix: array representation of triples (y, x, a):

• y: the row-index of the nonzero element,

• x: the column-index of the nonzero element, and

• a: the value of the nonzero element.

$$A = \begin{pmatrix} 1.1 & 2.2 & 0 \\ 0 & 1.3 & 1.4 \\ 0 & 0 & 3.5 \end{pmatrix}$$

as = [(0, 0, 1.1), (0, 1, 2.2), (1, 1, 1.3), (1, 2, 1.4), (2, 2, 3.5)]

After multiplication: the first element of each row contains the result; the others contain a dummy value □

mult as [3.0, 4.0, 1.0] = [(0, 0, 12.1), (0, 1, □), (1, 1, 6.6), (1, 2, □), (2, 2, 3.5)]


Example 4: sparse-matrix vector multiplication II

Specification with mapAround

What is needed from the left or from the right?

• is the element the first one in its row? Compare the row-index with that of the left element
• the result value needs:
    • the partial sum of the rightward values in the row
    • multiplied by the vector
• from the right:
    • the row-index of the right element
    • the partial sum in the row (of the right element)


Example 4: sparse-matrix vector multiplication III

mult as v = mapAround (f v) as
  where f v (ls, (y, x, a), rs) =
    let yl = gl ls; (yr, sr) = gr v rs
    in if (yl == y) then (y, x, □)
       else if (yr == y) then (y, x, v⟨x⟩ * a + sr)
       else (y, x, v⟨x⟩ * a)

where v⟨i⟩ is the i-th element of the vector v
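
The specification can be run as the OCaml sketch below. The dummy value is represented by None, the vector by a float array, and gl and gr are written directly from their informal descriptions; these representation choices are assumptions. The triples are assumed to be sorted by row, as in the example.

```ocaml
let map_around f xs =
  let rec go left_rev = function
    | [] -> []
    | x :: right -> f (List.rev left_rev, x, right) :: go (x :: left_rev) right
  in
  go [] xs

(* gl: row index of the last element of the left sublist, -1 if there is none. *)
let gl ls = List.fold_left (fun _ (y, _, _) -> y) (-1) ls

(* gr: row index of the first element of the right sublist and the partial
   sum of that row found in the sublist; (-1, 0.) if the sublist is empty. *)
let gr v rs =
  match rs with
  | [] -> (-1, 0.)
  | (y0, _, _) :: _ ->
    (y0, List.fold_left
           (fun s (y, x, a) -> if y = y0 then s +. a *. v.(x) else s) 0. rs)

(* The dummy value of the slides is represented by None. *)
let mult entries v =
  map_around (fun (ls, (y, x, a), rs) ->
    let yl = gl ls and (yr, sr) = gr v rs in
    if yl = y then (y, x, None)
    else if yr = y then (y, x, Some (v.(x) *. a +. sr))
    else (y, x, Some (v.(x) *. a))) entries
```

The test data below uses values that are exact in floating point, so the results can be compared with plain equality.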


Example 4: sparse-matrix vector multiplication IV

Function gl

Takes the row-index of the last element in a list

gl = ([≫, λ(y, x, a). y]) where a ≫ b = b,

and any value (here we use −1) is a left unit of the operator ≫.


Example 4: sparse-matrix vector multiplication V

Function gr v

gr v [(y, x, a)] = (y, a * v⟨x⟩)
gr v (as ++ [(y, x, a)]) = let (y′, s) = gr v as
                            in (y′, if y′ == y then s + a * v⟨x⟩ else s)

we have:

gr v [(y, x, a)] = (y, a * v⟨x⟩)
gr v (ls ++ rs) = gr v ls ⊙ gr v rs

where (yl, sl) ⊙ (yr, sr) = if yl == yr then (yl, sl + sr) else (yl, sl)

A right unit of the operator ⊙ is (−1, 0).
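
The two forms of gr v can be compared on a small input with the OCaml sketch below. The name odot for the operator is hypothetical, and the triples are assumed to be sorted by row, as in the example matrix; without that assumption the operator is not associative.

```ocaml
(* The binary operator of the homomorphic form. *)
let odot (yl, sl) (yr, sr) = if yl = yr then (yl, sl +. sr) else (yl, sl)

(* Recursive definition, on non-empty lists: start from the first triple and
   add the contribution of each later triple that lies in the same row. *)
let gr_rec v entries =
  match entries with
  | [] -> invalid_arg "gr_rec: empty list"
  | (y0, x0, a0) :: rest ->
    List.fold_left
      (fun (y', s) (y, x, a) -> (y', if y' = y then s +. a *. v.(x) else s))
      (y0, a0 *. v.(x0)) rest

(* Homomorphic form: map each triple to (row, product), then reduce with odot.
   The empty list is mapped to the right unit (-1, 0.). *)
let gr_hom v entries =
  match List.map (fun (y, x, a) -> (y, a *. v.(x))) entries with
  | [] -> (-1, 0.)
  | first :: rest -> List.fold_left odot first rest
```

The last assertion in the test splits the list in two and combines the partial results with odot, which is the property used when the summaries come from different processors.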


Example 4: sparse-matrix vector multiplication VI

mult as v = BH (k v, ([⊙, λ(y, x, a). (y, a * v⟨x⟩)]), ([≫, λ(y, x, a). y])) as

where k v (yl, (y, x, a), (yr, s)) = if y == yl then (y, x, □)
                                     else if y == yr then (y, x, a * v⟨x⟩ + s)
                                     else (y, x, a * v⟨x⟩)

      a ≫ b = b
      (yl, sl) ⊙ (yr, sr) = if yl == yr then (yl, sl + sr) else (yl, sl)

4 BH Skeletons

BH Skeleton Implementations

There are two implementations:

• a C++ skeleton for OSL, the Orléans Skeleton Library

• a BSML implementation verified in Coq


Parallel implementation

[Figure: processors 0, ..., i, ..., p − 1, each running BHseq on its local data]

BH Steps

1. Summaries computation using gl and gr
2. Communications
3. Summaries fusion using ⊕l and ⊗r
4. Local BH computation
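
These steps can be simulated sequentially, with a list of chunks standing in for the processors. The OCaml sketch below is an illustration under that assumption and is not the OSL or BSML implementation; bh_seq, bh_par and the record type hom are illustrative names.

```ocaml
type ('a, 'b) hom = { f : 'a -> 'b; op : 'b -> 'b -> 'b }

(* Local sequential BH: one pass from the right for the right values, then
   one pass from the left that applies k to each element. *)
let bh_seq k gl gr xs l r =
  let rights =
    snd (List.fold_right
      (fun x (acc, out) -> (gr.op (gr.f x) acc, acc :: out)) xs (r, [])) in
  let step (acc, out) x rv = (gl.op acc (gl.f x), k x acc rv :: out) in
  List.rev (snd (List.fold_left2 step (l, []) xs rights))

let bh_par k gl gr chunks l r =
  (* Step 1: local summaries, None for an empty chunk. *)
  let summary h = function
    | [] -> None
    | x :: tl -> Some (List.fold_left (fun acc y -> h.op acc (h.f y)) (h.f x) tl) in
  let sl = List.map (summary gl) chunks in
  let sr = List.map (summary gr) chunks in
  (* Step 2, the communication, is implicit: all summaries are visible here. *)
  (* Step 3: fuse the summaries into one left and one right value per chunk. *)
  let add_l acc = function None -> acc | Some s -> gl.op acc s in
  let add_r s acc = match s with None -> acc | Some s -> gr.op s acc in
  let lefts =
    List.rev (snd (List.fold_left
      (fun (acc, out) s -> (add_l acc s, acc :: out)) (l, []) sl)) in
  let rights =
    snd (List.fold_right
      (fun s (acc, out) -> (add_r s acc, acc :: out)) sr (r, [])) in
  (* Step 4: local BH computation on each chunk. *)
  List.map2 (fun c (li, ri) -> bh_seq k gl gr c li ri)
    chunks (List.combine lefts rights)
```

Concatenating the per-chunk results gives the same list as running the sequential BH on the concatenated input.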


OSL

Orléans Skeleton Library

• C++ algorithmic skeletons library

• currently implemented on top of MPI

• follows the BSP model

Parallel data structures: distributed arrays

Template: DArray<A>

• Computation skeletons: map, zip, reduce, ...

• Communication skeletons: shift, permute, ...

• Distribution management skeletons: redistribute, getPartition, flatten

• Expression template mechanism: loop fusion at compile time


Loop fusion

Calculating 2 * A + (B * C) on arrays

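Simple function calls

[Figure: arrays A, B and C feed map(*2), zip(*) and zip(+), producing res]

4 loops, 3 temporaries

Construction of a complex type at compilation

[Figure: arrays A, B and C feed an expression type built from Zip<F, Exp1, Exp2>, Map<F, Exp> and Zip<F, Exp1, Exp2>]

A single loop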
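
The effect of fusion, though not the C++ expression template mechanism itself, can be illustrated with plain OCaml arrays. The names unfused and fused are illustrative, and the three arrays are assumed to have the same length.

```ocaml
(* Unfused: each skeleton call traverses its input and allocates a new array. *)
let unfused a b c =
  let t1 = Array.map (fun x -> 2. *. x) a in
  let t2 = Array.map2 ( *. ) b c in
  Array.map2 ( +. ) t1 t2

(* Fused: a single traversal computes 2 * a.(i) + b.(i) * c.(i) directly,
   with no intermediate arrays. *)
let fused a b c =
  Array.init (Array.length a) (fun i -> 2. *. a.(i) +. b.(i) *. c.(i))
```

Both functions return the same array; the fused one is what a fusion mechanism aims to produce from the composition of skeleton calls.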


BH OSL

Skeleton signature

DArray<typename K::result_type>
bh(K k, Homomorphism<T, L> * hl, Homomorphism<T, R> * hr,
   L l, R r, const DArray<T>& temp);

Homomorphism example

class HAdd: public Homomorphism<int, int> {
public:
  HAdd() { neutral = 0; }                               // unit of the operator
  inline int f(const int& i) { return i; }              // applied to each element
  inline int o(const int& i1, const int& i2) { return i1 + i2; }  // binary operator
};


BH OSL

BH applies to

• distributed arrays

• skeleton expressions that produce arrays

• triggers the fusion mechanism when relevant

Efficient implementation

• local applications of the homomorphisms (linear time)

• global exchange of the local summaries

• local applications of the main function (linear time)


ANSV with BH OSL

class candidatesL:
  public Homomorphism<int, std::vector<int> >
// ..

class candidatesR:
  public Homomorphism<int, std::vector<int> >
// ..

std::vector<int> empty(0);
DArray<int> input = // ...
DArray<std::pair<int, int> > result =
  osl::bh(new candidatesL(), new candidatesR(),
          empty, empty, input);


BH in Coq I

BH Skeleton in BSML

• Formalisation of the BH definition
• Computational definitions of BH and proofs of equivalence:
    • sequential, very inefficient
    • sequential
    • parallel
    • sequential optimised
    • parallel optimised
• Extraction of the BSML implementation of BH


BH in Coq II

About specifying programs

• Proof of the correctness of the BSML versions of the communication operators (shift, permute)

• Formalisation of mapAround

• Proof that mapAround is a BH

• Proof that any homomorphism is a BH

• Formalisation of what it means for a sequential function to be parallelisable ⇒ composition of derivations, communication operators


Demonstration

Using SDPP version nii2013: http://traclifo.univ-orleans.fr/SDPP


Experiments: The Tower Building Problem

Extraction from Coq

[Figure: plot against the number of processors (0 to 40, vertical axis 0 to 40) with three series: extracted from Coq, direct implementation, and f(x) = x]


Experiments: ANSV & Matrix multiplication

Hand-written OSL versions

[Figure: speedup against the number of cores (1 to 48) with three series: ideal curve, ANSV, and sparse matrix-vector multiplication]

5 Summary

• Program calculation is useful for developing parallel programs in a systematic way

• BSP Homomorphisms are dedicated to bulk synchronous parallel program calculation

• We provide Coq support for program calculation and bulk synchronous parallel program calculation

• BH skeletons exist as:
    • part of the C++ Orléans Skeleton Library
    • a parallel function in BSML, verified in Coq

• Applications using BSP homomorphisms (and Coq):
    • tower building
    • array packing
    • all nearest smaller values
    • 1D heat equation
