Early Global Program Optimizations Chapter 12.4-6 Mooly Sagiv.


Outline

• Value Numbering
– Basic blocks
– Procedure based data flow analysis
• Sparse conditional constant propagation

Three Global Data Flow Problems

• Value-Numbers: assign integer values to expressions
– e1 and e2 have the same value (at the entry of a given block): e1 and e2 evaluate to the same value every time control reaches this block
• Constant-Propagation: assign integer values to program variables
– v has a constant value n (at the entry of a given block): every time control reaches this block, v has the value n
• (Formal) Common Available Sub-expression Evaluation: assign sets of available expressions
– e is available (at the entry of a given block): every time control reaches this block, e is already computed

Example

read(i)
j ← i + 1
k ← i
l ← k + 1

i ← 2
j ← i * 2
k ← i + 2

read(i)
if i > 0 goto L1
j ← 2 * i
goto L2
L1: k ← 2 * i
L2: l ← 2 * i

Solutions

• General
– Develop the most precise algorithm that approximates the three problems together
• Specific
– Develop the most efficient algorithm that approximates one problem
– Apply several algorithms (multiple times)

The General Algorithm (Kildall 1973)

• An iterative algorithm
• The lattice of data flow information = “pools” of sets of equivalent expressions with an optional constant value
• Optimistically start with everything being available and iterate until no more expressions/values are removed

{}
read(i)
{{i}}
j ← i + 1
{{i}, {i+1, j}}
k ← i
{{i, k}, {i+1, k+1, j}}
l ← k + 1
{{i, k}, {i+1, k+1, j, l}}

{}
i ← 2
{{i, 2}}
j ← i * 2
{{i, 2}, {j, i*2, 4}}
k ← i + 2
{{i, 2}, {j, i*2, i+2, 4, k}}

{}
read(i)
{{i}}
if i > 0 goto L1
{{i}}
j ← 2 * i
{{i}, {2*i, j}}
goto L2
L1: {{i}}
k ← 2 * i
{{i}, {2*i, k}}
L2: {{i}, {2*i}}
l ← 2 * i
{{i}, {2*i, l}}

The General Algorithm (Kildall 1973)

• The lattice of data flow information
– Pool = sets of sets of equivalent expressions
– ∀p ∈ Pool, ∀x, y ∈ p: x ≠ y ⇒ x ∩ y = {}
– ∀p ∈ Pool, ∀x ∈ p, ∀z1, z2 ∈ x: z1 = z2
• p1 ⊑ p2 ⇔ p2 is a refinement of p1 ⇔ ∀x2 ∈ p2: ∃x1 ∈ p1: x2 ⊆ x1
• p1 ⊔ p2 = {x1 ∩ x2 : x1 ∈ p1, x2 ∈ p2, x1 ∩ x2 ≠ {}}
• Init = {{}}
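The combination of two pools at a control-flow merge can be sketched in a few lines of Python. This is only an illustration: the names `pool`, `join` and `refines` are made up here, expressions are represented as plain strings, and the two input pools are the ones reached just before label L2 in the third example.

```python
from itertools import product

def pool(*classes):
    """Build a pool: a set of disjoint classes of equivalent expressions."""
    return frozenset(frozenset(c) for c in classes)

def join(p1, p2):
    """Pairwise intersection of classes, dropping empty intersections."""
    return frozenset(x1 & x2 for x1, x2 in product(p1, p2) if x1 & x2)

def refines(p2, p1):
    """True if every class of p2 is contained in some class of p1."""
    return all(any(x2 <= x1 for x1 in p1) for x2 in p2)

then_branch = pool({"i"}, {"2*i", "j"})   # pool after j <- 2*i
else_branch = pool({"i"}, {"2*i", "k"})   # pool after k <- 2*i
at_L2 = join(then_branch, else_branch)

print(sorted(sorted(c) for c in at_L2))   # [['2*i'], ['i']]
```

Only the equivalences that hold on both incoming paths survive, so j and k drop out while 2*i stays available.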

The effect of st ≡ x ← e on a pool P

• If e is in P, then e is redundant
• Create a new class in P with the partial computations in the program which have operands equivalent to e
• If e evaluates to a constant z, then add e to the (possibly new) class of z
• Remove all the expressions with argument x and add x to the class of e (and all its consequences)
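A simplified sketch of this transfer function is given below. It is not Kildall's full algorithm: constant folding is left out, the "consequences" are limited to a fixed list of program expressions (`universe`), and all function names are hypothetical. Expressions are variables (strings), constants (ints) or tuples `(op, arg1, arg2)`.

```python
def find(pool, e):
    """Return the class of pool that contains e, or None."""
    return next((c for c in pool if e in c), None)

def mentions(e, x):
    """True if expression e is x or has x as an argument."""
    return e == x or (isinstance(e, tuple) and any(mentions(a, x) for a in e[1:]))

def same_class(pool, a, b):
    ca = find(pool, a)
    return a == b or (ca is not None and ca == find(pool, b))

def kill(pool, x):
    """Remove every expression that mentions x; drop classes that become empty."""
    return {frozenset(e for e in c if not mentions(e, x)) for c in pool} - {frozenset()}

def close(pool, universe):
    """Add program expressions whose operands are equivalent to those of an
    expression already in the pool (the consequences of an assignment)."""
    changed = True
    while changed:
        changed = False
        present = set().union(*pool)
        for u in universe:
            if u in present:
                continue
            for c in pool:
                if any(isinstance(v, tuple) and v[0] == u[0] and len(v) == len(u)
                       and all(same_class(pool, a, b) for a, b in zip(u[1:], v[1:]))
                       for v in c):
                    pool = (pool - {c}) | {c | {u}}
                    changed = True
                    break
            if changed:
                break
    return pool

def read(pool, x, universe):
    """read(x): x gets an unknown value, so it starts a class of its own."""
    return close(kill(pool, x) | {frozenset([x])}, universe)

def assign(pool, x, e, universe):
    """x <- e: x joins the class of e after the old facts about x are removed."""
    old = find(pool, e) or frozenset([e])
    rest = kill(pool - {old}, x)
    new = frozenset(v for v in old if not mentions(v, x)) | {x}
    return close(rest | {new}, universe)

universe = [("+", "i", 1), ("+", "k", 1)]      # compound expressions of the program
p = read(set(), "i", universe)
p = assign(p, "j", ("+", "i", 1), universe)
p = assign(p, "k", "i", universe)
p = assign(p, "l", ("+", "k", 1), universe)
print(p)
```

Run on the first example program, the final pool has the two classes {i, k} and {i+1, k+1, j, l}, so the computation of k + 1 is recognised as redundant.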

Efficient Algorithms for Value Numbering

• For basic blocks the problem is easy (12.4.1)
• SSA can be used for extended basic blocks
• Reif & Lewis 1982: O(E α(E, E)) algorithm
• Alpern, Wegman, Zadeck (1988): O(E log E)
• Extensions for more constants: Knoop, Steffen, Ruething 1999
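For the basic-block case, a hash-based value numbering pass can be sketched as follows. The instruction encoding `(target, op, arg1, arg2)`, with `op = None` for a copy, and the function name are assumptions made for this illustration; commutativity and constant folding are ignored.

```python
def value_number_block(block):
    """Assign value numbers within one basic block.

    Returns the variable -> value number map and the list of targets whose
    right-hand side was already computed earlier in the block.
    """
    var_vn = {}    # variable or constant -> value number
    expr_vn = {}   # (op, vn1, vn2) -> value number
    counter = 0
    redundant = []

    def fresh():
        nonlocal counter
        counter += 1
        return counter - 1

    def vn_of(operand):
        if operand not in var_vn:
            var_vn[operand] = fresh()
        return var_vn[operand]

    for target, op, a, b in block:
        if op is None:                       # copy: target shares the source's number
            var_vn[target] = vn_of(a)
            continue
        key = (op, vn_of(a), vn_of(b))
        if key in expr_vn:
            redundant.append(target)         # same operator applied to the same values
        else:
            expr_vn[key] = fresh()
        var_vn[target] = expr_vn[key]
    return var_vn, redundant

block = [
    ("j", "+", "i", 1),     # j <- i + 1
    ("k", None, "i", None), # k <- i
    ("l", "+", "k", 1),     # l <- k + 1
]
numbers, redundant = value_number_block(block)
print(redundant)            # ['l']
```

Because k receives the value number of i, the key for k + 1 matches the one already recorded for i + 1.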

Alpern, Wegman, Zadeck Algorithm

• Convert the program into SSA form
• Build a “Value Graph”: a directed graph representing symbolic execution of the program
• Find the “congruent” nodes by computing the coarsest partition of a set (automata minimization)
• Variables are detected as equivalent at a program point p if (i) the corresponding nodes are congruent and (ii) their defining assignments dominate p

A Simple Example

if I < 29 then (Y)
    J ← 1
    K ← 1
else (N)
    J ← 2
    K ← 2
if I < 29 then (Y)
    L ← 1
else (N)
    L ← 2

A Simple Example (Kildall’s Algorithm)

{{}}
if I < 29
    Y: {{I<29}}                         N: {{I<29}}
       J ← 1; K ← 1                        J ← 2; K ← 2
       {{I<29}, {J, K, 1}}                 {{I<29}, {J, K, 2}}
{{I<29}, {J, K}}
if I < 29
    Y: {{I<29}, {J, K}}                 N: {{I<29}, {J, K}}
       L ← 1                               L ← 2
       {{I<29}, {J, K}, {L, 1}}            {{I<29}, {J, K}, {L, 2}}
{{I<29}, {J, K}, {L}}

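A Simple Example (Alpern, Wegman, Zadeck)

1: if I < 29 (Y → 2, N → 3)
2: J1 ← 1; K1 ← 1
3: J2 ← 2; K2 ← 2
4: J3 ← φ4(J1, J2); K3 ← φ4(K1, K2); if I < 29 (Y → 5, N → 6)
5: L1 ← 1
6: L2 ← 2
7: L3 ← φ7(L1, L2)

[Figure: value graph of the SSA program. Constant nodes 1 and 2 carry the SSA names they define; the φ4 nodes for J3 and K3 and the φ7 node point to them through edges labeled l and r; each test is a < node with children I and 29.]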

Global Value Graph

• Directed labeled graph
• Nodes may be labeled by
– constant numbers
– normal function symbols (no side effects)
– φ functions
• Directed edges from functions to arguments (ordered according to argument position)

Congruence

• Two nodes in the value graph are congruent if:
– the nodes have identical function labels
– the corresponding destinations of the edges leaving the nodes are congruent
• Tricky for value graphs with cycles

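Loop Example

J ← 1
K ← 1
loop:
    if I < 29 then (Y)
        J ← J + 1
        K ← K + 1
    else (N)
        J ← J + 2
        K ← K + 2
    goto loop

Loop Example (SSA)

1: J0 ← 1; K0 ← 1
2: J1 ← φ2(J0, J4); K1 ← φ2(K0, K4); if I < 29 (Y → 3, N → 4)
3: J2 ← J1 + 1; K2 ← K1 + 1
4: J3 ← J1 + 2; K3 ← K1 + 2
5: J4 ← φ5(J2, J3); K4 ← φ5(K2, K3); goto 2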

Loop Example (Value Graph)

Nodes are listed as name = label, with their ordered argument edges (l = left, r = right):

Z0 = 1        Z1 = 2        J0 = 1
J1 = φ2(l: J0, r: J4)
J2 = +(l: J1, r: Z0)
J3 = +(l: J1, r: Z1)
J4 = φ5(l: J2, r: J3)

Z2 = 1        Z3 = 2        K0 = 1
K1 = φ2(l: K0, r: K4)
K2 = +(l: K1, r: Z2)
K3 = +(l: K1, r: Z3)
K4 = φ5(l: K2, r: K3)

A Simple Partitioning Algorithm

• Place all the nodes with the same label in the same partition

• Repeat splitting partitions with two nodes having corresponding edges leading to nodes in different partitions

A Simple Partitioning Algorithm

Partition = {P | ∀x, y ∈ P: label(x) == label(y)}
change = true
while (change) do
    change = false
    for each P ∈ Partition do
        for each fi do
            if (∃x, y ∈ P: fi(x) and fi(y) are in different partitions) then
                split P
                change = true
            fi
        od
    od
od
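The refinement loop can be tried on the value graph of the loop example. The sketch below is a variant that splits every block at once by a "signature" (the node's own block plus the blocks of its arguments) instead of one block per step; the dictionary encoding of the graph and the name `congruence_classes` are assumptions for illustration, and φ labels are spelled `phi2` and `phi5`.

```python
def congruence_classes(graph):
    """graph: node -> (label, [argument nodes in order]).
    Returns the coarsest partition in which congruent nodes share a block."""
    label_ids = {}
    block_of = {}
    for node, (label, _) in graph.items():
        block_of[node] = label_ids.setdefault(label, len(label_ids))

    while True:
        sig_ids = {}
        refined = {}
        for node, (_, args) in graph.items():
            # Nodes stay together only if their arguments are in the same blocks.
            sig = (block_of[node],) + tuple(block_of[a] for a in args)
            refined[node] = sig_ids.setdefault(sig, len(sig_ids))
        stable = len(sig_ids) == len(set(block_of.values()))
        block_of = refined
        if stable:
            break

    classes = {}
    for node, block in block_of.items():
        classes.setdefault(block, set()).add(node)
    return {frozenset(c) for c in classes.values()}

graph = {
    "J0": ("1", []), "Z0": ("1", []), "Z1": ("2", []),
    "J1": ("phi2", ["J0", "J4"]),
    "J2": ("+", ["J1", "Z0"]),
    "J3": ("+", ["J1", "Z1"]),
    "J4": ("phi5", ["J2", "J3"]),
    "K0": ("1", []), "Z2": ("1", []), "Z3": ("2", []),
    "K1": ("phi2", ["K0", "K4"]),
    "K2": ("+", ["K1", "Z2"]),
    "K3": ("+", ["K1", "Z3"]),
    "K4": ("phi5", ["K2", "K3"]),
}
classes = congruence_classes(graph)
for c in sorted(sorted(c) for c in classes):
    print(c)
```

Starting from the label-based partition and only ever splitting is what makes cycles unproblematic: J1 and K1 stay together because nothing ever distinguishes them.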

An O(E log E) Partitioning Algorithm (Aho, Hopcroft, Ullman 1974)

• Given:
– A set S
– A function f: S → S
– A partition of S into disjoint blocks π = {B1, B2, …, Bp}
• Find the coarsest partition (having the fewest blocks) π′ = {E1, E2, …, Eq} such that
– π′ is a refinement of π
– ∀a, b ∈ El: ∃j: f(a), f(b) ∈ Ej

WAITING ← {1, 2, …, p}
q ← p
while WAITING ≠ ∅ do
    select and delete an integer i from WAITING
    for m from 1 to k do
        INVERSE ← ∅
        for x in B[i] do INVERSE ← INVERSE ∪ fm⁻¹(x) end
        for each j such that B[j] ∩ INVERSE ≠ ∅ and not (B[j] ⊆ INVERSE) do
            q ← q + 1
            create a new block B[q]
            B[q] ← B[j] ∩ INVERSE
            B[j] ← B[j] − B[q]
            if j is in WAITING then add q to WAITING
            else if |B[j]| ≤ |B[q]| then add j to WAITING
            else add q to WAITING fi
        od
    od
od
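A direct Python transcription of the pseudocode is sketched below. It follows the same splitting rule but scans all blocks when looking for those that intersect INVERSE, so it does not reach the O(E log E) bound, which needs an index from elements to blocks. The function name and the small example mapping are invented for illustration.

```python
def coarsest_partition(functions, initial_blocks):
    """functions: list of dicts, each mapping an element to its image.
    initial_blocks: iterable of disjoint sets covering all elements."""
    blocks = [set(b) for b in initial_blocks]

    # Precompute the inverse of every function: image -> set of pre-images.
    inverses = []
    for f in functions:
        inv = {}
        for x, y in f.items():
            inv.setdefault(y, set()).add(x)
        inverses.append(inv)

    waiting = set(range(len(blocks)))
    while waiting:
        i = waiting.pop()
        splitter = set(blocks[i])          # snapshot: B[i] may itself be split below
        for inv in inverses:
            inverse = set()
            for x in splitter:
                inverse |= inv.get(x, set())
            for j in range(len(blocks)):
                common = blocks[j] & inverse
                if not common or common == blocks[j]:
                    continue               # B[j] is untouched or entirely inside INVERSE
                q = len(blocks)
                blocks.append(common)
                blocks[j] = blocks[j] - common
                if j in waiting:
                    waiting.add(q)
                elif len(blocks[j]) <= len(blocks[q]):
                    waiting.add(j)
                else:
                    waiting.add(q)
    return {frozenset(b) for b in blocks}

f = {1: 2, 2: 1, 3: 4, 4: 3, 5: 5}
result = coarsest_partition([f], [{1, 3}, {2, 4, 5}])
print(sorted(sorted(b) for b in result))   # [[1, 3], [2, 4], [5]]
```

Adding only the smaller half of a split block to WAITING when the block was not already waiting is the step that keeps the total work low in the efficient version.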

Generalizations

• Identify control flow structures and generate special functions

• Handle arrays with ACCESS, UPDATE

• But can we find all of Kildall’s expressions in O(E log E)?

Conditional Constant Propagation

• Conditions with constant values can be interpreted to improve precision

• A more precise solution is obtained “optimistically”

char *Red = "red";
char *Yellow = "yellow";
char *Orange = "orange";

main()
{
    FRUIT snack;
    VARIETY t1; SHAPE t2; COLOR t3;
    t1 = APPLE;
    t2 = ROUND;
    switch (t1) {
    case APPLE:  t3 = Red;
                 break;
    case BANANA: t3 = Yellow;
                 break;
    case ORANGE: t3 = Orange;
    }
    printf("%s\n", t3);
}

After conditional constant propagation:

main()
{
    printf("%s\n", "red");
}


Iterative Data-Flow Algorithm

Input: a flow graph G = (N, E, r)
       an initial value Init
       a monotonic function F_B for every B in N
Output: in(B) for every B in N
Initialization:
    in(Entry) := Init
    for each node B in N − {Entry} do in(B) := ⊥
    WL := N − {Entry}
Iteration:
    while WL != {} do
        select and remove a node B from WL
        out := F_B(in(B))
        for all B′ in succ(B) such that in(B′) != in(B′) ⊔ out do
            in(B′) := in(B′) ⊔ out
            WL := WL ∪ {B′}

Iterative Data-Flow Algorithm (worklist initialized to the entry node)

Input: a flow graph G = (N, E, r)
       an initial value Init
       a monotonic function F_B for every B in N
Output: in(B) for every B in N
Initialization:
    in(Entry) := Init
    for each node B in N − {Entry} do in(B) := ⊥
    WL := {Entry}
Iteration:
    while WL != {} do
        select and remove a node B from WL
        out := F_B(in(B))
        for all B′ in succ(B) such that in(B′) != in(B′) ⊔ out do
            in(B′) := in(B′) ⊔ out
            WL := WL ∪ {B′}
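The worklist scheme that starts from the entry node can be sketched generically in Python. The instance used here is a toy analysis chosen only for illustration (sets of possibly assigned variables, with set union as the join and the empty set as ⊥); the graph, the `defs` table and the function name `solve` are all assumptions.

```python
def solve(succ, entry, init, bottom, transfer, join):
    """Push-style worklist iteration; returns in(B) for every node B."""
    value_in = {b: bottom for b in succ}
    value_in[entry] = init
    worklist = [entry]
    while worklist:
        b = worklist.pop()
        out = transfer(b, value_in[b])
        for s in succ[b]:
            merged = join(value_in[s], out)
            if merged != value_in[s]:      # only changed successors are revisited
                value_in[s] = merged
                worklist.append(s)
    return value_in

# A diamond inside a loop: entry -> a -> {b, c} -> d -> a
succ = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
defs = {"entry": {"i"}, "a": set(), "b": {"j"}, "c": {"k"}, "d": {"l"}}

result = solve(
    succ,
    entry="entry",
    init=frozenset(),
    bottom=frozenset(),
    transfer=lambda b, v: v | defs[b],     # monotonic: adds the variables defined in b
    join=lambda x, y: x | y,
)
print(sorted(result["d"]))                 # ['i', 'j', 'k', 'l']
```

Since every transfer function is monotonic and the lattice is finite, the values only grow and the loop stops at the least fixed point regardless of the order in which nodes are taken from the worklist.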




Conditional Constant Propagation

• initialize the worklist to the entry node
• mark all edges as not executable
• repeat until the worklist is empty:
– select and remove a node from the worklist
– if it is an assignment then mark the successor edge as executable
– if it is a test then symbolically evaluate the test and mark the enabled successor edges as executable
    • if the test evaluates to true or ⊤ (not a constant), mark the true edge executable
    • if the test evaluates to false or ⊤ (not a constant), mark the false edge executable
– update the value of all the variables at the entry and exit of this node
– if there are changes then add all successors reachable from the node with edges marked executable to the worklist
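The steps above can be sketched on the fruit example. The encoding is hypothetical: the switch is unfolded into a chain of equality tests, assignments only store constants, node names `n1` to `n9` are invented, and the strings `"bot"` and `"top"` stand for ⊥ and ⊤ (so they must not occur as program constants).

```python
BOT, TOP = "bot", "top"

def join(a, b):
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def ccp(program, entry):
    """program: node -> ("assign", var, const, next)
                      | ("test", var, const, true_next, false_next) | ("exit",)"""
    env_in = {entry: {}}          # variable values at the entry of visited nodes
    executable = set()            # flow edges marked executable
    worklist = [entry]
    while worklist:
        n = worklist.pop()
        env, instr = env_in[n], program[n]
        out, targets = dict(env), []
        if instr[0] == "assign":
            _, var, const, nxt = instr
            out[var] = const
            targets = [nxt]
        elif instr[0] == "test":
            _, var, const, if_true, if_false = instr
            v = env.get(var, BOT)
            if v == TOP or v == const:
                targets.append(if_true)
            if v == TOP or (v != BOT and v != const):
                targets.append(if_false)
        for t in targets:
            first_time = (n, t) not in executable
            executable.add((n, t))
            old = env_in.get(t, {})
            merged = {x: join(old.get(x, BOT), out.get(x, BOT))
                      for x in old.keys() | out.keys()}
            if first_time or merged != old:
                env_in[t] = merged
                worklist.append(t)
    return env_in, executable

program = {
    "n1": ("assign", "t1", "APPLE", "n2"),
    "n2": ("assign", "t2", "ROUND", "n3"),
    "n3": ("test", "t1", "APPLE", "n4", "n5"),
    "n4": ("assign", "t3", "red", "n9"),
    "n5": ("test", "t1", "BANANA", "n6", "n7"),
    "n6": ("assign", "t3", "yellow", "n9"),
    "n7": ("test", "t1", "ORANGE", "n8", "n9"),
    "n8": ("assign", "t3", "orange", "n9"),
    "n9": ("exit",),              # printf("%s\n", t3)
}
env, edges = ccp(program, "n1")
print(env["n9"]["t3"])            # red
```

The BANANA and ORANGE cases are never placed on the worklist, so their assignments to t3 do not pollute the value that reaches the printf.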

Sparse Conditional Constant Propagation

• bring the program into SSA form
• initialize the analysis information:
– all variables are mapped to ⊥
– all flow edges are marked as not executable
• initialize the two worklists
– Flow-Worklist contains all edges of the flow graph with the entry node as source
– SSA-Worklist is empty
• repeat until both worklists are empty:
– select and remove an edge from one of the worklists
– if it is a flow edge then
    • if the edge is not marked executable then
        – mark it executable
        – if the target of the edge is a φ-node then call visit-φ
        – if it is the first time the node is visited (only one incoming flow edge is marked executable) and it is a normal node then call visit-instr
– if it is an SSA edge then
    • if the target of the edge is a φ-node then call visit-φ
    • if it is a normal node and at least one of the flow edges entering the node is marked executable then call visit-instr

• visit-φ: (the node is a φ-node)
– the assigned variable is given a value that is the join of the values of the arguments with incoming edges marked executable
• visit-instr: (the node is a normal node)
– determine the value of the expression of the node and update the variable in case of an assignment
– if there are changes then
    • if the node is an assignment then add all SSA edges with source at the target of the current edge to the SSA-worklist
    • if the node is a test then add all relevant flow edges to the Flow-worklist and mark them executable
        – if the test evaluates to true or ⊤ (not a constant): add the true edge
        – if the test evaluates to false or ⊤ (not a constant): add the false edge
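The visit-φ step is small enough to show in isolation. In this sketch the function name, the pairing of each φ argument with its predecessor block, and the strings `"bot"` and `"top"` for ⊥ and ⊤ are assumptions; the example is L3 ← φ7(L1, L2) with L1 = 1 arriving from block 5 and L2 = 2 from block 6.

```python
BOT, TOP = "bot", "top"

def join(a, b):
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def visit_phi(block, args, value, executable):
    """Join the values of the arguments whose incoming flow edge is executable.

    args: list of (predecessor block, SSA name), one per phi argument.
    """
    result = BOT
    for pred, name in args:
        if (pred, block) in executable:
            result = join(result, value.get(name, BOT))
    return result

value = {"L1": 1, "L2": 2}
args = [(5, "L1"), (6, "L2")]

only_true_edge = visit_phi(7, args, value, {(5, 7)})
both_edges = visit_phi(7, args, value, {(5, 7), (6, 7)})
print(only_true_edge, both_edges)   # 1 top
```

Ignoring arguments on non-executable edges is what lets L3 stay constant when the analysis has proved that only one branch can be taken.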
