Compiler Lecs End Sem Set5

download Compiler Lecs End Sem Set5

of 76

Transcript of Compiler Lecs End Sem Set5

  • 8/18/2019 Compiler Lecs End Sem Set5

    1/76

    Dataflow Analysis

    Simple optimizations over a basic block may be extended overthe entire CFGExample: Global Constant Propagation

    22-Apr-15 1CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    2/76

    Dataflow Analysis

    To replace a use of x by a constant k we must know:

    On every path to the use of x, the last assignment to x is x = k

    22-Apr-15 2CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    3/76

    Global Optimization

    The correctness condition is not trivial to check Paths include paths around loops and through branches of

    conditionalsGenerally global optimization depends on knowing a property X at aparticular point in program execution

    22-Apr-15 3CSE346:Compilers, IIT Guwahati

    It is OK to be conservative. If the optimization requires X to be true, thenwant to know either

    X is definitely trueDon’t know if X is trueIt is always safe to say “don’t know”

  • 8/18/2019 Compiler Lecs End Sem Set5

    4/76

    Global Constant Propagation

    We must know whether:On every path to the use of x, the last assignment to x is x = k

    We associate one of the following values with x at every program point: ┴ (Bottom) : This statement never executesC : x equals to constant C

    22-Apr-15 4CSE346:Compilers, IIT Guwahati

    ┬ (Top) : x is not a constant

    For each statement s, the value of x immediately before and after x iscalculated

    C(s, x, in): Value of x before sC(s, x, out): Value of x after s

  • 8/18/2019 Compiler Lecs End Sem Set5

    5/76

    Global Constant Propagation

    Rule #1

    22-Apr-15 5CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    6/76

    Global Constant Propagation

    • Rule #2

    22-Apr-15 6CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    7/76

    Global Constant Propagation

    • Rule #3

    22-Apr-15 7CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    8/76

    Global Constant Propagation

    • Rule #4

    22-Apr-15 8CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    9/76

    Global Constant Propagation

    • Rule #5

    22-Apr-15 9CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    10/76

    Global Constant Propagation

    • Rule #6

    22-Apr-15 10CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    11/76

    Global Constant Propagation

    • Rule #7

    22-Apr-15 11CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    12/76

    Global Constant Propagation

    • Rule #8

    22-Apr-15 12CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    13/76

    Global Constant Propagation

    For every entry s to the program, set C(s, x, in) = ┬ Set C(s, x, in) = C(s, x, out) = ┴ everywhere elseRepeat until all points satisfy 1-8:

    Pick s not satisfying 1-8 and update using the appropriate rule

    22-Apr-15 13CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    14/76

    Analysis of Loops

    How can global constant propagation for the following CFG beperformed?

    22-Apr-15 14CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    15/76

    Liveness Analysis

    The first value of x is dead (never used)The second value of x is live (may be used)

    A variable x is live at statement s if

    22-Apr-15 15CSE346:Compilers, IIT Guwahati

    There exists a statement s’ that uses xThere is a path from s to s’That path has no intervening assignment to x

    A statement x = … is dead code if x is dead after the assignmentDead statements can be deleted from the program

  • 8/18/2019 Compiler Lecs End Sem Set5

    16/76

    Liveness Analysis

    Liveness can be expressed in terms of information transferredbetween adjacent statements, just as in copy propagation

    Rule #1

    22-Apr-15 16CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    17/76

    Liveness Analysis

    Rule #2

    22-Apr-15 17CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    18/76

    Liveness Analysis

    Rule #3

    22-Apr-15 18CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    19/76

    Liveness Analysis

    Rule #4

    22-Apr-15 19CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    20/76

    Liveness Analysis

    Let all L(…) = false initiallyRepeat until all statements s satisfy rules 1-4

    Pick s where one of 1-4 does not hold and update usingthe appropriate rule

    22-Apr-15 20CSE346:Compilers, IIT Guwahati

    x = 0

    if (x == 10)

    x= x + 1x is dead

  • 8/18/2019 Compiler Lecs End Sem Set5

    21/76

    Register Allocation

    The process of assigning a large number of target programvariables onto a small number of CPU registers

    Method: Assign multiple temporaries to each register without changingthe program behavior

    22-Apr-15 21CSE346:Compilers, IIT Guwahati

    a := c + de := a + b

    f := e - 1

    r1 := r 2 + r 3r1 := r 1 + r 4

    r1 := r 1 - 1a , e and f can all be allocated tothe same register assuming aand e are dead after use

  • 8/18/2019 Compiler Lecs End Sem Set5

    22/76

    Register Allocation via Graph Colouring

    Basic Idea: Two t emporary variables t 1 and t 2 can share thesame register if at any point in the program at most one of t 1 or

    t 2 is live.

    22-Apr-15 22CSE346:Compilers, IIT Guwahati

    The variables alivesimultaneously at

    each program point

  • 8/18/2019 Compiler Lecs End Sem Set5

    23/76

    Register Allocation via Graph Colouring

    Register Interference Graph (RIG)An undirected graph

    Each temporary variable is a nodeTwo temporary variables t 1 and t 2 share an edge if theyare simultaneously alive at some program point

    22-Apr-15 23CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    24/76

    Register Allocation via Graph Colouring

    Graph Colouring: A n assignment of colours to nodes, such thatnodes connected by an edge have different colors

    k-coloring : A coloring using at most k colours.Chromatic number: The smallest number of colours needed to colour agraphIndependent set: A subset of vertices assigned to the same colour

    22-Apr-15 24CSE346:Compilers, IIT Guwahati

    k -coloring is the same as a partition of the vertex set into k independentsets

    The terms k-partite and k-colourable are equivalent

    The graph colouring problem is NP-Hard Heuristics needed to solve it

  • 8/18/2019 Compiler Lecs End Sem Set5

    25/76

    Register Allocation via Graph Colouring

    In the register allocation problem, colours = registersWe need to assign colours (registers) to graph nodes (temporaries)Let k = number of machine registersIf the RIG is k-colourable then there is a register assignment that uses nomore than k registers

    22-Apr-15 25CSE346:Compilers, IIT Guwahati

    In our example RIG thereis no coloring with less

    than 4 colours

  • 8/18/2019 Compiler Lecs End Sem Set5

    26/76

    Register Allocation via Graph Colouring

    Under the colouring, the code becomes:

    22-Apr-15 26CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    27/76

    Register Allocation via Graph Colouring

    A heuristic algorithm:

    Pick a node t with fewer than k neighborsPut t on a stack and remove it from the RIGRepeat until the graph is empty

    22-Apr-15 27CSE346:Compilers, IIT Guwahati

    Assign colors to nodes on the stack:Start with the last node addedAt each step pick a color different from those assigned toalready colored neighbors

  • 8/18/2019 Compiler Lecs End Sem Set5

    28/76

    Register Allocation via Graph Colouring

    • Example: Let k = 4•

    Initial RIG:Stack: {}

    22-Apr-15 28CSE346:Compilers, IIT Guwahati

    Remove a

  • 8/18/2019 Compiler Lecs End Sem Set5

    29/76

    Register Allocation via Graph Colouring

    Step 2:Stack: {a}

    22-Apr-15 29CSE346:Compilers, IIT Guwahati

    Remove d

  • 8/18/2019 Compiler Lecs End Sem Set5

    30/76

    Register Allocation via Graph Colouring

    • All nodes now have fewer than 4 nodes•

    Step 3:Stack: { d , a}

    22-Apr-15 30CSE346:Compilers, IIT Guwahati

    • any no e• Continue removing nodes until

    the graph is empty• Let the stack be: { f , e, b, c, d ,

    a} after removal of all nodes

  • 8/18/2019 Compiler Lecs End Sem Set5

    31/76

    Register Allocation via Graph Colouring

    • Now start assigning colours to the nodes, starting from the top

    of the stack• Stack: { f , e, b, c, d , a}• Stack: { e, b, c, d , a}

    22-Apr-15 31CSE346:Compilers, IIT Guwahati

    • Stack: { b, c, d , a}

    • Stack: { c, d , a}

  • 8/18/2019 Compiler Lecs End Sem Set5

    32/76

    Register Allocation via Graph Colouring

    • Stack: { d , a}

    • Stack: { a}(d and b can have

    22-Apr-15 32CSE346:Compilers, IIT Guwahati

    the same register)

    • Stack: {} }( a and e can havethe same register)

  • 8/18/2019 Compiler Lecs End Sem Set5

    33/76

    What if the heuristic fails?

    Example: Try to do a 3-colouring of the graph:

    22-Apr-15 33CSE346:Compilers, IIT Guwahati

    Remove a andget stuck

  • 8/18/2019 Compiler Lecs End Sem Set5

    34/76

    What if the heuristic fails?

    Pick a node as a candidate for spilling A spilled temporary lives in memory

    Assume we choose f as a candidate for spilling

    22-Apr-15 34CSE346:Compilers, IIT Guwahati

    • The algorithm now succeeds: b, d , e, c

    Remove f andcontinue

  • 8/18/2019 Compiler Lecs End Sem Set5

    35/76

    What if the heuristic fails?

    On the assignment phase we get to the point when we have to assign acolor to f

    We hope that among the 4 neighbors of f we use less than 3 colors ⇒optimistic coloring

    22-Apr-15 35CSE346:Compilers, IIT Guwahati

    • The algorithm now succeeds: b, d , e, c

    Remove f andcontinue

  • 8/18/2019 Compiler Lecs End Sem Set5

    36/76

    Spilling

    Since optimistic coloring failed we must spill temporary f

    We must allocate a memory location as the home of f Typically this is in the current stack frame

    Call this address fa

    22-Apr-15 36CSE346:Compilers, IIT Guwahati

    Before each operation that uses f , insert

    f := load fa

    After each operation that defines f , insert

    store f , fa

  • 8/18/2019 Compiler Lecs End Sem Set5

    37/76

    Spilling

    • Original code:

    22-Apr-15 37CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    38/76

    Spilling

    • Code after spilling:

    22-Apr-15 38CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    39/76

    Spilling

    • Re-compute Liveness:

    22-Apr-15 39CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    40/76

    Spilling

    • The new liveness information is almost as before•

    f i is live only• Between a f i:= load fa and the next instruction• Between a store f i, fa and the preceding instruction

    22-Apr-15 40CSE346:Compilers, IIT Guwahati

    • Spilling reduces the live range of f and thus reduces its interferences• Which result in fewer neighbors in RIG for f

  • 8/18/2019 Compiler Lecs End Sem Set5

    41/76

    Spilling

    With the new liveness information, we need to rebuild the RIG

    And try to colour the resulting graph again

    22-Apr-15 41CSE346:Compilers, IIT Guwahati

    Now f only interfaces with c and d

    The new RIG is 3-colourable

  • 8/18/2019 Compiler Lecs End Sem Set5

    42/76

    Spilling

    Additional spills might be required before a coloring is found

    The tricky part is deciding what to spillBut any choice is correct

    22-Apr-15 42CSE346:Compilers, IIT Guwahati

    Possible heuristics:Spill temporaries with most conflicts

    Spill temporaries with few definitions and uses

    Avoid spilling in inner loops

  • 8/18/2019 Compiler Lecs End Sem Set5

    43/76

    Automatic Parallelization

    Automatic conversion of sequential programs to parallel programs bya compiler

    Target may be a vector processor (vectorization), a multi-coreprocessors (concurrentization), or a cluster of loosely-coupleddistributed memory processors (parallelization).

    22-Apr-15 43CSE346:Compilers, IIT Guwahati

    Parallelism extraction process is normally a source-to-sourcetransformation process

    Requires dependence analysis to determine dependence betweenstatements

    Extracting available parallelism is not always possible.

  • 8/18/2019 Compiler Lecs End Sem Set5

    44/76

    Automatic Parallelization: Example 1

    22-Apr-15 44CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    45/76

    Automatic Parallelization: Example 1

    22-Apr-15 45CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    46/76

    Automatic Parallelization: Example 1

    22-Apr-15 46CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    47/76

    Dependences : Example 1

    22-Apr-15 47CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    48/76

    Iteration Vector

    DefinitionGiven a nest of n loops, the iteration vector i of a particular iterationof the innermost loop is a vector of integers

    i = ( i1, i2, …, in)where ik represents the iteration number for the loop at nesting level k

    48

    The set of all possible iteration vectors is an iteration space

  • 8/18/2019 Compiler Lecs End Sem Set5

    49/76

    Iteration Vector Example

    The iteration space of thestatement at S 1 is {(1,1),(2,1), (2,2), (3,1), (3,2),

    DO I = 1, 3DO J = 1, I

    S 1 A(I,J) = A(J,I)ENDDO

    49

    ,

    At iteration i = (2,1) thevalue of A(1,2) isassigned to A(2,1)

    (1,1)

    (2,1) (2,2)

    (3,2) (3,3)(3,1)

    I

    J

  • 8/18/2019 Compiler Lecs End Sem Set5

    50/76

    Iteration Vector Ordering

    The iteration vectors are naturally ordered according to alexicographical order , e.g. iteration (1,2) precedes (2,1) and (2,2) in theexample on the previous slide

    Definition

    50

    terat on prece es terat on , enote < ,1) i[1:n-1] < j[1:n-1], or2) i[1:n-1] = j[1:n-1] and in < jn

  • 8/18/2019 Compiler Lecs End Sem Set5

    51/76

    Loop Dependence

    DefinitionThere exist a dependence from S 1 to S 2 in a loop nest iff there existtwo iteration vectors i and j such that

    1) i < j and there is a path from S 1 to S 2

    51

    2) S 1 accesses memory location M on iteration i and S 2 accessesmemory location M on iteration j

    3) one of these accesses is a write

  • 8/18/2019 Compiler Lecs End Sem Set5

    52/76

    Loop Dependence Example

    Show that the example loop hasno loop dependencesAnswer: there are no iterationvectors i and j in {(1,1), (2,1),

    DO I = 1, 3DO J = 1, I

    S 1 A(I,J) = A(J,I)ENDDO

    52

    (2,2), (3,1), (3,2), (3,3)} such thati < j and either S 1 in i writes to thesame element of A that is read atS 1 in iteration j, or S 1 in iteration ireads an element A that is writtenin iteration j

    L R d i T f i

  • 8/18/2019 Compiler Lecs End Sem Set5

    53/76

    Loop Reordering Transformations

    Definitions

    A reordering transformation is any program transformation thatchanges the execution order of the code, without adding or deletingany statement executions

    53

    A reordering transformation preserves a dependence if it preservesthe relative execution order of the source and sink of thatdependence

    F d l Th f D d

  • 8/18/2019 Compiler Lecs End Sem Set5

    54/76

    Fundamental Theorem of Dependence

    Any reordering transformation that preserves everydependence in a program preserves the meaning of

    54

    V lid T f ti

  • 8/18/2019 Compiler Lecs End Sem Set5

    55/76

    Valid Transformations

    A transformation is said to be valid for the program to which it appliesif it preserves all dependences in the program

    55

    DO I = 1, 3S 1 A(I+1) = B(I)S 2 B(I+1) = A(I)

    ENDDO

    DO I = 1, 3B(I+1) = A(I)

    A(I+1) = B(I)ENDDO

    DO I = 3, 1, -1 A(I+1) = B(I)

    B(I+1) = A(I)ENDDO

    valid invalid

    D d d L T f ti

  • 8/18/2019 Compiler Lecs End Sem Set5

    56/76

    Dependences and Loop Transformations

    Loop dependences are tested before a transformation is appliedWhen a dependence test is inconclusive, dependence must be assumed

    In this example the value of K is unknown and loop dependence isassumed:

    56

    ,

    S 1 A(I+K) = A(I) + B(I)ENDDO

    Dependence Distance Vector

  • 8/18/2019 Compiler Lecs End Sem Set5

    57/76

    Dependence Distance Vector

    DefinitionSuppose that there is a dependence from S 1 on iteration i to S 2 oniteration j; then the dependence distance vector d (i, j) is defined as

    d (i, j) = j - iExample:

    57

    DO I = 1, 3DO J = 1, IS 1 A(I+1,J) = A(I,J)

    ENDDOENDDO

    True dependence between S 1 and itself oni = (1,1) and j = (2,1): d (i, j) = (1,0)i = (2,1) and j = (3,1): d (i, j) = (1,0)i = (2,2) and j = (3,2): d (i, j) = (1,0)

    Dependence Direction Vector

  • 8/18/2019 Compiler Lecs End Sem Set5

    58/76

    Dependence Direction Vector

    DefinitionSuppose that there is a dependence from S 1 on iteration i and S 2 oniteration j; then the dependence direction vector D(i, j) is defined as

    “ ”

    58

    D(i, j)k =

    ,

    “=” if d (i, j)k = 0“>” if d (i, j)k < 0

    Di ti V t E l 1

  • 8/18/2019 Compiler Lecs End Sem Set5

    59/76

    Direction Vectors: Example 1

    22-Apr-15 59CSE346:Compilers, IIT Guwahati

    Direction Vectors: Example 2

  • 8/18/2019 Compiler Lecs End Sem Set5

    60/76

    p

    22-Apr-15 60CSE346:Compilers, IIT Guwahati

    Direction Vectors: Example 3

  • 8/18/2019 Compiler Lecs End Sem Set5

    61/76

    p

    22-Apr-15 61CSE346:Compilers, IIT Guwahati

    Direction Vectors: Example 4

  • 8/18/2019 Compiler Lecs End Sem Set5

    62/76

    22-Apr-15 62CSE346:Compilers, IIT Guwahati

    Data Dependence Graphs and Vectorization

  • 8/18/2019 Compiler Lecs End Sem Set5

    63/76

    Data Dependence Graphs and Vectorization

    Data Dependence Graph (DDG) : encodes the flow of data.

    Node: program statementEdge: connects two nodes if one uses the result of the otherUseful in examining the legality of program transformations

    22-Apr-15 63CSE346:Compilers, IIT Guwahati

    Vectorization possible if DDG is acyclicVector code can be generated using topological sort order

    Otherwise, find all strongly connected components (SSCs), make

    DDG acyclic by treating each SSC as a single node.SSCs cannot be fully vectorized; resultant code will contain somesequential loops and possibly some vector code.

    Vectorization: Example 1

  • 8/18/2019 Compiler Lecs End Sem Set5

    64/76

    22-Apr-15 64CSE346:Compilers, IIT Guwahati

    Vectorization: Example 2

  • 8/18/2019 Compiler Lecs End Sem Set5

    65/76

    22-Apr-15 65CSE346:Compilers, IIT Guwahati

    Vectorization: Example 2

  • 8/18/2019 Compiler Lecs End Sem Set5

    66/76

    22-Apr-15 66CSE346:Compilers, IIT Guwahati

    Example 3

  • 8/18/2019 Compiler Lecs End Sem Set5

    67/76

    22-Apr-15 67CSE346:Compilers, IIT Guwahati

    Example 3

  • 8/18/2019 Compiler Lecs End Sem Set5

    68/76

    22-Apr-15 68CSE346:Compilers, IIT Guwahati

  • 8/18/2019 Compiler Lecs End Sem Set5

    69/76

    Loop-Independent Dependences

  • 8/18/2019 Compiler Lecs End Sem Set5

    70/76

    • Definition– Statement S 1 has a loop-independent dependence on S 2 iff there exist two iteration vectors i

    and j such that

    1) S 1 refers to memory location M on iteration i, S 2 refers to M on iteration j, and i = j

    70

    2) there is a control flow path from S 1

    to S 2

    within the iteration

  • 8/18/2019 Compiler Lecs End Sem Set5

    71/76

  • 8/18/2019 Compiler Lecs End Sem Set5

    72/76

    Example (1)

    DO I = 1, 3

    • The loop-carried dependenceS 1 δ (

  • 8/18/2019 Compiler Lecs End Sem Set5

    73/76

    Example (2)

    DO I = 1, 10DO J = 1, 10

    • All loop-carried dependencesare of level 3 because D i =

    73

    DO K = 1, 10

    S 1 A(I,J,K+1) = A(I,J,K)ENDDOENDDO

    ENDDO

    S 1 δ (=,=,

  • 8/18/2019 Compiler Lecs End Sem Set5

    74/76

    (1)

    • Let T be a transformation on a loop nest that does not rearrange thestatements in the body of the innermost loop; then T is valid if, after it isapplied, none of the direction vectors for dependences with source andsink in the nest has a leftmost non-“=” component that is “>”

    74

    Direction Vectors and Reordering Transformations(2)

  • 8/18/2019 Compiler Lecs End Sem Set5

    75/76

    (2)

    • A reordering transformation preserves all level- k dependences if

    1) it preserves the iteration order of the level- k loop,

    2 if it does not interchan e an loo at level < k to a osition

    75

    inside the level- k loop, and

    3) if it does not interchange any loop at level > k to a positionoutside the level- k loop

    Examples

  • 8/18/2019 Compiler Lecs End Sem Set5

    76/76

    DO I = 1, 10DO J = 1, 10

    DO K = 1, 10S 1 A(I+1,J+2,K+3) = A(I,J,K) + B

    ENDDOENDDO

    ENDDO

    DO I = 1, 10S 1 F(I+1) = A(I)

    S 2 A(I+1) = F(I)ENDDO

    Find deps.

    76

    Both S 1 δ (