Dataflow II: Finish Dataflow Analysis, Start on Classical Optimizations

EECS 483, Lecture 24. University of Michigan. Wednesday, November 29, 2006.

  • Dataflow II: Finish Dataflow Analysis, Start on Classical Optimizations
    EECS 483 Lecture 24
    University of Michigan
    Wednesday, November 29, 2006

    - * -

    Announcements and Reading
      Project 3: should have started work on this
      Schedule for the rest of the semester
        Today: Dataflow analysis
        Wed 11/29: Finish dataflow, optimizations
        Mon 12/4: Optimizations, start on register allocation
        Wed 12/6: Register allocation, Exam 2 review
        Mon 12/11: Exam 2 in class
        Wed 12/13: No class (Project 3 due)
      Reading for today's class: 10.5, 10.6, 10.10, 10.11

    - * -

    Class Problem From Last Time
      Reaching definitions: calculate GEN/KILL, then calculate IN/OUT
        1: r1 = 3
        2: r2 = r3
        3: r3 = r4
        4: r1 = r1 + 1
        5: r7 = r1 * r2
        6: r2 = 0
        7: r2 = r2 + 1
        8: r4 = r2 + r1
        9: r9 = r4 + r8
      GEN/KILL sets, in slide order (CFG figure not reproduced)
        GEN = 1,2,3    KILL = 4,6,7
        GEN = 4,5      KILL = 1
        GEN = 7        KILL = 2,6
        GEN = 8        KILL = (empty)
        GEN = 9        KILL = (empty)
        GEN = 6        KILL = 2,7
      IN sets, in slide order (where the slide shows two sets, both are listed, separated by "|")
        IN = (empty)
        IN = 1,2,3 | 1,2,3,4,5,6,7,8
        IN = 2,3,4,5 | 2,3,4,5,6,7,8
        IN = 3,4,5,6,7,8
        IN = 3,4,5,6,7 | 3,4,5,6,7,8
        IN = 2,3,4,5 | 2,3,4,5,6,7,8
      OUT sets, in slide order
        OUT = 1,2,3
        OUT = 2,3,4,5 | 2,3,4,5,6,7,8
        OUT = 3,4,5,7 | 3,4,5,7,8
        OUT = 3,4,5,6,7,8 | 3,4,5,6,7,8
        OUT = 3,4,5,6,7,8,9
        OUT = 3,4,5,6 | 3,4,5,6,8

    - * -

    Some Things to Think About
      Liveness and reaching defs are basically the same thing!
      All dataflow is basically the same with a few parameters
        Meaning of gen/kill (use/def)
        Backward / Forward
        All paths / some paths (must/may)
          So far, we have looked at may analysis algorithms
          How do you adjust to do must algorithms?
      Dataflow can be slow
        How to implement it efficiently? (Block traversal order can speed things up)
        How to represent the info? (Bitvectors)

    - * -

    Generalizing Dataflow Analysis
      Transfer function
        How information is changed by something (BB)
        OUT = GEN + (IN - KILL)    forward analysis
        IN = GEN + (OUT - KILL)    backward analysis
      Meet function
        How information from multiple paths is combined
        IN = Union(OUT(predecessors))    forward analysis
        OUT = Union(IN(successors))    backward analysis
        Note, this is only for any path
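The transfer and meet functions map directly onto bitwise operations when each set is stored as a bitvector. The following is a minimal sketch, assuming definitions are numbered with small non-negative integers; the helper names (`to_bits`, `from_bits`, `transfer`, `meet_any_path`) are hypothetical and not from the lecture. The sample values assume a block with GEN = 4,5 and KILL = 1 receiving IN = 1,2,3.

```python
def to_bits(defs):
    """Pack a set of small non-negative definition numbers into one int."""
    bits = 0
    for d in defs:
        bits |= 1 << d
    return bits


def from_bits(bits):
    """Unpack an int bitvector back into a set of definition numbers."""
    return {i for i in range(bits.bit_length()) if (bits >> i) & 1}


def transfer(gen, kill, incoming):
    """OUT = GEN + (IN - KILL), with + as bitwise OR and - as AND-NOT."""
    return gen | (incoming & ~kill)


def meet_any_path(values):
    """Union over all incoming paths (the any-path meet)."""
    result = 0
    for v in values:
        result |= v
    return result


out = transfer(to_bits({4, 5}), to_bits({1}), to_bits({1, 2, 3}))
print(sorted(from_bits(out)))  # [2, 3, 4, 5]
```

With this representation, one transfer step costs a handful of word operations regardless of how many definitions are in the set, which is one reason bitvectors are a common choice.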

    - * -

    Generalized Dataflow Algorithm
      while (change)
        change = false
        for each BB
          apply meet function
          apply transfer function
          if any changes
            change = true
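As an illustration, the loop above can be sketched for one concrete instance: a forward, any-path analysis such as reaching definitions. This is a minimal sketch, not the course's implementation; the function name `solve_forward_may` and the three-block CFG (A falls into a loop block B, which exits to C) are assumptions made up for the example.

```python
def solve_forward_may(blocks, preds, gen, kill):
    """Iterate to a fixed point.

    Meet:     IN(b)  = union of OUT(p) over predecessors p
    Transfer: OUT(b) = GEN(b) + (IN(b) - KILL(b))
    """
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}
    change = True
    while change:
        change = False
        for b in blocks:
            # Meet function: combine information from all predecessors.
            new_in = set()
            for p in preds[b]:
                new_in |= out_sets[p]
            # Transfer function: push the information through the block.
            new_out = gen[b] | (new_in - kill[b])
            if new_in != in_sets[b] or new_out != out_sets[b]:
                change = True
            in_sets[b], out_sets[b] = new_in, new_out
    return in_sets, out_sets


# Hypothetical CFG: A -> B, B -> B (loop), B -> C.
# Definition 1 (in A) and definition 2 (in B) write the same variable.
blocks = ["A", "B", "C"]
preds = {"A": [], "B": ["A", "B"], "C": ["B"]}
gen = {"A": {1}, "B": {2}, "C": set()}
kill = {"A": {2}, "B": {1}, "C": set()}

in_sets, out_sets = solve_forward_may(blocks, preds, gen, kill)
print(in_sets["B"], out_sets["B"], in_sets["C"])
```

Visiting blocks in an order that follows the flow of information (here, A before B before C) tends to reduce the number of passes needed before nothing changes.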

    - * -

    Liveness Using GEN/KILL
      Liveness = upward exposed uses

      for each basic block in the procedure, X, do
        up_use_GEN(X) = 0
        up_use_KILL(X) = 0
        for each operation in reverse sequential order in X, op, do
          for each destination operand of op, dest, do
            up_use_GEN(X) -= dest
            up_use_KILL(X) += dest
          endfor
          for each source operand of op, src, do
            up_use_GEN(X) += src
            up_use_KILL(X) -= src
          endfor
        endfor
      endfor

    - * -

    Example - Liveness with GEN/KILL
      meet: OUT = Union(IN(succs))
      xfer: IN = GEN + (OUT - KILL)

      BB1
        r1 = MEM[r2+0]
        r2 = r2 + 1
        r3 = r1 * r4
      BB2
        r1 = r1 + 5
        r3 = r5 - r1
        r7 = r3 * 2
      BB3
        r2 = 0
        r7 = 23
        r1 = 4
      BB4
        r3 = r3 + r7
        r1 = r3 - r8
        r3 = r1 * 2

      up_use_GEN(1) = r2,r4    up_use_KILL(1) = r1,r3
      up_use_GEN(2) = r1,r5    up_use_KILL(2) = r3,r7
      up_use_GEN(3) = 0        up_use_KILL(3) = r1,r2,r7
      BB4, one op at a time in reverse order
        up_use_GEN(4.1) = r1          up_use_KILL(4.1) = r3
        up_use_GEN(4.2) = r3,r8       up_use_KILL(4.2) = r1
        up_use_GEN(4.3) = r3,r7,r8    up_use_KILL(4.3) = r1
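The GEN/KILL values in the example above can be reproduced with a short sketch of the reverse walk. The function name `up_use_gen_kill` and the encoding of each op as a `(dests, srcs)` pair are assumptions for illustration, as is the grouping of the twelve operations into four blocks of three, which is inferred from the GEN/KILL sets on the slide.

```python
def up_use_gen_kill(ops):
    """Compute upward-exposed-use GEN/KILL for one basic block.

    ops is a list of (dests, srcs) pairs in program order; the walk itself
    runs in reverse order, as in the pseudocode.
    """
    gen, kill = set(), set()
    for dests, srcs in reversed(ops):
        for d in dests:
            gen.discard(d)   # a def hides any use of d that comes later
            kill.add(d)
        for s in srcs:
            gen.add(s)       # a use is upward exposed until a def is seen
            kill.discard(s)
    return gen, kill


# Operands only; the arithmetic operators do not matter for liveness.
bb1 = [(["r1"], ["r2"]), (["r2"], ["r2"]), (["r3"], ["r1", "r4"])]
bb2 = [(["r1"], ["r1"]), (["r3"], ["r5", "r1"]), (["r7"], ["r3"])]
bb3 = [(["r2"], []), (["r7"], []), (["r1"], [])]
bb4 = [(["r3"], ["r3", "r7"]), (["r1"], ["r3", "r8"]), (["r3"], ["r1"])]

for name, block in [("BB1", bb1), ("BB2", bb2), ("BB3", bb3), ("BB4", bb4)]:
    gen, kill = up_use_gen_kill(block)
    print(name, "GEN =", sorted(gen), "KILL =", sorted(kill))
```

Note how `r2 = r2 + 1` in BB1 leaves `r2` in GEN and out of KILL: the source handling runs after the destination handling, so a register that is read and written by the same op stays upward exposed.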

    - * -

    Beyond Liveness (Upward Exposed Uses)
      Upward exposed defs
        IN = GEN + (OUT - KILL)
        OUT = Union(IN(successors))
        Walk ops reverse order
          GEN += dest; KILL += dest
      Downward exposed uses
        IN = Union(OUT(predecessors))
        OUT = GEN + (IN - KILL)
        Walk ops forward order
          GEN += src; KILL -= src;
          GEN -= dest; KILL += dest;
      Downward exposed defs
        IN = Union(OUT(predecessors))
        OUT = GEN + (IN - KILL)
        Walk ops forward order
          GEN += dest; KILL += dest;

    - * -

    What About All Path Problems?
      Up to this point
        Any path problems (maybe relations)
          Definition reaches along some path
          Some sequence of branches in which def reaches
          Lots of defs of the same variable may reach a point
        Use of Union operator in meet function
      All-path: Definition guaranteed to reach
        Regardless of sequence of branches taken, def reaches
        Can always count on this
        Only 1 def can be guaranteed to reach
        Availability (as opposed to reaching)
          Available definitions
          Available expressions (could also have reaching expressions, but not that useful)

    - * -

    Reaching vs Available Definitions
      1: r1 = r2 + r3
      2: r6 = r4 - r5

      3: r4 = 4
      4: r6 = 8
      5: r6 = r2 + r3
      6: r7 = r4 - r5

      Annotations from the CFG figure (figure not reproduced), in slide order
        1,2,3,4 reach; 1 available
        1,2 reach; 1,2 available
        1,3,4 reach; 1,3,4 available
        1,2 reach; 1,2 available

    - * -

    Available Definition Analysis (Adefs)
      A definition d is available at a point p if along all paths from d to p, d is not killed
      Remember, a definition of a variable is killed between 2 points when there is another definition of that variable along the path
        r1 = r2 + r3 kills previous definitions of r1
      Algorithm
        Forward dataflow analysis as propagation occurs from defs downwards
        Use the Intersect function as the meet operator to guarantee the all-path requirement
        GEN/KILL/IN/OUT similar to reaching defs
          Initialization of IN/OUT is the tricky part

    - * -

    Compute Adef GEN/KILL Sets
      for each basic block in the procedure, X, do
        GEN(X) = 0
        KILL(X) = 0
        for each operation in sequential order in X, op, do
          for each destination operand of op, dest, do
            G = op
            K = {all ops which define dest} - {op}
            GEN(X) = G + (GEN(X) - K)
            KILL(X) = K + (KILL(X) - G)
          endfor
        endfor
      endfor

      Exactly the same as reaching defs!
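A minimal sketch of this GEN/KILL computation, assuming each operation has a single destination and is identified by its number. The function name `def_gen_kill` and the `dest_of` table are hypothetical; the table is filled in from the nine operations of the class problem earlier in the lecture, and the block contents used below are inferred from the GEN sets shown there.

```python
def def_gen_kill(block_ops, dest_of):
    """GEN/KILL of definitions for one basic block.

    block_ops: op numbers in program order.
    dest_of:   map from every op number in the procedure to its destination.
    """
    gen, kill = set(), set()
    for op in block_ops:
        dest = dest_of[op]
        g = {op}
        # Every other op in the procedure that defines the same destination.
        k = {o for o, d in dest_of.items() if d == dest and o != op}
        gen = g | (gen - k)
        kill = k | (kill - g)
    return gen, kill


# Destinations of ops 1..9 from the reaching-definitions class problem.
dest_of = {1: "r1", 2: "r2", 3: "r3", 4: "r1", 5: "r7",
           6: "r2", 7: "r2", 8: "r4", 9: "r9"}

print(def_gen_kill([1, 2, 3], dest_of))
print(def_gen_kill([4, 5], dest_of))
print(def_gen_kill([6], dest_of))
```

The same function serves both reaching and available definitions; only the meet operator and the initialization differ between the two analyses.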

    - * -

    Compute Adef IN/OUT Sets
      U = universal set of all operations in the Procedure
      IN(0) = 0
      OUT(0) = GEN(0)
      for each basic block in procedure, W, (W != 0), do
        IN(W) = 0
        OUT(W) = U - KILL(W)
      endfor

      change = 1
      while (change) do
        change = 0
        for each basic block in procedure, X, do
          old_OUT = OUT(X)
          IN(X) = Intersect(OUT(Y)) for all predecessors Y of X
          OUT(X) = GEN(X) + (IN(X) - KILL(X))
          if (old_OUT != OUT(X)) then
            change = 1
          endif
        endfor
      endwhile
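The initialization and the intersect-based iteration can be sketched as follows. This is an illustration under stated assumptions, not the course's code: the function name `available_defs` is hypothetical, every non-entry block is assumed to have at least one predecessor, and the CFG is an assumed reading of the "Reaching vs Available Definitions" slide (BB1 holds ops 1-2, BB2 holds ops 3-4, BB3 holds ops 5-6, with edges BB1 to BB2, BB1 to BB3, and BB2 to BB3).

```python
def available_defs(blocks, entry, preds, gen, kill, universe):
    """All-path (must) forward analysis with intersection as the meet."""
    in_sets = {b: set() for b in blocks}
    # Optimistic start: everything not killed locally is assumed available.
    out_sets = {b: universe - kill[b] for b in blocks}
    out_sets[entry] = set(gen[entry])

    change = True
    while change:
        change = False
        for b in blocks:
            if b == entry:
                continue  # IN(entry) stays empty: nothing is available on entry
            old_out = out_sets[b]
            # Assumes preds[b] is non-empty for every non-entry block.
            in_sets[b] = set.intersection(*(out_sets[p] for p in preds[b]))
            out_sets[b] = gen[b] | (in_sets[b] - kill[b])
            if out_sets[b] != old_out:
                change = True
    return in_sets, out_sets


universe = {1, 2, 3, 4, 5, 6}
blocks = ["BB1", "BB2", "BB3"]
preds = {"BB1": [], "BB2": ["BB1"], "BB3": ["BB1", "BB2"]}
gen = {"BB1": {1, 2}, "BB2": {3, 4}, "BB3": {5, 6}}
# Ops 2, 4 and 5 all define r6, so each kills the other two.
kill = {"BB1": {4, 5}, "BB2": {2, 5}, "BB3": {2, 4}}

in_sets, out_sets = available_defs(blocks, "BB1", preds, gen, kill, universe)
print(sorted(in_sets["BB3"]))  # [1]
```

Starting the non-entry OUT sets at U - KILL rather than empty matters because intersection can only shrink a set: an empty start would leave nothing available anywhere inside a loop.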

    - * -

    Available Expression Analysis (Aexprs)
      An expression is a RHS of an operation
        r2 = r3 + r4, r3+r4 is an expression
      An expression e is available at a point p if along all paths from e to p, e is not killed
      An expression is killed between 2 points when one of its source operands is redefined
        r1 = r2 + r3 kills all expressions involving r1
      Algorithm
        Forward dataflow analysis
        Use the Intersect function as the meet operator to guarantee the all-path requirement
        Looks exactly like adefs, except GEN/KILL/IN/OUT are the RHSs of operations rather than the LHSs
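A sketch of the per-block GEN/KILL computation for expressions, mirroring the Adef update rule. The function name `aexpr_gen_kill`, the tuple encoding `(operator, src1, src2)`, and the sample block are assumptions made for illustration. One detail worth seeing in code: an op such as `r3 = r3 * r4` computes an expression and immediately kills it, so that expression must not end up in GEN.

```python
def aexpr_gen_kill(block_ops, all_exprs):
    """GEN/KILL of expressions for one basic block.

    block_ops: list of (dest, expr) in program order,
               where expr = (operator, src1, src2).
    all_exprs: every expression appearing anywhere in the procedure.
    """
    gen, kill = set(), set()
    for dest, expr in block_ops:
        # Expressions that read dest are killed by this op.
        killed = {e for e in all_exprs if dest in e[1:]}
        kill = killed | (kill - {expr})
        # Generate first, then remove anything this op's own def kills.
        gen = (gen | {expr}) - killed
    return gen, kill


all_exprs = {("*", "r6", "r9"), ("+", "r2", "1"), ("*", "r3", "r4"),
             ("+", "r1", "5"), ("-", "r1", "6"), ("*", "r3", "2")}

# Hypothetical block: r3 = r3 * r4; r8 = r3 * 2; r7 = r3 * r4
block = [("r3", ("*", "r3", "r4")),
         ("r8", ("*", "r3", "2")),
         ("r7", ("*", "r3", "r4"))]

gen, kill = aexpr_gen_kill(block, all_exprs)
print(sorted(gen), sorted(kill))
```

In this block both `r3 * r4` and `r3 * 2` are recomputed after the redefinition of `r3`, so they are available at the block's exit no matter what was available on entry.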

    - * -

    Class Problem - Aexprs Calculation
      Compute the Aexpr IN/OUT sets for each BB (CFG figure not reproduced)
        1: r1 = r6 * r9
        2: r2 = r2 + 1
        3: r5 = r3 * r4
        4: r1 = r2 + 1
        5: r3 = r3 * r4
        6: r8 = r3 * 2
        7: r7 = r3 * r4
        8: r1 = r1 + 5
        9: r7 = r1 - 6
        10: r8 = r2 + 1
        11: r1 = r3 * r4
        12: r3 = r6 * r9

    - * -

    Optimization: Put Dataflow To Work!
      Make the code run faster on the target processor
        Anything goes
          Look at benchmark kernels, what's the bottleneck?
          Invent your own optis
      Classes of optimization
        1. Classical (machine independent)
          Reducing operation count (redundancy elimination)
          Simplifying operations
        2. Machine specific
          Peephole optimizations
          Take advantage of specialized hardware features
        3. ILP enhancing
          Increasing parallelism
          Possibly increase instructions

    - * -

    Types of Classical Optimizations
      Operation-level: 1 operation in isolation
        Constant folding, strength reduction
        Dead code elimination (global, but 1 op at a time)
      Local: pairs of operations in same BB
        May or may not use dataflow analysis
      Global: again pairs of operations
        But, operations in different BBs
        Dataflow analysis necessary here
      Loop: body of a loop

    - * -

    Caveat
      Traditional compiler class
        Fancy implementations of optimizations, efficient algorithms
        Bla bla bla
        Spend entire class on 1 optimization
      For this class: go over concepts of each optimization
        What it is
        When can it be applied (set of conditions that must be satisfied)

    - * -

    Constant Folding
      Simplify operation based on values of src operands
        Constant propagation creates opportunities for this
      All constant operands
        Evaluate the op, replace with a move
          r1 = 3 * 4 → r1 = 12
          r1 = 3 / 0 → ??? Don't evaluate excepting ops! What about FP?
        Evaluate conditional branch, replace with BRU or noop
          if (1 < 2) goto BB2 → BRU BB2
          if (1 > 2) goto BB2 → convert to a noop
      Algebraic identities
        r1 = r2 + 0, r2 - 0, r2 | 0, r2 ^ 0, r2 << 0, r2 >> 0 → r1 = r2
        r1 = 0 * r2, 0 / r2, 0 & r2 → r1 = 0
        r1 = r2 * 1, r2 / 1 → r1 = r2
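The rules on this slide can be sketched as a small folding function. This is an illustration only: the name `fold`, the operand encoding (Python `int` for constants, `str` for register names), and the choice of truncating integer division are all assumptions, not part of the lecture. A return value of `None` means "leave the operation alone".

```python
def trunc_div(a, b):
    """Integer division that truncates toward zero (an assumed semantics)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def fold(op, a, b):
    """Try to simplify `dest = a op b`.

    Returns an int (move a constant), a register name (move a register),
    or None when no rule from the slide applies.
    """
    if isinstance(a, int) and isinstance(b, int):
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        if op == "*":
            return a * b
        if op == "/":
            # Do not evaluate an op that would raise an exception.
            return None if b == 0 else trunc_div(a, b)
        return None
    # Algebraic identities, exactly the ones listed on the slide.
    if b == 0 and op in ("+", "-", "|", "^", "<<", ">>"):
        return a
    if a == 0 and op in ("*", "/", "&"):
        return 0
    if b == 1 and op in ("*", "/"):
        return a
    return None


print(fold("*", 3, 4))      # 12
print(fold("/", 3, 0))      # None
print(fold("+", "r2", 0))   # r2
print(fold("*", 0, "r2"))   # 0
```

Only the operand positions shown on the slide are handled, so `0 + r2` is deliberately left untouched here even though it is also an identity.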

    - * -

    Strength Reduction
      Replace expensive ops with cheaper ones
        Constant propagation creates opportunities for this
      Power of 2 constants
        Mpy by power of 2: r1 = r2 * 8 → r1 = r2 << 3
        Div by power of 2: r1 = r2 / 4 → r1 = r2 >> 2
        Rem by power of 2: r1 = r2 REM 16 → r1 = r2 & 15
      More exotic
        Replace multiply by constant by sequence of shift and adds/subs
          r1 = r2 * 6
          r100 = r2