Enhancing the Role of Inlining in Effective Interprocedural Parallelization

Post on 22-Feb-2016



Jichi Guo, Mike Stiles, Qing Yi, Kleanthis Psarris

Problem
• Inter-procedural parallelization
  o Parallelization after inlining
    • Gains more parallelizable loops
    • Loses some previously parallelized loops
  o Inlining obscures the caller / callee structure
    • Missed parallel opportunities
  o Inlining increases code complexity

Goal
• Keep the gained parallelizable loops
• Prevent the lost parallelism
• Discover the missed opportunities

Solution
• Summarize the code using annotations
  o Express the underlying information
• Inline the annotations before parallelization
  o Pass the summarized information to the compiler
• Reverse-inline after parallelization
  o Revert inlining side effects
  o Maintain equivalence

Outline
• Innovations
• Problems of parallel + inline strategy
• Annotation language
• Annotation-based inlining technique
• Experiments
• Summary

Problems of parallel + inlining

• Parallel + inlining
  o Conventional inlining with heuristics and pre-transformations
    • Heuristics: code size
    • Transformations: linearization, forward substitution
  o Intra-procedural loop parallelization
    • Fortran do-all loop
• Goal
  o Gain loops in caller
• Problems
  o Lost loops in caller / callee
  o Missed loops in caller
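As a reminder of what "do-all" means here, the following minimal sketch (Python rather than Fortran, with hypothetical function names) contrasts a loop whose iterations are independent with one that carries a dependence between iterations. Only the first gives the same result under any iteration order, which is the property a parallelizer has to prove.

```python
def doall_body(a, b, i):
    # Each iteration touches only element i: no cross-iteration dependence.
    a[i] = b[i] * 2

def carried_body(a, b, i):
    # Iteration i reads what iteration i-1 wrote: a loop-carried dependence.
    a[i] = a[i - 1] + b[i]

def run(body, order, b):
    """Execute the loop body once per index, in the given order."""
    a = [0] * len(b)
    for i in order:
        body(a, b, i)
    return a

b = [1, 2, 3, 4]
forward, backward = [1, 2, 3], [3, 2, 1]

# Order does not matter for the do-all loop, but it does for the carried one.
doall_is_order_free = run(doall_body, forward, b) == run(doall_body, backward, b)
carried_is_order_free = run(carried_body, forward, b) == run(carried_body, backward, b)
```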

Problems of parallel + inlining

• Loss of parallelizable loops in caller / callee
  o Transformations that cause the loss
    • Forward substitution
    • Linearization
• Forward substitution of non-linear subscripts
  o Creates indirect array references
• Linearization of array dimensions
  o Obscures array shapes

Problems of parallel + inlining

• Forward substitution of non-linear subscripts
  o Creates indirect array references
    X2(I) ⇒ T(IX(7) + I)
    Y2(I) ⇒ T(IX(8) + I)
    Z2(I) ⇒ T(IX(9) + I)
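A small sketch of why this substitution hurts: before inlining, X2, Y2 and Z2 are distinct names, but afterwards all three are windows into the same array T, and whether they overlap depends on the run-time values in IX. The helper below is hypothetical and assumes each window covers `length` consecutive elements starting at its offset.

```python
def windows_disjoint(offsets, length):
    """True if the windows T(off+1 .. off+length) never overlap.

    `offsets` plays the role of IX(7), IX(8), IX(9); a compiler cannot
    evaluate this check statically when the offsets are only known at run time.
    """
    ordered = sorted(offsets)
    return all(nxt - cur >= length for cur, nxt in zip(ordered, ordered[1:]))

# Offsets far enough apart: the three windows are independent.
safe = windows_disjoint([0, 100, 200], length=100)

# Two windows overlap: writes through X2 may alias reads through Y2.
unsafe = windows_disjoint([0, 50, 200], length=100)
```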

Problems of parallel + inlining

• Linearization of array dimensions
  o Obscures array shapes
    PP(i, j, k) ⇒ PP(i + j*4 + k*16)
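The mapping above can be worked through in a few lines. This sketch assumes 0-based subscripts and leading dimensions of 4 and 4 (implied by the factors 4 and 16). It shows that distinct subscripts map to distinct offsets only while i and j stay within their bounds, which is the extra fact a dependence test has to prove once the array shape is gone.

```python
DIM_I, DIM_J = 4, 4  # assumed extents of the first two dimensions

def linearize(i, j, k):
    """Map a 3-D subscript to the 1-D offset i + j*4 + k*16."""
    return i + j * DIM_I + k * DIM_I * DIM_J

def delinearize(offset):
    """Recover (i, j, k); valid only when 0 <= i < 4 and 0 <= j < 4."""
    k, rest = divmod(offset, DIM_I * DIM_J)
    j, i = divmod(rest, DIM_I)
    return i, j, k

def collides(a, b):
    """True if two different subscripts land on the same linear offset."""
    return a != b and linearize(*a) == linearize(*b)

offset = linearize(1, 2, 3)          # 1 + 2*4 + 3*16
roundtrip = delinearize(offset)      # back to the original subscript

# An out-of-bounds i is indistinguishable from a step in j after linearization.
ambiguous = collides((4, 0, 0), (0, 1, 0))
```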

Problems of parallel + inlining

• Missed parallelizable loops in caller
  o Coding styles that cause the misses
    • Opaque compositional subroutines
      o A calls B, B calls C, C calls D, …
    • Array access
      o When it is difficult to determine which part is killed
    • Debugging and error checking
      o Statement that breaks the dependency is never executed
    • I/O statements
    • Indirect array references
      o ID = IDX(I), X = A(ID)

Problems of parallel + inlining

• Opaque compositional subroutines
  o A calls B, B calls C, C calls D, …

Problems of parallel + inlining

• Array access
  o Difficult to determine which part is killed
  o CTR computed at runtime

Problems of parallel + inlining

• Debugging and error checking
  o Statement that breaks the dependency is never executed
• I/O statements

Problems of parallel + inlining

• Indirect array references
  o IN ⇒ NODE
  o NODE ⇒ IREL
  o IREL ⇒ RHSB
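To illustrate why such chains block parallelization, the sketch below treats them as nested subscripts (an assumption about what the slide's arrows mean) and checks the property a compiler cannot see statically: whether the final subscript is one-to-one over the loop's iteration space. Helper names are hypothetical.

```python
def compose(*index_arrays):
    """Follow a chain of index arrays: i -> a1[i] -> a2[a1[i]] -> ..."""
    def lookup(i):
        for arr in index_arrays:
            i = arr[i]
        return i
    return lookup

def is_one_to_one(subscripts):
    """True if no subscript repeats, so no two iterations hit the same element."""
    return len(set(subscripts)) == len(subscripts)

# Toy index arrays standing in for NODE and IREL (values are made up).
node = [2, 0, 1]
irel = [1, 2, 0]
chain = compose(node, irel)
targets = [chain(i) for i in range(3)]

parallel_safe = is_one_to_one(targets)    # every iteration writes its own element
parallel_unsafe = is_one_to_one([0, 1, 1])  # two iterations would collide
```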

The annotation language

• Goal
  o Summarize information
  o Avoid ambiguity

The annotation language

• Restricted grammar
• Special operators
• Writing annotations

The annotation language

• Restricted grammar
  o Do-all loop only
  o No goto

The annotation language

• Special operators
  y = operator(x1, x2, …, xn)
  Purpose: abstract relation
  o Unknown operator
    • Relation is unknown
  o Generic functions
  o Unique operator
    • Relation is one-to-one, from X to Y

The annotation language

• Writing annotations
  o Eliminating adverse side effects
    • Preserve caller and callee if inlining breaks the dependency
  o Summarize opaque subroutines
    • Eliminate nested function calls
  o Array access
    • Specify the exact range that is read / modified
  o Debugging and error handling
    • Aggressive strategy: ignore checking statements
  o Indirect array references
    • Discover unique relation

The annotation language

• Summarize opaque subroutines
  o Eliminate nested function calls

The annotation language

• Array access
  o Specify the exact range that is read / modified
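One way to picture what an exact read / modified range buys: if each call is summarized by the array sections it reads and modifies, two calls can be declared independent by a simple overlap test. This is a sketch with a hypothetical `Section` type and summary layout, not the annotation language's actual syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Section:
    """A contiguous range array(lo:hi), bounds inclusive."""
    array: str
    lo: int
    hi: int

    def overlaps(self, other):
        return (self.array == other.array
                and self.lo <= other.hi and other.lo <= self.hi)

def conflict(a, b):
    """True if one summary modifies a section the other reads or modifies."""
    for w in a["modify"]:
        if any(w.overlaps(s) for s in b["read"] + b["modify"]):
            return True
    for w in b["modify"]:
        if any(w.overlaps(s) for s in a["read"]):
            return True
    return False

# Two calls that read the same input but modify disjoint halves of A.
call1 = {"read": [Section("B", 1, 10)], "modify": [Section("A", 1, 10)]}
call2 = {"read": [Section("B", 1, 10)], "modify": [Section("A", 11, 20)]}
# A third call whose modified range touches call1's range at element 10.
call3 = {"read": [], "modify": [Section("A", 10, 15)]}

independent = not conflict(call1, call2)
dependent = conflict(call1, call3)
```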

The annotation language

• Debugging and error handling
  o Aggressive strategy: ignore checking statements

The annotation language

• Indirect array references
  o Discover unique relation

Annotation-based inlining

• Goal
  o Pass annotated information to the compiler
  o Eliminate inlining side effects
• Flow
  o Inline before parallelization
  o Reverse-inline after parallelization
  o Verify and evaluate last
• Implementation
  o POLARIS compiler for parallelization
  o ROSE compiler for parsing
  o POET transformer
  o PERFECT benchmark

Annotation-based inlining

• Workflow
  o Annotation inlining ⇒ Parallelization ⇒ Reverse-inlining

Annotation-based inlining

• Inlining annotation
  o Steps
    • Annotation ⇒ source language
      o Translating special operators
    • Inlining generated source language
      o Avoiding linearization
  o Translating special operators
    • Unknown: using uninitialized global arrays
    • Unique: using linear expression
  o Avoiding linearization
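A rough sketch of what translating the two special operators could look like, under stated assumptions: `UNKNOWN_ARR` and `UNIQUE_BASE` are hypothetical names, the emitted text is Fortran-like, and the unique operator is restricted to a single argument. The idea is that a read from an array the compiler knows nothing about forces conservative treatment, while a linear expression is something a dependence test can recognize as one-to-one.

```python
def translate(op, result, args):
    """Emit a source-level statement for `result = op(args...)`.

    unknown -> read from an uninitialized global array (relation stays opaque)
    unique  -> linear expression in the argument (relation is one-to-one)
    """
    if op == "unknown":
        return f"{result} = UNKNOWN_ARR({', '.join(args)})"
    if op == "unique":
        if len(args) != 1:
            raise ValueError("this sketch handles one argument for unique")
        return f"{result} = {args[0]} + UNIQUE_BASE"
    raise ValueError(f"unsupported operator: {op}")

stmt_unknown = translate("unknown", "Y", ["I", "J"])
stmt_unique = translate("unique", "ID", ["I"])
```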

Annotation-based inlining

• Inlining annotation

Annotation-based inlining

• Parallelize do-all loops

Annotation-based inlining

• Reverse inlining

Annotation-based inlining

• Reverse inlining is indispensable
  o Inlined code is restored to a function call
    • Avoids loss of parallelism in caller / callee
    • Enables abstraction operators (unknown, unique)
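The restore step can be pictured with a toy text transformation. The sketch assumes the inliner brackets each inlined body with marker comments and records the original call on the opening marker. The marker strings and the `! PARALLEL` comment are hypothetical and do not reflect the actual tool output.

```python
BEGIN, END = "! BEGIN_INLINE", "! END_INLINE"  # hypothetical markers

def reverse_inline(lines):
    """Replace each marked inlined region by the call recorded on its BEGIN marker."""
    out, skipping = [], False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(BEGIN):
            out.append(stripped[len(BEGIN):].strip())  # the original call text
            skipping = True
        elif stripped.startswith(END):
            skipping = False
        elif not skipping:
            out.append(line)
    return out

parallelized = [
    "! PARALLEL",                      # decision made while the summary was inlined
    "DO I = 1, N",
    "! BEGIN_INLINE CALL FOO(A, I)",
    "A(I) = A(I) + 1",                 # inlined annotation body
    "! END_INLINE",
    "END DO",
]
restored = reverse_inline(parallelized)
```

The parallelization decision on the surrounding loop survives, while the loop body is the original call again.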

Annotation-based inlining

• Verification and evaluation
  o Correctness, efficiency, and generality

Experiment
• Purpose
  o What does conventional inlining bring to parallelization?
    • Gain?
    • Loss?
    • Missed?
  o How well does annotation-based inlining avoid the above issues?
• Design
  o PERFECT benchmarks (except SPEC77)
  o Two machines
    • 8-core Intel Mac
    • 4-core AMD Opteron
  o End compiler
    • GFortran 4.2.1
    • IFort 11.1
• Result
  o Count of loops
  o Performance

Experiment
• Result: Loops
  o Conventional inlining
    • Incurs losses
  o Annotation-based inlining
    • No loss, more gain

Experiment
• Result: Performance
  o Average speedup is limited
  o Annotation-based inlining is always better

Summary
• Inter-procedural parallelization
• Summarize effects of conventional inlining
  o Gain
  o Loss
  o Missed
• Propose annotation-based inlining
  o Annotation summary
  o Enhanced inlining strategy
  o Reverse inlining

Thanks! Questions?