Post on 15-Dec-2015
Overview
• Motivations
• Basic static and dynamic optimization methods
• ADAPT
• Dynamo
Motivations for Dynamic Optimization
• Object-oriented languages result in delayed binding, which reduces the scope of static optimizations
• DLLs limit static compile-time optimizations
• Heavyweight static compiler optimizations are impractical in Java JITs and dynamic binary translators
Motivations for Dynamic Optimization
• Computer system vendors are totally reliant on software vendors to enable optimizations to take advantage of their hardware
• Software is now commonly installed on a network file system server and run on machines of varying configurations
Some Traditional Dynamic Optimization Techniques
• Compile-Time Multiversioning
  – Multiple versions of code sections are generated at compile-time
  – The most appropriate variant is selected at runtime based upon characteristics of the input data and/or machine environment
  – No runtime information can be exploited during code generation
  – Multiple variants can cause code explosion
    • Thus typically only a few versions are created
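As a minimal sketch of compile-time multiversioning, the Python below prepares two variants of a summation ahead of time and picks one at runtime from a property of the input. The function names and the `SMALL_INPUT_LIMIT` cutoff are hypothetical and chosen only for illustration.

```python
def sum_small(values):
    """Variant intended for short inputs: a plain loop."""
    total = 0
    for v in values:
        total += v
    return total


def sum_large(values):
    """Variant intended for long inputs: delegates to the built-in sum."""
    return sum(values)


# Hypothetical cutoff, fixed ahead of time (no runtime information is used
# when the variants themselves are generated).
SMALL_INPUT_LIMIT = 16


def select_variant(values):
    """Pick one of the pre-built variants from a characteristic of the input."""
    return sum_small if len(values) <= SMALL_INPUT_LIMIT else sum_large


def multiversioned_sum(values):
    """Entry point: dispatch to the selected variant."""
    return select_variant(values)(values)
```

Only two variants exist here, which mirrors the point that the number of versions is kept small to avoid code explosion.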
Some Traditional Dynamic Optimization Techniques
• Dynamic Feedback
  – Similar to Compile-Time Multiversioning
    • Multiple versions generated at compile-time
    • No runtime information can be exploited during code generation
    • Only a few versions created to prevent code explosion
  – Chooses variant by sampling
    • Measures execution times for variants and selects the fastest
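The sampling step can be sketched as follows. This is an illustrative Python sketch, not the mechanism of any particular system: `pick_fastest` is a hypothetical helper that runs each pre-built variant once on a sample input and keeps the one with the shortest measured time. The `timer` parameter is an assumption added so the measurement source can be swapped out.

```python
import time


def pick_fastest(variants, sample_input, timer=time.perf_counter):
    """Run each variant once on sample_input and return the fastest one."""
    best, best_elapsed = None, float("inf")
    for variant in variants:
        start = timer()
        variant(sample_input)
        elapsed = timer() - start
        # Keep the variant with the smallest measured execution time.
        if elapsed < best_elapsed:
            best, best_elapsed = variant, elapsed
    return best
```

A single sample is noisy in practice; a real implementation would likely repeat the measurement and resample periodically as conditions change.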
Some Traditional Dynamic Optimization Techniques
• Dynamic Compilation
  – Generates new code variants during program execution
    • Takes advantage of runtime information
  – More overhead than the other methods
  – To reduce overheads
    • Dynamic compilation is staged at compile-time
    • Dynamic compilation is only applied to code sections that may benefit from it
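To illustrate generating a new code variant from runtime information, the sketch below builds and compiles a power function specialized for an exponent that is only known at runtime. It uses Python's built-in `compile` and `exec`; the name `specialize_power` is hypothetical, and the example stands in for the idea rather than for any specific dynamic compiler.

```python
def specialize_power(exponent):
    """Generate and compile a power function for one runtime-known exponent."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    # Unroll the multiplication loop into a single expression, e.g. "x * x * x".
    body = " * ".join(["x"] * exponent) if exponent > 0 else "1"
    source = f"def power(x):\n    return {body}\n"
    namespace = {}
    # Compile the generated source and load the new function.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]
```

For example, `specialize_power(3)` returns a function equivalent to `lambda x: x * x * x`. The generation and compilation cost is the overhead the slide refers to, so it only pays off for code that runs often enough.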
ADAPT (Automated De-Coupled Adaptive Program Transformation)
Michael J. Voss and Rudolf Eigenmann
Purdue University
Overview of ADAPT
• ADAPT tries to combine the features of the other methods
• Uses a source-to-source compiler to perform optimizations
• Dynamic selection mechanism selects best code variant to run and does code generation (similar to JIT)
Intervals
• Optimization occurs at the granularity of intervals
  – Single entry, single exit
    • Typically loop nests
• Source-to-source compiler replaces intervals with an if-else block that selects between a call to the Dynamic Selector and the default static version
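The rewritten interval might look roughly like the sketch below. The `DynamicSelector` class, its methods, and the interval name are hypothetical stand-ins, and the branch condition used here (whether an optimized variant is available) is an assumption for illustration; the slides do not state the exact test.

```python
def static_version(data):
    """Default statically compiled interval (here a simple loop)."""
    return [x * 2 for x in data]


class DynamicSelector:
    """Hypothetical stand-in for the Dynamic Selector."""

    def __init__(self):
        self.variants = {}  # interval id -> optimized callable

    def has_variant(self, interval_id):
        return interval_id in self.variants

    def run(self, interval_id, data):
        return self.variants[interval_id](data)


selector = DynamicSelector()


def interval_42(data):
    # The if-else block that replaces the original interval.
    if selector.has_variant("interval_42"):
        return selector.run("interval_42", data)
    return static_version(data)
```

Because the interval has a single entry and a single exit, swapping its body for this dispatch does not affect the surrounding code.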
Compiler Component
• ADAPT can use off-the-shelf compilers
  – Set different optimization flags or compilers to produce different variants
    • Loop distribution
    • Tiling
    • Unrolling
    • Automatic parallelization
ADAPT Components
• The Inspector monitors the runtime environment
  – Timings
    • Each interval
    • Each optimized variant of the interval
  – Machine configuration
• These measurements are used to maintain and prioritize the Optimization Queue
ADAPT Components
• The Optimization Queue is a priority queue that orders interval descriptors by execution time
  – Used to minimize overheads by avoiding insignificant intervals
• The Dynamic Selector chooses variants to run
  – Variants become “stale” after a period of time and are removed
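A priority queue of interval descriptors ordered by execution time can be sketched with Python's `heapq` module. The class layout and the `min_seconds` significance cutoff are assumptions made for illustration, not details taken from ADAPT.

```python
import heapq


class OptimizationQueue:
    """Priority queue of interval descriptors, most time-consuming first."""

    def __init__(self, min_seconds=0.0):
        self._heap = []
        self._min_seconds = min_seconds  # hypothetical significance cutoff

    def push(self, interval_id, exec_seconds):
        # Skip insignificant intervals to keep optimization overhead low.
        if exec_seconds < self._min_seconds:
            return
        # heapq is a min-heap, so negate the time to pop the largest first.
        heapq.heappush(self._heap, (-exec_seconds, interval_id))

    def pop(self):
        """Return (interval_id, exec_seconds) for the costliest interval."""
        neg_seconds, interval_id = heapq.heappop(self._heap)
        return interval_id, -neg_seconds

    def __len__(self):
        return len(self._heap)
```

Intervals that account for the most execution time are optimized first, and anything below the cutoff never enters the queue.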
Sample Walkthrough
• Source-to-source compilation
• Start with static version and run
• Inspector tracks intervals for frequent use
• Optimized variants of the frequently used intervals and other runtime information are generated
• When an interval takes a sufficiently long time, the Dynamic Selector is called and chooses the best variant
  – If the variant takes too long, go back to the static version
Dynamo: A Transparent Dynamic Optimization System
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia
HP Labs
Overview of Dynamo
• Takes in binary instruction codes
• Optimizes the code dynamically without code annotations or binary means
• Transparent operation:
  – Accepts and optimizes legacy code
  – Runs like a hybrid of a user DLL and a virtual machine
How Dynamo Starts
• Dynamo takes over and takes a snapshot of the registers and environment stack.
• Dynamo activates the “interpreter”
  – Intercepts and scans the native code from the program like a filter
[Figure: Program, Dynamo’s Interpreter, CPU]
How Dynamo Works
[Figure: flowchart with the labels Fragment Cache, Potential Code Fragment, Code Fragment, “In cache already?” (yes/no), Start trace, End trace, and “Optimize and link with other code fragments”]
Code Fragments
• The interpreter can create and optimize code fragments
• Code fragments are code traces
  – A code trace starts when a certain piece of code is executed many times
  – The program is most likely to follow the same path while tracing
Code Fragments
• Code fragments consist of:
  – Start, which is the first instruction executed after a taken backward branch (its target)
  – End, which is a taken backward branch or another branch leading to another fragment
• Code fragments are easy to optimize
  – One entrance, multiple exits
  – Requires one iteration of a backward and forward data flow analysis
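The start-of-trace rule can be sketched as a counter attached to the targets of taken backward branches. In the Python below, `find_hot_trace_heads` and `HOT_THRESHOLD` are hypothetical names, and the threshold value is an assumption, since the slides only say "many times".

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical value for "executed many times"


def find_hot_trace_heads(taken_branches, threshold=HOT_THRESHOLD):
    """Return targets of taken backward branches that became hot.

    taken_branches is a sequence of (source_address, target_address)
    pairs for taken branches, in execution order.
    """
    counts = Counter()
    hot = []
    for source, target in taken_branches:
        if target <= source:  # backward taken branch: candidate trace start
            counts[target] += 1
            if counts[target] == threshold:
                hot.append(target)  # a trace would start recording here
    return hot
```

Forward branches are ignored, so only loop heads (and similar backward-branch targets) can become trace starts in this sketch.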
Optimizations of Fragments
• Removes branches that only express fall-throughs
• Keeps conditional branches
[Figure: a control-flow graph with blocks A, B, C, D, next to the trace A, C, D]
Optimizations include:
• Constant propagation
• Copy propagation
• Loop invariant code motion
• Strength reduction
• Removal of redundant branches, loads, and assignments
Linking Cached Fragments
• Conditional branches and exits may lead to links to other fragments in the cache, which speeds up Dynamo.
• If no fragments exist, start another trace.
[Figure: fragments made of blocks A to G joined by true/false branch exits; one exit is labeled “Start tracing again”]
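Fragment linking can be sketched as follows. This is an illustrative Python model, not Dynamo's data structures: `Fragment`, `link_fragment`, and `unlinked_exits` are hypothetical names, and the cache is modeled as a plain dictionary keyed by fragment start address.

```python
class Fragment:
    """A cached trace with one entry address and several exit targets."""

    def __init__(self, start, exit_targets):
        self.start = start
        self.exit_targets = exit_targets
        self.links = {}  # exit target address -> linked Fragment


def link_fragment(cache, new_fragment):
    """Insert new_fragment into cache and link exits in both directions."""
    cache[new_fragment.start] = new_fragment
    for fragment in cache.values():
        for target in fragment.exit_targets:
            if target in cache:
                # This exit can jump straight to another cached fragment.
                fragment.links[target] = cache[target]


def unlinked_exits(fragment):
    """Exits with no cached fragment yet; taking one starts another trace."""
    return [t for t in fragment.exit_targets if t not in fragment.links]
```

Once an exit is linked, control can pass from one cached fragment to the next without returning to the interpreter, which is where the speedup comes from.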
Cache Management
• Adding a new fragment requires creating links between it and existing fragments
• Deleting a fragment requires removing all of its links, which is slow
• Cache may get filled up with fragments
  – Flush when a lot of new code is being traced
  – Means you have entered a new section of the program
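One way to picture the flush policy is a cache that counts how many fragments were added within a recent window and empties itself when that count spikes. The sketch below is an assumption-laden illustration: the `FragmentCache` class, the `flush_rate` and `window` parameters, and the use of an abstract `tick` counter are all hypothetical.

```python
class FragmentCache:
    """Fragment cache that flushes itself when new fragments arrive in a burst."""

    def __init__(self, flush_rate=3, window=10):
        self.fragments = {}  # start address -> fragment code
        self.flush_rate = flush_rate  # max additions tolerated per window
        self.window = window  # window length in ticks
        self._recent = []  # ticks at which recent fragments were added
        self.flushes = 0

    def add(self, start_address, code, tick):
        # Forget additions that fall outside the current window.
        self._recent = [t for t in self._recent if tick - t < self.window]
        self._recent.append(tick)
        if len(self._recent) > self.flush_rate:
            # A burst of new fragments suggests a new program phase:
            # drop everything at once instead of unlinking one by one.
            self.fragments.clear()
            self._recent = [tick]
            self.flushes += 1
        self.fragments[start_address] = code
```

Flushing the whole cache avoids the slow per-fragment link removal, at the cost of rebuilding any fragments that are still hot.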
Performance
• Single PA-8000 processor
• SpecInt95 benchmarks, compiled with
  – O2, O4, O2+P, O4+P with/without Dynamo
  – O2 + Dynamo ran as well as O4 native
  – O4 + P ran as well with or without Dynamo
• Overhead of Dynamo was 1.5% of the execution time with SpecInt95.
Conclusion
• ADAPT and Dynamo take opposite approaches
  – Internal representation: single-entry-single-exit intervals (ADAPT) versus single-entry-multiple-exit fragments (Dynamo)
  – Dynamo uses binary code and ADAPT uses source-to-source, high-level code
• Both use standard compilers, need no special annotations, and utilize runtime information
• ADAPT can allow programmers to customize the selection of optimizations.