Imperative Languages Summary; Code Generation (COMP 640 Programming Languages)
• Imperative Languages Summary
• Code Generation
COMP 640 Programming Languages
COMP 640 – CWB 2
Topics
Imperative Languages Finish-up
  Common features of imperative languages that are not in Jay
  Concrete syntax changes
  Abstract syntax changes
  Type validity check changes
Code Generation
  Direct generation
  Intermediate Code
  Optimization
COMP 640 – CWB 3
Jay
Data types: int and boolean
Statements: assignment, if-else, while, and blocks
NO: functions, reals, strings, arrays, I/O
Turing complete
Concrete syntax (See T&N, Appendix B.1 and B.2)
COMP 640 – CWB 4
What's Missing from Jay?
Most imperative (and OO) languages include
  Floating point
  For statements (indexed and/or iterator)
  Do-until statements
  Switch statements
  Break and continue statements
  Methods with parameters
COMP 640 – CWB 5
Adding Floating Point to Jay
Concrete syntax
  New types
  Overloaded arithmetic operators
  Floating point literals
  Casts
Abstract syntax
  New types
  New arithmetic operators
  Conversion operators
Type validity changes
  Allow for promotion
Source: T&N, Figure 4.3
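To make "allow for promotion" concrete, here is a minimal sketch of the two checks a type validator might gain once float is added. The names JayType, Promotion, resultType and assignable are hypothetical and are not taken from T&N; the rule assumed is the usual one (int widens to float implicitly, float to int needs an explicit cast).

```java
// Hypothetical sketch: these names are illustrative, not from T&N.
enum JayType { INT, FLOAT, BOOLEAN }

final class Promotion {
    // Result type of an overloaded arithmetic operator (+, -, *, /).
    // If either operand is float, the int operand is promoted to float.
    static JayType resultType(JayType left, JayType right) {
        if (left == JayType.BOOLEAN || right == JayType.BOOLEAN) {
            throw new IllegalArgumentException("arithmetic on boolean is not type-valid");
        }
        return (left == JayType.FLOAT || right == JayType.FLOAT)
                ? JayType.FLOAT
                : JayType.INT;
    }

    // target = source is valid when the types match or when an int
    // is widened to float; float to int would need an explicit cast.
    static boolean assignable(JayType target, JayType source) {
        return target == source
                || (target == JayType.FLOAT && source == JayType.INT);
    }
}
```

In the abstract syntax, the promotion would typically show up as an inserted conversion operator around the int operand.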
COMP 640 – CWB 6
For statements (C-style)
Concrete:
  ForStatement → for ( Assignment1Opt ; ExpressionOpt ; Assignment2Opt ) Statement
Abstract: [figure not reproduced in transcript]
Source: T&N, Section 4.5.1
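One common way to give a C-style for statement a meaning without adding new semantic rules is to rewrite it as a while loop. The sketch below assumes that approach; it is not taken from T&N, and ForDesugar and toWhile are hypothetical names. It works on source text only to keep the example short.

```java
// Hypothetical sketch: rewrite "for (init; test; step) body" as a while loop.
final class ForDesugar {
    // Any of init, test, step may be null (the "Opt" parts of the
    // concrete syntax). A missing test is treated as "true".
    static String toWhile(String init, String test, String step, String body) {
        StringBuilder sb = new StringBuilder("{ ");
        if (init != null) {
            sb.append(init).append("; ");
        }
        sb.append("while (").append(test != null ? test : "true").append(") { ");
        sb.append(body).append(' ');
        if (step != null) {
            sb.append(step).append("; ");
        }
        sb.append("} }");
        return sb.toString();
    }
}
```

For example, toWhile("i = 0", "i < n", "i = i + 1", "s = s + i;") yields "{ i = 0; while (i < n) { s = s + i; i = i + 1; } }". This rewrite is only safe while the language has no continue statement, since continue would skip the step.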
COMP 640 – CWB 7
Adding Procedures to Jay
Concrete syntax:
Program → package Id { Declarations Methods MainMethod }
Methods → ε | Methods Method
Method → Type Id ( ParametersOpt ) { Declarations Statements }
Type → int | boolean | void
Parameters → Parameter | Parameters , Parameter
Parameter → Type Id
MainMethod → void main ( ) { Declarations Statements }
COMP 640 – CWB 8
Type Validity Checking
Every method and global variable has a unique identifier.
The set of local variables and parameters of a method have mutually unique identifiers and valid types.
Statements and expressions in a method are valid with respect to the variables that are accessible.
Return appears in non-void methods and the type of its expression matches the method return type.
Every call has a name matching the ID of a method and has the same number of arguments as the method has parameters.
Every argument in a call has the same type as the corresponding parameter of the method being called.
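The rules for method identifiers and calls can be sketched as a small table lookup. This is a minimal illustration, assuming types are represented as plain strings; MethodSig and CallChecker are hypothetical names, not part of the T&N implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: types are plain strings such as "int" or "boolean".
final class MethodSig {
    final String returnType;
    final List<String> paramTypes;

    MethodSig(String returnType, String... paramTypes) {
        this.returnType = returnType;
        this.paramTypes = Arrays.asList(paramTypes);
    }
}

final class CallChecker {
    private final Map<String, MethodSig> methods = new HashMap<String, MethodSig>();

    // Returns false if the identifier is already taken
    // (every method must have a unique identifier).
    boolean declare(String id, MethodSig sig) {
        if (methods.containsKey(id)) {
            return false;
        }
        methods.put(id, sig);
        return true;
    }

    // A call is valid if its name matches a declared method, the number
    // of arguments equals the number of parameters, and each argument
    // type equals the corresponding parameter type.
    boolean validCall(String id, List<String> argTypes) {
        MethodSig sig = methods.get(id);
        return sig != null && sig.paramTypes.equals(argTypes);
    }
}
```

List.equals compares both length and elements in order, so it covers the argument-count rule and the argument-type rule in one step.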
New Abstract Syntax Rules for Jay with Methods and Globals (Figure 4.6)
Abstract Syntax Sketch for a Jay Program with Globals and Methods (Figure 4.7)
COMP 640 – CWB 11
Code Generation
Direct generation: Parse Tree → Code
Via intermediate code: Parse Tree → Intermediate Code Tree → Machine Code Tree → Code (backend)
COMP 640 – CWB 12
Parse Tree
Why a parse tree?
COMP 640 – CWB 13
Code for Expression: a*b + c*d
a, b, c are local variables; d is a global variable

CISC
mult a[$sp], b[$sp]
mov $0, $low
mult c[$sp], d[$gp]
mov $1, $low
add $0, $0, $1

RISC
ld $0, a[$sp]
ld $1, b[$sp]
mult $0, $1
mvlow $2
ld $0, c[$sp]
ld $1, d[$gp]
mult $0, $1
mvlow $1
add $0, $2, $1

Stack
push a[$sp]
push b[$sp]
mult
push c[$sp]
push d[$gp]
mult
add
COMP 640 – CWB 14
Simple (naïve) Recursive Generator Method, RISC
import java.io.PrintWriter;

class Binary extends ITreeNode {
    Operator op;
    Expr e1, e2;

    // Generates asm code and returns the
    // register number where the result is.
    public int gen(PrintWriter out) {
        int reg1 = e1.gen(out);
        int reg2 = e2.gen(out);
        return op.gen(reg1, reg2, out);
    }
}

How do the gen methods determine what registers to use?
Is this an issue for your project?
How would code vary for CISC? For Stack?
How to add optimization?
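As one possible answer to the register and stack questions, here is a minimal sketch of the same recursive scheme with two back ends. It assumes an unlimited supply of pseudo registers handed out by a counter and three-address RISC instructions, so its RISC output differs from the mult/mvlow code on the earlier slide. All class names (RegCounter, ExprNode, VarNode, BinaryNode) are hypothetical.

```java
import java.util.List;

// Hands out pseudo register numbers $0, $1, ... (hypothetical helper).
final class RegCounter {
    private int next = 0;
    int fresh() { return next++; }
}

abstract class ExprNode {
    // Stack machine: leave the value of this expression on the stack.
    abstract void genStack(List<String> out);
    // RISC: emit code and return the register number holding the result.
    abstract int genRisc(List<String> out, RegCounter regs);
}

final class VarNode extends ExprNode {
    private final String name;
    private final boolean global;

    VarNode(String name, boolean global) {
        this.name = name;
        this.global = global;
    }

    // Locals are addressed off $sp, globals off $gp, as on the slides.
    private String address() { return name + (global ? "[$gp]" : "[$sp]"); }

    void genStack(List<String> out) { out.add("push " + address()); }

    int genRisc(List<String> out, RegCounter regs) {
        int r = regs.fresh();
        out.add("ld $" + r + ", " + address());
        return r;
    }
}

final class BinaryNode extends ExprNode {
    private final String op;
    private final ExprNode left, right;

    BinaryNode(String op, ExprNode left, ExprNode right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    void genStack(List<String> out) {
        left.genStack(out);
        right.genStack(out);
        out.add(op); // operands are implicit: the top two stack entries
    }

    int genRisc(List<String> out, RegCounter regs) {
        int r1 = left.genRisc(out, regs);
        int r2 = right.genRisc(out, regs);
        int r = regs.fresh();
        out.add(op + " $" + r + ", $" + r1 + ", $" + r2);
        return r;
    }
}
```

For a*b + c*d with d global, genStack emits the seven stack instructions shown on the expression slide. The stack version needs no register bookkeeping at all, while the RISC version pushes the problem of mapping pseudo registers to real ones to a later register allocation pass.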
COMP 640 – CWB 15
Code for If-else statement: if (a < b) c = a; else c = b;
a & c are local variables; b is a global variable

CISC
bge a[$sp], b[$gp], F
store c[$sp], a[$sp]
br Z
F:
store c[$sp], b[$gp]
Z:

RISC
ld $0, a[$sp]
ld $1, b[$gp]
lt $2, $0, $1
beqz $2, F        # go to the else part when a < b is false ($2 is 0)
store $0, c[$sp]
br Z
F:
store $1, c[$sp]
Z:

Stack
push a[$sp]
push b[$gp]
brge F            # go to the else part when a >= b
push a[$sp]
pop c[$sp]
br Z
F:
push b[$gp]
pop c[$sp]
Z:
COMP 640 – CWB 17
Optimization Scopes
Peephole (a "few" instructions)
Local (basic block)
Loop
Global (within procedure)
Inter-procedural
COMP 640 – CWB 18
Optimization Dependence/Independence
Programming language; examples:
  Loop concept common over many languages
  Pointers in a language make optimization hard
Machine
  Avoiding wasted calculations is cross-machine (mostly)
  Effect of locality depends on machine
  CISC vs. RISC vs. Stack differences
Categorizations in this and subsequent slides based on http://en.wikipedia.org/wiki/Compiler_optimization
COMP 640 – CWB 19
Machine-dependent Optimizations
Number of registers – later
CISC – many instructions to pick from; RISC – heavy register use
Avoiding pipeline stalls by instruction scheduling
Exploiting multiple CPUs through parallelism
Cache properties
Memory bandwidth
COMP 640 – CWB 20
Optimization Themes
Re-use: store calculations for later re-use
Reduce code size
Minimize jumps (optimize pre-fetches)
Code locality (cache optimizations)
COMP 640 – CWB 21
Loop Optimizations
Analyze the "induction variable" (loop counter)
Loop fission – improve locality of reference
Loop fusion – avoid loop overhead
Interchange inner and outer loop (locality)
Loop reversal
Loop unrolling – minimize jumps
Loop splitting/peeling – divide up range to avoid conditionals in loop
Loop unswitching – remove conditional that is fixed for any execution of the loop
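As an illustration of one item in the list, loop unrolling, here is a hand-written before/after pair at the source level. A compiler would do this on intermediate code; the unroll factor of 4 and the names UnrollDemo, sumSimple and sumUnrolled are arbitrary choices for this sketch.

```java
// Hypothetical sketch: the same loop before and after unrolling by 4.
final class UnrollDemo {
    // Original loop: one test and one jump per element.
    static int sumSimple(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // Unrolled loop: one test and one jump per four elements,
    // followed by a clean-up loop for the 0 to 3 leftover elements.
    static int sumUnrolled(int[] a) {
        int s = 0;
        int i = 0;
        int limit = a.length - (a.length % 4);
        for (; i < limit; i += 4) {
            s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        }
        for (; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }
}
```

The trade-off is larger code size in exchange for fewer jumps, which is why unrolling interacts with the "reduce code size" and "code locality" themes.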
COMP 640 – CWB 22
Data Flow Optimizations
Common sub-expression elimination
Constant folding (constants in expressions)
Constant propagation (analyze constants' effects on larger body of code)
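Constant folding is easy to express as a bottom-up rewrite of an expression tree. The following is a minimal sketch under simplifying assumptions: integer constants only, and only +, - and * are folded. The node classes (FoldExpr, FoldConst, FoldName, FoldBin) and ConstantFolder are hypothetical.

```java
// Hypothetical expression tree for the sketch.
abstract class FoldExpr { }

final class FoldConst extends FoldExpr {
    final int value;
    FoldConst(int value) { this.value = value; }
}

final class FoldName extends FoldExpr {
    final String id;
    FoldName(String id) { this.id = id; }
}

final class FoldBin extends FoldExpr {
    final char op;
    final FoldExpr left, right;
    FoldBin(char op, FoldExpr left, FoldExpr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
}

final class ConstantFolder {
    // Folds children first, then replaces "const op const" by its value.
    static FoldExpr fold(FoldExpr e) {
        if (!(e instanceof FoldBin)) {
            return e;
        }
        FoldBin b = (FoldBin) e;
        FoldExpr l = fold(b.left);
        FoldExpr r = fold(b.right);
        if (l instanceof FoldConst && r instanceof FoldConst) {
            int x = ((FoldConst) l).value;
            int y = ((FoldConst) r).value;
            switch (b.op) {
                case '+': return new FoldConst(x + y);
                case '-': return new FoldConst(x - y);
                case '*': return new FoldConst(x * y);
                default: break; // other operators are left unfolded
            }
        }
        return new FoldBin(b.op, l, r);
    }

    // Fully parenthesized printout, for inspecting results.
    static String show(FoldExpr e) {
        if (e instanceof FoldConst) return Integer.toString(((FoldConst) e).value);
        if (e instanceof FoldName) return ((FoldName) e).id;
        FoldBin b = (FoldBin) e;
        return "(" + show(b.left) + " " + b.op + " " + show(b.right) + ")";
    }
}
```

Folding x * (2 + 3 * 4) gives (x * 14). Division is deliberately left out so the sketch does not have to decide what to do about division by zero at compile time.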
COMP 640 – CWB 23
SSA – Static Single Assignment Optimizations
Transform code so every variable is assigned only once – re-assignment in original code causes new variable to be defined
Makes easier:
  Global Value Numbering – looks for redundancies
  Symbolically execute code, propagating constants. Then eliminate any dead code that is discovered.
a = c*d
e = c
f = e*d
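The three-line example can be run through a small value-numbering pass to see why f is redundant. This is a simplified local sketch, not a full global value numbering algorithm; it relies on every variable being assigned once (the SSA property), and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: value numbering over straight-line SSA code.
final class ValueNumbering {
    private final Map<String, Integer> varNumber = new HashMap<String, Integer>();
    private final Map<String, Integer> exprNumber = new HashMap<String, Integer>();
    private final Map<Integer, String> firstHolder = new HashMap<Integer, String>();
    private int next = 0;

    // Value number of a variable; unseen variables get a fresh number.
    private int numberOf(String var) {
        Integer n = varNumber.get(var);
        if (n == null) {
            n = next++;
            varNumber.put(var, n);
        }
        return n;
    }

    // dst = src: dst simply shares the value number of src.
    String copy(String dst, String src) {
        varNumber.put(dst, numberOf(src));
        return dst + " = " + src;
    }

    // dst = left op right: if the same operation on the same value
    // numbers was seen before, reuse the variable that holds it.
    String binary(String dst, String left, char op, String right) {
        String key = numberOf(left) + " " + op + " " + numberOf(right);
        Integer n = exprNumber.get(key);
        if (n != null) {
            varNumber.put(dst, n);
            return dst + " = " + firstHolder.get(n);
        }
        n = next++;
        exprNumber.put(key, n);
        firstHolder.put(n, dst);
        varNumber.put(dst, n);
        return dst + " = " + left + " " + op + " " + right;
    }
}
```

Feeding in a = c*d, e = c, f = e*d in that order rewrites the last statement to f = a, because e carries the same value number as c. Without single assignment, the reuse would be unsafe: a could have been overwritten between the two multiplications.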
COMP 640 – CWB 24
Backend Optimizations
Register allocation
Instruction selection
Instruction scheduling
COMP 640 – CWB 25
Strategy for Implementing Optimization
Convert parse tree into intermediate code tree
Correct logical conflicts (such as errors caused by side-effects) to generate canonical tree
Various optimizations on canonical tree (non-backend)
Backend optimization (machine dependent)
  Instruction selection
  Register allocation
  Instruction scheduling
COMP 640 – CWB 26
Intermediate Code
Code is relatively close to machine code, but machine-independent
Maintained in tree form for easy manipulation by optimization algorithms
Assumes unlimited number of registers
COMP 640 – CWB 27
Example Intermediate Code
if (a < b) c = a; else c = b;

cjump(Z)
  LT(t, f)
    lvar(a)
    gvar(b)
  seq
    label(t)
    move
      lvar(c)
      lvar(a)
    jump(Z)
  seq
    label(f)
    move
      lvar(c)
      gvar(b)
    jump(Z)
COMP 640 – CWB 28
Side-effect Resolution
Subroutine calls – typically fixed registers are used for call and return value
  Problem implementing f(g(a),h(b))
  Emit the code for g(a) and h(b) first, saving the results
  Then emit code for f(t1, t2)
Move instructions (assignments to registers)
Transform tree to "canonical form"
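The f(g(a),h(b)) case can be sketched as a small lowering pass that pulls every nested call out into its own temporary, evaluating arguments left to right. The names (SimpleExpr, VarExpr, CallExpr, Linearizer) and the textual "t = call ..." output format are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: expressions are variables or calls.
abstract class SimpleExpr { }

final class VarExpr extends SimpleExpr {
    final String name;
    VarExpr(String name) { this.name = name; }
}

final class CallExpr extends SimpleExpr {
    final String function;
    final List<SimpleExpr> args;
    CallExpr(String function, SimpleExpr... args) {
        this.function = function;
        this.args = Arrays.asList(args);
    }
}

final class Linearizer {
    final List<String> code = new ArrayList<String>();
    private int nextTemp = 1;

    // Emits code for e and returns the name that holds its value.
    // Each call result is saved in a fresh temporary right away, so
    // a later call cannot clobber the fixed return-value register.
    String lower(SimpleExpr e) {
        if (e instanceof VarExpr) {
            return ((VarExpr) e).name;
        }
        CallExpr call = (CallExpr) e;
        List<String> argNames = new ArrayList<String>();
        for (SimpleExpr arg : call.args) {
            argNames.add(lower(arg));
        }
        String temp = "t" + nextTemp++;
        code.add(temp + " = call " + call.function
                + "(" + String.join(", ", argNames) + ")");
        return temp;
    }
}
```

Lowering f(g(a), h(b)) produces t1 = call g(a), then t2 = call h(b), then t3 = call f(t1, t2), which matches the order described on the slide.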
COMP 640 – CWB 29
Canonical Form
Ordered list of trees (forest)
Each tree is a single "statement"
• label(lab)
• jump(exp)
• cjump(test, lab, lab)
• move(exp, exp)
• exp
exp is a subtree:
• binop(op, exp, exp)
• addr(exp, int)
• temp
• name
• const(int)
• call(exp, expList)
COMP 640 – CWB 30
Minimize Jumps
Define basic blocks
  label code-without-jump jump (includes conditional jumps)
Rearrange
  Move false part of conditional immediately after the test code;
  Move, to extent possible, a basic block with label Y so it follows a jump(Y)
  Eliminate any jump(X) followed by label X
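The last step, eliminating a jump(X) that is immediately followed by label X, is a one-pass filter over the statement list. The sketch below assumes statements are stored as strings in the jump(X) / label(X) notation of the canonical form; JumpCleaner and removeRedundantJumps are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: statements are strings like "jump(Z)" or "label(Z)".
final class JumpCleaner {
    // Drops every unconditional jump whose target label is the very
    // next statement. Conditional jumps ("cjump(...)") are not touched.
    static List<String> removeRedundantJumps(List<String> code) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i < code.size(); i++) {
            String ins = code.get(i);
            if (ins.startsWith("jump(") && ins.endsWith(")") && i + 1 < code.size()) {
                String target = ins.substring("jump(".length(), ins.length() - 1);
                if (code.get(i + 1).equals("label(" + target + ")")) {
                    continue; // falling through reaches the same place
                }
            }
            out.add(ins);
        }
        return out;
    }
}
```

The filter only pays off after the rearrangement steps have placed blocks next to the jumps that target them.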
COMP 640 – CWB 31
Instruction Selection
Most important for CISC where there are multiple possibilities for any function
Define "tiles"
  Small trees corresponding to each instruction in instruction set
  Each node is an intermediate code node
  Many trees are single nodes, esp. in RISC
Find optimum tiling of the intermediate code tree
COMP 640 – CWB 32
Instruction Selection, cont.
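Finding a truly optimum tiling is NP-complete in general (when subexpressions are shared, so the "tree" is really a DAG)
Dynamic programming (cost minimization): finds a minimum-cost tiling for a pure tree
Maximal munch
  Start at top of intermediate code tree
  Find largest tile that matches
  Recurse on remaining subtrees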
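The three munch steps can be sketched for a tiny, made-up instruction set with five tiles: ld with offset (three nodes), ld without offset, addi, add, and li. The IR node classes and mnemonics are hypothetical; larger tiles are tried first at each node, and instructions are emitted after the subtrees have been munched so that operands are computed before use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical intermediate code nodes for the sketch.
abstract class IrNode { }
final class IrConst extends IrNode { final int value; IrConst(int v) { value = v; } }
final class IrTemp extends IrNode { final String name; IrTemp(String n) { name = n; } }
final class IrPlus extends IrNode {
    final IrNode left, right;
    IrPlus(IrNode l, IrNode r) { left = l; right = r; }
}
final class IrMem extends IrNode { final IrNode addr; IrMem(IrNode a) { addr = a; } }

final class Muncher {
    final List<String> code = new ArrayList<String>();
    private int next = 0;

    private String fresh() { return "r" + next++; }

    // Returns the register that holds the value of e.
    String munch(IrNode e) {
        if (e instanceof IrTemp) {
            return ((IrTemp) e).name; // already in a register
        }
        if (e instanceof IrMem) {
            IrNode a = ((IrMem) e).addr;
            if (a instanceof IrPlus && ((IrPlus) a).right instanceof IrConst) {
                // Largest tile: Mem(Plus(e1, Const)) is one load with offset.
                String base = munch(((IrPlus) a).left);
                String d = fresh();
                int offset = ((IrConst) ((IrPlus) a).right).value;
                code.add("ld " + d + ", " + offset + "(" + base + ")");
                return d;
            }
            String addr = munch(a);
            String d = fresh();
            code.add("ld " + d + ", 0(" + addr + ")");
            return d;
        }
        if (e instanceof IrPlus) {
            IrPlus p = (IrPlus) e;
            if (p.right instanceof IrConst) {
                String l = munch(p.left);
                String d = fresh();
                code.add("addi " + d + ", " + l + ", " + ((IrConst) p.right).value);
                return d;
            }
            String l = munch(p.left);
            String r = munch(p.right);
            String d = fresh();
            code.add("add " + d + ", " + l + ", " + r);
            return d;
        }
        // Single-node tile: load an immediate constant.
        String d = fresh();
        code.add("li " + d + ", " + ((IrConst) e).value);
        return d;
    }
}
```

For Mem(Plus(Mem(Plus(fp, 8)), 4)), a six-node tree, the sketch emits just two loads. Maximal munch is greedy, so it gives a locally good tiling rather than a guaranteed minimum-cost one.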
COMP 640 – CWB 33
Register Allocation
Given a finite number of registers, k,
  Minimize memory access
  Minimize re-calculation of results
We do our best to keep values in registers
When that is not possible, we "spill" registers to memory
  Store after calculation of value
  Load before use
COMP 640 – CWB 34
Algorithm
Use SSA (Static Single Assignment) – each assignment to a register value uses a new variable ("pseudo register")
Form graph of interfering variables
Remove nodes with < k neighbors from the graph and push them onto stack
If graph becomes empty (yea!)
  Pop nodes from stack and assign register
If not, select node(s) for spilling
  Insert spill code (causes new variables)
  Renumber the variables
  Repeat
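The simplify and select phases of this algorithm can be sketched as follows. The sketch stops and returns null where the slide says to pick a node for spilling; inserting spill code and repeating is left out. GraphColoring, graph and color are hypothetical names, and the interference graph is assumed to be given as a symmetric adjacency map with no self-loops.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the simplify/select phases only.
final class GraphColoring {
    // Builds a symmetric adjacency map for nodes 0..nodes-1.
    static Map<Integer, Set<Integer>> graph(int nodes, int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<Integer, Set<Integer>>();
        for (int n = 0; n < nodes; n++) {
            adj.put(n, new HashSet<Integer>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Returns pseudo register -> real register (0..k-1), or null when
    // every remaining node has >= k neighbors (a spill is needed).
    static Map<Integer, Integer> color(Map<Integer, Set<Integer>> adj, int k) {
        Map<Integer, Set<Integer>> work = new HashMap<Integer, Set<Integer>>();
        for (Map.Entry<Integer, Set<Integer>> e : adj.entrySet()) {
            work.put(e.getKey(), new HashSet<Integer>(e.getValue()));
        }
        // Simplify: repeatedly remove a node with fewer than k neighbors.
        Deque<Integer> stack = new ArrayDeque<Integer>();
        while (!work.isEmpty()) {
            Integer pick = null;
            for (Integer n : work.keySet()) {
                if (work.get(n).size() < k) { pick = n; break; }
            }
            if (pick == null) {
                return null;
            }
            for (Integer m : work.get(pick)) {
                work.get(m).remove(pick);
            }
            work.remove(pick);
            stack.push(pick);
        }
        // Select: pop nodes and give each the lowest register not used
        // by an already-colored neighbor; one is always free.
        Map<Integer, Integer> reg = new HashMap<Integer, Integer>();
        while (!stack.isEmpty()) {
            int n = stack.pop();
            boolean[] used = new boolean[k];
            for (Integer m : adj.get(n)) {
                Integer r = reg.get(m);
                if (r != null) used[r] = true;
            }
            int c = 0;
            while (used[c]) c++;
            reg.put(n, c);
        }
        return reg;
    }
}
```

A free register always exists in the select phase because a node's already-colored neighbors are exactly the ones still in the graph when it was removed, and there were fewer than k of those.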
COMP 640 – CWB 35
Example
load $0, a
load $1, b
add $2, $0, $1
store $2, c
load $3, d
add $4, $1, $3
add $5, $2, $3
sub $6, $4, $5
[Figure: diagram of pseudo registers $0 through $6]
k = 3
COMP 640 – CWB 36
Example, confounded
load $7, e
load $0, a
load $1, b
add $2, $0, $1
store $2, c
load $3, d
add $4, $1, $3
store $7, f
add $5, $2, $3
sub $6, $4, $5
[Figure: diagram of pseudo registers $0 through $7]
k = 3
COMP 640 – CWB 37
Example, Fixed by Spilling $2
load $7, e
load $0, a
load $1, b
add $2, $0, $1
store $2, tmp1
store $2, c
load $3, d
add $4, $1, $3
store $7, f
load $8, tmp1
add $5, $8, $3
sub $6, $4, $5
[Figure: diagram of pseudo registers $0 through $8]
k = 3
COMP 640 – CWB 38
Next Week
Declarative (logic) languages
Read Chapter 9 of T&N