06.Intermediate Code Generation
-
Upload
aregia-ovi -
Category
Documents
-
view
239 -
download
3
Transcript of 06.Intermediate Code Generation
Why Intermediate Code?
Front-end: generates intermediate representation
Back-end: generates target code
Why Intermediate code?
Retargeting is facilitated
A machine-independent code optimizer can be applied
Intermediate
Code Parser
Static Checker
Intermediate Code
Generator
Code Generator
Intermediate Languages Syntax trees, Postfix notation,
Three-address code (x := y op z)
Example: a := b * -c + b * -c
assign
+
*
a
b uniminus
c
*
b uniminus
c
Syntax tree:
assign
+
*
a
b uniminus
c
DAG:
Postfix notation: a b c uniminus * b c uniminus * + assign
Three-Address Code Sequence of statements:
x := y op z
x + y * z => t1 := y * z
t2 := x + t1
Example: a := b * -c + b * -c
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
Linearized representation of a syntax tree
t1 := - c
t2 := b *t1
t5 := t2 + t2
a := t5
Linearized representation of a syntax DAG
Types of Three-Address Statements Assignment statements: x := y op z and x := op y
Indexed assignments: x := y[ i ] and x[ i ] := y
Pointer assignments: x := & y, x := * y, * x := y
Copy statements: x := y
Unconditional jumps: goto L
Conditional jumps: if x relop y goto L
Function calls: param x, call p, n, and return y
p(x1,..., xn) => param x1
. . .
param xn
call p, n
Syntax-directed Translation into Three-Address Code Temporary names are created
E → E1 + E2 => t := E1 + E2,
e.g. t5 := t2 + t4
a := t5
When expression is an identifier – no new temporary
Nonterminal E has two attributes:
E.place – the name which holds the value of E
E.code – three-address code sequence
Syntax-directed Translation
Production Semantic Rule
S → id := E S.code := E.code || gen( id.place ‘:=‘ E.place )
E → E 1 + E2 E.place := newtemp(); E.code := E1.code || E2.code || gen( E.place ‘:=’ E1.place ‘+’ E2.place )
E → E1 * E2 E.place := newtemp(); E.code := E1.code || E2.code || gen( E.place ‘:=’ E1.place ‘*’ E2 .place )
E → - E1 E.place := newtemp(); E.code := E1.code || gen( E.place ‘:=’ ‘uminus ’ E1.place )
E → ( E1 ) E.place := E1.place E.code := E.code
E → id E.place := id.place E .code :=‘’
Function newtemp() returns distinct names (t1, t2…)
gen(x ‘:=‘ y ‘op’ z) => x := y op z
To produce three-address code for assignments
Flow-of-control Statements
Production Semantic Rule
S → while E do S1 S.begin := newlabel() S.after := newlabel() S.code := gen( S.begin ‘:’ ) || E.code || gen ( ‘if ’ E.place ‘=‘ ‘0’ ‘goto’ S.after ) || S1.code || gen ( ‘goto’ S.begin) || gen( S.after ‘:’ )
While statements
E.code
...
if E.place = 0 goto S.after
S1.code
...
goto S.begin
...
S.begin :
S.after :
Flow-of-control Statements Example
i := 2 * n + k
while
i := 2 * n + k
while i do
i := i - k
Source program fragment
t1 := 2
L1: if i = 0 goto L2
L2:
t1 := 2
t2 := t1 * n
t3 := t2 + k
i := t3
L1: if i = 0 goto L2
t4 := i - k
i := t4
goto L1
L2:
Three-address code sequence
Implementation … (1)
Quadruple is record: op, arg 1, arg 2, result
Quadruples, triples, and idirect triples
op arg 1 arg 2 result
(0) uniminus c t1
(1) * b t1 t2
(2) uniminus c t3
(3) * b t3 t4
(4) + t2 t4 t5
(5) := t5 a
Pro: easy to rearrange code for global optimization Cons: lots of temporaries
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Quads (quadruples)
Implementation … (2)
Triple is record: op, arg 1, arg 2
Quadruples, triples, and idirect triples
op arg 1 arg 2
(0) uniminus c
(1) * b (0)
(2) uniminus c
(3) * b (2)
(4) + (1) (3)
(5) := a (4)
Pro: temporaries are implicit Cons: difficult to rearrange code
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Triples
Implementation … (3)
Implicit Triples: pointers to triples
Quadruples, triples, and idirect triples
op arg 1 arg 2
(14) uniminus c
(15) * b (14)
(16) uniminus c
(17) * b (16)
(18) + (15) (17)
(19) := a (18)
Pro: temporaries are implicit and it is easier to rearrange code
t1 := - c
t2 := b *t1
t3 :=
t4 := b * t3
t5 := t2 + t4
a := t5
t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
a := b * -c + b * -c
Three-address code Implicit Triples Triple Container
statement
(0) (14)
(1) (15)
(2) (16)
(3) (17)
(4) (18)
(5) (19)
Names and Scope The three-address code is simplistic
it assumes that the names of variables can be easily resolved by the back end in global or local variables
We need local symbol tables to
record global and local declarations in procedures, blocks, and structs to resolve names
Declarations in a Procedure
P → M D
M → ε { offset := 0 }
D → D ; D
D → id : T { enter( id.name, T.type, offset); offset := offset + T.width }
T → integer { T.type := integer; T.width := 4 }
T → real { T.type := real; T.width := 8 }
T → array [ num ] of T1 { T.type := array( num.val, T1.type ); T.width := num.val x T1.width }
T → ^ T1 { T.type := pointer( T1.type ); T.width := 4 }
Keeping Track of Scope Information
Nested procedures
P → D
D → D ; D | id : T | proc id ; D ; S
D → proc id ; D1 ; S: then a new symbol table is created
The new table points to the enclosing procedure’s table
Procedure enter() ‘knows’ in which table to make an entry
Symbol Table Operations
mktable( previous ): returns a pointer to a new table which is linked to a previous table in the outer scope
enter( table, name, type, offset ): creates a new entry in table
addwidth( table, width ): accumulates the total width of all entries in table
enterproc( table, name, newtable ): creates a new entry in table for procedure with local scope newtable
Processing Declarations Nested Procedures
P → M D { addwidth( top( tblptr ), top( offset ) ); pop( tblptr ); pop( offset ) }
M → ε { t := mktable( nil ); push( t, tblptr ); push( 0, offset ) }
D → D1 ; D2
D → proc id ; N D1 ; S
{ t := top( tblptr ); addwidth( t, top( offset ) ); pop( tblptr ); pop( offset ); enterproc( top( tblptr ), id.name, t ) }
D → id : T { enter( top ( tblptr ), id.name, T.type, top( offset ) ); top( offset ) := top( offset ) + T.width }
N → ε { t := mktable( top( tblptr ) ); push( t, tblptr ); push( 0, offset ) }
Assignment Statements
S → id := E { p := lookup( id.name ); if p <> nil then emit( p ‘:=‘ E.place ) else error }
E → E1 + E2 { E.place := newtemp; emit( E.place ‘:=‘ E1.place ‘+’ E2.place ) }
E → E1 * E2 { E.place := newtemp; emit( E.place ‘:=‘ E1.place ‘*’ E2.place ) }
E → - E1
{ E.place := newtemp; emit( E.place ‘:=‘ ‘uniminus’ E1.place )}
E → ( E1 ) { E.place := E1.place }
E → id { p := lookup( id.name ); if p <> nil then E.place := p else error }
A Translation Scheme
Reusing Temporary Names
Each time newtemp() generates a new temporary name
Can we reuse temporaries?
E → E1 + E2 results in:
Modify newtemp() to use a “stack ”: Keep a counter c, initialized to 0
newtemp() increments c and returns temporary $c
Decrement counter on each use of a $i in a three-address statement
evaluate E1 into t1
evaluate E2 into t2
t3 := t1 + t2
evaluate E1 into t1
evaluate E2 into t2
t3 := t1 + t2
If t1 is no longer used, can reuse t1 instead of using new temp t3
Reusing Temporary Names
Statement Value of c
0
$0 := a * b 1
$1 := c * d 2
$0 := $0 + $1 1
$1 := e * f 2
$0 := $0 - $1 1
X := $0 0
Example: x := a * b + c * d – e * f
Addressing Array Elements
A : array [low .. high] of type;
ith element of A begins in location
base + (i – low) x W
base = A[low], W = sizeof( type )
Compile time evaluation if rewritten as
i x W + ( base – low x W )
c = base – low x W is stored in symbol-table entry for A
A[i] = i x W + c
One-Dimensional Arrays
t1 := c
t2 := i * W
t3 := t1[t2]
… := t3
t1 := c
t2 := i * W
t3 := t1[t2]
… := t3
Addressing Array Elements
In Row-major form A[i,j] is equivalent to A[i][j]
Two-Dimensional Arrays... (1)
A[1,1]
A[1,2]
A[1,3]
A[2,1]
A[2,2]
A[2,3]
A[1,1]
A[2,1]
A[1,2]
A[2,2]
A[1,3]
A[2,3]
Row-major Column-major
First Column
First Row
A : array [1..2, 1..3] of integer;
Addressing Array Elements
A[i,j]= base+((i–lowi)x Nj+j-lowj)x W
Nj = highj-lowj+1
((i x Nj) + j) x W + c, where
c = base - ((lowi x Nj) + lowj) x W
Two-Dimensional Arrays... (2)
A : array [1..2, 1..3] of integer; // Row-major form
t1 := i * 3
t1 := t1 + j
t2 := c
t3 := t1 * 4
t4 := t2[t3]
… := t4
t1 := i * 3
t1 := t1 + j
t2 := c
t3 := t1 * 4
t4 := t2[t3]
… := t4
Addressing Array Elements Grammar
Synthesized Attributes:
E.place name of temp holding value of E
Elist.array array name
Elist.place name of temp holding index value
Elist.ndim number of array dimensions
L.place lvalue (=name of temp)
L.offset index into array (=name of temp)
null indicates non-array simple id
S → L := E E L Elist
S → L := E E → E + E | ( E ) | L L → Elist ] | id Elist → Elist , E | id [ E
Addressing Array Elements A Translation Scheme ... (1)
S → L := E { if L.offset = null then /* L is a simple id */ emit( L.place ‘:=‘ E.place ) else emit( L.place’ [‘ L.offset ‘]’ ‘:=‘ E.place ) }
E → E1 + E2 { E.place := newtemp; emit( E.place ‘:=‘ E1.place ‘+’ E2.place ) }
E → ( E1 ) { E.place := E1.place }
E → L
{ if L.offset = null then /* L is a simple id */ E.place := L.place else begin E.place := newtemp; emit( E.place ‘:=‘L.place’ [‘ L.offset ‘]’ ) end }
Addressing Array Elements A Translation Scheme... (2)
L → Elist ] { L.place := newtemp; L.offset := newtemp emit( L.place ‘:=‘ c( Elist.array ) ) emit( L.offset ‘:=‘ Elist.place ‘*’ width( Elist.array )) }
L → id { L.place := id.place; L.offset := null }
Elist → Elist 1, E { t := newtemp; m := Elist 1.ndim +1; emit( t ‘:=‘ Elist 1.place ‘*’ limit( Elist 1.array, m )); emit( t ‘:=‘ t ‘+’ E.place ); Elist.array := Elist 1.array; Elist.place := t; Elist.ndim := m }
Elist → id [ E
{ Elist.array := id.place; Elist.place := E.place; Elist.ndim := 1 }
Translating Boolean Expressions
Composed by boolean operators and, or, not
Applied to boolean variables or relational expressions
Relational expression: E1 relop E2, where
E1 and E2 are arithmetic expressions
The grammar:
Attribute ‘op’: <, ≤, =, ≠, >, ≥ represented by relop
E → E or E | E and E | not E | ( E ) | id relop id | true | false E → E or E | E and E | not E | ( E ) | id relop id | true | false
Translating Boolean Expressions
True = 1, false = 0
Example:
a or b and not c
a < b: “if a<b then 1 else 0”
Numerical representation
t1 := not c
t2 := b and t1
t3 := a or t2
t1 := not c
t2 := b and t1
t3 := a or t2
100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
Translating Boolean Expressions A Translation Scheme
E → E1 or E2 { E.place := newtemp; emit( E.place ‘:=‘ E1.place ‘or’ E2.place ) }
E → E1 and E2 { E.place := newtemp; emit( E.place ‘:=‘ E1.place ‘and’ E2.place ) }
E → not E1 { E.place := newtemp; emit( E.place ‘:=‘ ‘not’ E1.place ) }
E → ( E1 ) { E.place := E1.place }
E → id1 relop id2 { E.place := newtemp; emit( ‘if ’ id1.place relop.op id2.place ‘goto’ nextstat + 3 ); emit( E.place ‘:=‘ ‘0’ ); emit( ‘goto’ nextstat + 2 ); emit( E.place ‘:=‘ ‘1’ ) }
E → true { E.place := newtemp; emit( E.place ‘:=‘ ‘1’ ) }
E → false { E.place := newtemp; emit( E.place ‘:=‘ ‘0’ ) }
Backpatching
Code generation problem:
Labels (addresses) that control must go to may not be known at the time the jump statements are generated
A solution - backpatching:
Generate branching statements with empty jumps
Put the statements in a ‘goto’ list
Fill the labels of these statements when the proper label is determined
Backpatching Quadruples in a quadruple array
Labels are indexes into the array
Manipulate the list of lables:
makelist( i ) creates a new list containing three-address location i, returns a pointer to the list
merge( p1 , p2 ) concatenates lists pointed to by p1 and p2, returns a pointer to the concatenates list
backpatch( p, i ) inserts i as the target label for each of the statements in the list pointed to by p
Backpatching Grammar for boolean expressions
Synthesized Attributes:
E.code three-address code
E.truelist backpatch list for jumps on true
E.falselist backpatch list for jumps on false
M.quad location of current three-address quad
E → E1 or M E2 M E2 id1 relop id2 M
E → E1 or M E2 | E1 and M E2 | not E1 | ( E1 ) | id1 relop id2 | true | false M → ε
Backpatching A Translation Scheme... (1)
E → E1 or M E2 { backpatch( E1.falselist, M.quad ); E.truelist := merge( E1.truelist, E2.truelist ); E.falselist := E2.falselist }
E → E1 and M E2 { backpatch( E1.truelist, M.quad ); E.truelist := E2.truelist; E.falselist := merge( E1.falselist, E2.falselist ) }
E → not E1 { E.truelist := E1.falselist; E.falselist := E1.truelist }
E → ( E1 ) { E.truelist := E1.truelist; E.falselist := E1.falselist }
Backpatching A Translation Scheme... (2)
E → id1 relop id2 { E.truelist := makelist( nextquad ); E.falselist := makelist( nextquad + 1 ); emit( ‘if ’ id1.place relop.op id2.place ‘goto_’ ); emit( ‘goto_’ ) }
E → true { E.truelist := makelist( nextquad ); emit( ‘goto_’ ) }
E → false { E.falselist := makelist( nextquad ); emit( ‘goto_’ ) }
M → ε { M.quad := nextquad }
Backpatching Example: a<b or c<d and e<f
E → id1 relop id2
E → E1 or M E2
(M.quad := 102)
E → E1 and M E2
(M.quad := 104)
backpatch( {102}, 104 )
backpatch( {101}, 102 )
100: if a < b goto_
101: goto_
102: if c < d
103: goto_
104: if e < f
105: goto_
100: if a < b goto_
101: goto_
102: if c < d goto_
103: goto_
104: if e < f goto_
105: goto_
100: if a < b goto TRUE
101: goto 102
102: if c < d
103: goto FALSE
104: if e < f
105: goto FALSE
100: if a < b goto TRUE
101: goto 102
102: if c < d goto 104
103: goto FALSE
104: if e < f goto TRUE
105: goto FALSE
backpatch( {102}, 104 )
backpatch( {101}, 102 )
Flow-of-control Statements Grammar
Synthesized Attributes:
E.code three-address code
E.truelist backpatch list for jumps on true
E.falselist backpatch list for jumps on false
M.quad location of current three-address quad
S.nextlist backpatch list for jumps to the next statement after S (or nil)
L.nextlist backpatch list for jumps to the next statement after L (or nil)
S → if E then S L
S → if E then S | if E then S else S | while E do S | begin L end | A L → L ; S | S
Flow-of-Control Statements A Translation Scheme using backpatching... (1)
S → if E then M1 S1 N else M2 S2
{ backpatch( E.truelist, M1.quad ); backpatch( E.falselist, M2.quad ); S.nextlist := merge(S1.nextlist, merge(N.nextlist,S2.nextlist) }
N → ε { N.nextlist := makelist( nextquad ); emit( ‘goto_’) }
M → ε { M.quad := nextquad }
S → if E then M S1 { backpatch( E.truelist, M.quad ); S.nextlist := merge( E.falselist, S1.nextlist) }
S → wihle M1 E do M2 S1
{ backpatch( S1.nextlist, M1.quad ); backpatch( E.truelist, M2.quad ); S.nextlist := E.falselist; emit( ‘goto_’ M1.quad ) }
Flow-of-Control Statements A Translation Scheme using backpatching... (2)
S → begin L end { S.nextlist := L.nextlist }
S → A { S.nextlist := nil }
L → L1 ; M S { backpatch( L1.nextlist, M.quad ); L.nextlist := S.nextlist }
L → S { L.nextlist := S.nextlist }