Syntax-Directed Translation and Intermediate Code Generation
PART 4 - SYNTAX DIRECTED TRANSLATION
F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2015 109 / 264
Setting
Translation of context-free languages
Information ↔ attributes of grammar symbols
Values of attributes are defined by “semantic rules”
2 possibilities:
    Syntax directed definitions (high-level spec)
    Translation schemes (implementation details)
Evaluation: (1) Parse input, (2) Generate parse tree, (3) Evaluate parse tree
Syntax directed definitions
Generalization of context-free grammars
Each grammar symbol has a set of attributes
Synthesized vs. inherited attributes
Attribute: string, number, type, memory location, . . .
Value of attribute is defined by semantic rules
Synthesized: value is computed from the attributes of the child nodes in the parse tree
Inherited: value is computed from the attributes of the parent and sibling nodes in the parse tree
Semantic rules define dependencies between attributes
Dependency graph defines calculation order of semantic rules
Semantic rules can have side effects
Form of a syntax directed definition
Grammar production: A→ α
Associated semantic rule: b := f(c1, . . . , ck)
f is a function
Synthesized: b is a synthesized attribute of A and c1, . . . , ck are attributes of grammar symbols of the production
Inherited: b is an inherited attribute of a grammar symbol on the right side of the production and c1, . . . , ck are attributes of grammar symbols of the production
b depends on c1, . . . , ck
Example
“Calculator” program: val is a synthesized attribute for the nonterminals E, T and F

Production      Semantic Rule
L → E n         print(E.val)
E → E1 + T      E.val := E1.val + T.val
E → T           E.val := T.val
T → T1 * F      T.val := T1.val ∗ F.val
T → F           T.val := F.val
F → (E)         F.val := E.val
F → digit       F.val := digit.lexval
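The semantic rules above can be read as a post-order walk over the parse tree. The following Python sketch illustrates this; the tuple encoding of parse-tree nodes and the function name evaluate are assumptions made for illustration and do not come from the slides.

```python
# Parse-tree nodes are tuples: (production, children...).
# The production labels mirror the table; the encoding itself is hypothetical.

def evaluate(node):
    """Return the synthesized attribute val of a parse-tree node."""
    kind = node[0]
    if kind == "E -> E + T":
        return evaluate(node[1]) + evaluate(node[2])
    if kind == "T -> T * F":
        return evaluate(node[1]) * evaluate(node[2])
    if kind in ("E -> T", "T -> F", "F -> ( E )"):
        return evaluate(node[1])      # copy rule
    if kind == "F -> digit":
        return node[1]                # digit.lexval
    raise ValueError(f"unknown production: {kind}")

# Parse tree for 3*5+4 (the end marker n is left out)
three = ("T -> F", ("F -> digit", 3))
five = ("F -> digit", 5)
four = ("T -> F", ("F -> digit", 4))
tree = ("E -> E + T", ("E -> T", ("T -> T * F", three, five)), four)

print(evaluate(tree))
```

Because every attribute is synthesized, a node's value only needs the values of its children, so a single bottom-up pass is enough.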
S-attributed grammar
Attributed grammar exclusively using synthesized attributes
Example evaluation: 3*5+4n (annotated parse tree)

L
    E.val=19
        E.val=15
            T.val=15
                T.val=3
                    F.val=3
                        digit.lexval=3
                *
                F.val=5
                    digit.lexval=5
        +
        T.val=4
            F.val=4
                digit.lexval=4
    n
Inherited attributes
Definition of dependencies of programming language constructs on their context
Example (type checking):

Production      Semantic Rule
D → T L         L.in := T.type
T → int         T.type := integer
T → real        T.type := real
L → L1 , id     L1.in := L.in
                addtype(id.entry, L.in)
L → id          addtype(id.entry, L.in)
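A small Python sketch of how the inherited attribute L.in could be passed down as a function argument. The tuple encoding of the parse tree, the dictionary symbol_table and the function names declare and declare_list are assumptions for illustration; only addtype is named in the slides.

```python
symbol_table = {}   # hypothetical symbol table: identifier -> type

def addtype(entry, type_):
    symbol_table[entry] = type_

def declare_list(node, l_in):
    """node is ("L -> L , id", sublist, name) or ("L -> id", name)."""
    if node[0] == "L -> L , id":
        declare_list(node[1], l_in)    # L1.in := L.in
        addtype(node[2], l_in)         # addtype(id.entry, L.in)
    else:
        addtype(node[1], l_in)         # addtype(id.entry, L.in)

def declare(node):
    """node is ("D -> T L", type_keyword, list_node)."""
    _, keyword, list_node = node
    t_type = {"int": "integer", "real": "real"}[keyword]   # T.type
    declare_list(list_node, t_type)                        # L.in := T.type

# Declaration: real id1, id2, id3
decl = ("D -> T L", "real",
        ("L -> L , id", ("L -> L , id", ("L -> id", "id1"), "id2"), "id3"))
declare(decl)
print(symbol_table)
```

The type flows from T sideways into the list and then down to every identifier, which is exactly the direction a synthesized attribute cannot take.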
Inherited attributes – Annotated parse tree
Input: real id1, id2, id3

D
    T.type=real
        real
    L.in=real
        L.in=real
            L.in=real
                id1
            ,
            id2
        ,
        id3
Dependency graphs
Show dependencies between attributes
Each rule is represented in the form b := f(c1, . . . , ck)
Nodes correspond to attributes; edges to dependencies
Definition:
for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a
for each node n in the parse tree do
    for each semantic rule b := f(c1, . . . , ck) associated with the production used at n do
        for i := 1 to k do
            construct an edge from the node for ci to the node for b
Dependency graph – Example
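[Figure: dependency graph for real id1, id2, id3, drawn over the parse tree. Its ten nodes, numbered 1 to 10, cover the entry attributes of id1, id2 and id3, the type attribute of T, the in attributes of the three L nodes, and the addtype calls.]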
Topological sort
Arrangement m1, . . . , mk of the nodes of a directed acyclic graph in which every edge points from an earlier node to a later node
If mi → mj is an edge, then mi comes before mj in the arrangement
Important for the order in which the attributes are calculated
Example (cont.):
    1  a4 := real
    2  a5 := a4
    3  addtype(id3.entry, a5)
    4  a7 := a5
    5  addtype(id2.entry, a7)
    6  a9 := a7
    7  addtype(id1.entry, a9)
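A sketch of computing such an evaluation order with Python's standard graphlib module (available from Python 3.9). The node names and the dependency map are read off the example and are partly assumed, in particular the nodes for the addtype calls and for the entry attributes.

```python
from graphlib import TopologicalSorter

# Maps each node of the dependency graph to the nodes it depends on.
depends_on = {
    "a5": {"a4"},
    "addtype(id3)": {"a5", "id3.entry"},
    "a7": {"a5"},
    "addtype(id2)": {"a7", "id2.entry"},
    "a9": {"a7"},
    "addtype(id1)": {"a9", "id1.entry"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
position = {node: i for i, node in enumerate(order)}
print(order)
```

A topological order is usually not unique, so the printed order may differ from the numbering on the slide while still respecting every edge.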
Example - syntax trees
Abstract syntax tree = simplified form of a parse tree
Operators and keywords do not appear as leaf nodes but label the interior nodes
Chains of productions with only one element can be collapsed
Examples:

if-then-else
    B
    S1
    S2

+
    *
        3
        5
    4
Syntax trees – Expressions
Functions (return value: pointer to new node):
    mknode(op, left, right): node with label op and 2 child nodes left, right
    mkleaf(id, entry): leaf with label id and a pointer entry to the symbol-table entry
    mkleaf(num, val): leaf with label num and value val

Syntax directed definition:
Production      Semantic Rule
E → E1 + T      E.nptr := mknode('+', E1.nptr, T.nptr)
E → E1 − T      E.nptr := mknode('−', E1.nptr, T.nptr)
E → T           E.nptr := T.nptr
T → (E)         T.nptr := E.nptr
T → id          T.nptr := mkleaf(id, id.entry)
T → num         T.nptr := mkleaf(num, num.val)
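A minimal Python sketch of the two constructor functions, followed by the sequence of calls a bottom-up parser might make for a-4+c. The Node class, its field names and the helper to_text are assumptions for illustration; identifier names stand in for pointers to symbol-table entries.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class Node:
    label: str                            # operator, "id" or "num"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Union[str, int, None] = None   # symbol-table entry or number

def mknode(op, left, right):
    return Node(op, left, right)

def mkleaf(kind, value):
    return Node(kind, value=value)

def to_text(n):
    """Render the tree with explicit parentheses (helper for inspection)."""
    if n.left is None and n.right is None:
        return str(n.value)
    return f"({to_text(n.left)} {n.label} {to_text(n.right)})"

# a - 4 + c, built bottom-up
p1 = mkleaf("id", "a")
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)
print(to_text(p5))
```

The left-recursive productions make the tree left-associative: the subtraction ends up below the addition.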
Syntax trees – Expressions (ex.)
Syntax tree for a-4+c

[Figure: parse tree for a-4+c in which every E and T node carries an nptr attribute pointing into the syntax tree.]

'+'
    '-'
        id (to entry for a)
        num 4
    id (to entry for c)
Evaluation of S-attributed definitions
Attributed definition exclusively using synthesized attributes
Evaluation using bottom-up parser (LR parser)
Idea: store attribute information on the stack

        State   Val
        . . .   . . .
        X       X.x
        Y       Y.y
top →   Z       Z.z
        . . .   . . .

Semantic rule: A.a := f(X.x, Y.y, Z.z)
Production: A → X Y Z
Before X Y Z is reduced to A, the value of Z.z is stored in val[top], Y.y in val[top − 1] and X.x in val[top − 2]
Example - S-attributed evaluation
“Calculator”-example:
Production      Code Fragment
L → E n         print(val[top − 1])
E → E1 + T      val[ntop] := val[top − 2] + val[top]
E → T
T → T1 ∗ F      val[ntop] := val[top − 2] ∗ val[top]
T → F
F → (E)         val[ntop] := val[top − 1]
F → digit

The code is executed before the reduction
ntop = top − r + 1, where r is the length of the right side of the production; after the reduction: top := ntop
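The code fragments can be tried out without a full LR parser by replaying the parser's shift and reduce decisions by hand. The Python sketch below does this; the action list, the table RHS_LENGTH and the function run are assumptions for illustration.

```python
# Length r of the right side of each production.
RHS_LENGTH = {"L -> E n": 2, "E -> E + T": 3, "E -> T": 1,
              "T -> T * F": 3, "T -> F": 1, "F -> ( E )": 3, "F -> digit": 1}

def run(actions):
    """Replay shift/reduce actions on the value stack; return printed values."""
    val, printed = [], []
    for kind, arg in actions:
        if kind == "shift":
            val.append(arg)              # lexval, or None for symbols without value
            continue
        top = len(val) - 1
        ntop = top - RHS_LENGTH[arg] + 1
        if arg == "L -> E n":
            printed.append(val[top - 1])
        elif arg == "E -> E + T":
            val[ntop] = val[top - 2] + val[top]
        elif arg == "T -> T * F":
            val[ntop] = val[top - 2] * val[top]
        elif arg == "F -> ( E )":
            val[ntop] = val[top - 1]
        del val[ntop + 1:]               # top := ntop
    return printed

# Hand-written action sequence for the input 3*5+4n
actions = [("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
           ("shift", None), ("shift", 5), ("reduce", "F -> digit"),
           ("reduce", "T -> T * F"), ("reduce", "E -> T"),
           ("shift", None), ("shift", 4), ("reduce", "F -> digit"),
           ("reduce", "T -> F"), ("reduce", "E -> E + T"),
           ("shift", None), ("reduce", "L -> E n")]
printed = run(actions)
print(printed)
```

Productions without a code fragment have a right side of length 1, so ntop equals top and the value simply stays where it is.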
Result for 3*5+4n
Input       State       Val         Production used
3*5+4n
*5+4n       3           3
*5+4n       F           3           F → digit
*5+4n       T           3           T → F
5+4n        T *         3
+4n         T * 5       3 5
+4n         T * F       3 5         F → digit
+4n         T           15          T → T ∗ F
+4n         E           15          E → T
4n          E +         15
n           E + 4       15 4
n           E + F       15 4        F → digit
n           E + T       15 4        T → F
n           E           19          E → E + T
            E n         19
            L           19          L → E n
L-attributed definitions
Definition: A syntax directed definition is L-attributed if each inherited attribute of Xj, 1 ≤ j ≤ n, on the right side of A → X1 . . . Xn depends only on:
    1 the attributes of the symbols X1, . . . , Xj−1 to the left of Xj and
    2 the inherited attributes of A
Each S-attributed grammar is an L-attributed grammar
Evaluation using depth-first order:
procedure dfvisit(n : node)
    for each child m of n, from left to right do
        evaluate inherited attributes of m
        dfvisit(m)
    end
    evaluate synthesized attributes of n
end
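A Python sketch of dfvisit on a toy tree. The two attributes are invented for illustration and are not from the slides: depth is inherited (parent's depth plus one) and size is synthesized (number of nodes in the subtree).

```python
class Node:
    def __init__(self, name, *children):
        self.name = name
        self.children = children
        self.depth = 0    # inherited attribute (the root keeps 0)
        self.size = 0     # synthesized attribute

def dfvisit(n):
    for m in n.children:                 # from left to right
        m.depth = n.depth + 1            # evaluate inherited attributes of m
        dfvisit(m)
    # evaluate synthesized attributes of n
    n.size = 1 + sum(m.size for m in n.children)

c = Node("c")
root = Node("root", Node("a", c), Node("b"))
dfvisit(root)
print(root.size, c.depth)
```

Inherited attributes are set on the way down and synthesized attributes on the way back up, so one traversal suffices for any L-attributed definition.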
Translation schemes
Translation scheme = context-free grammar with attributes for grammar symbols and semantic actions, which are placed on the right side of a production between grammar symbols and are enclosed in {}
Example:
    T → T1 ∗ F {T.val := T1.val ∗ F.val}
If only synthesized attributes are used, the action is always placed at the end of the right side of a production
Note: Actions may not access attributes which have not been calculated yet (this limits the positions of semantic actions)
Translation schemes (cont.)
If both inherited and synthesized attributes are used, the following needs to be taken into consideration:
    1 An inherited attribute of a symbol on the right side of a production has to be calculated in an action which is positioned to the left of the symbol
    2 An action may not reference a synthesized attribute belonging to a symbol which is positioned to the right of the action
    3 A synthesized attribute of the nonterminal on the left side can only be calculated if all referenced attributes have already been calculated ⇒ actions like these are usually placed at the end of the right side
Example translation scheme
S → A1 A2 {A1.in := 1; A2.in := 2}
A → a {print(A.in)}

The above scheme does not fulfill the three conditions for translation schemes
The inherited attribute A.in is not yet defined at the point in time when it should be printed
But: For each L-attributed grammar a translation scheme can be found which fulfills the three conditions, e.g.:

S → {A1.in := 1} A1 {A2.in := 2} A2
A → a {print(A.in)}
Top-down translation
Removal of left recursions in translation scheme is necessary
Example:
E → E1 + T {E.val := E1.val + T.val}
E → E1 − T {E.val := E1.val − T.val}
E → T {E.val := T.val}
T → (E) {T.val := E.val}
T → num {T.val := num.val}
Example top-down translation
E → T {R.i := T.val} R {E.val := R.s}
R → + T {R1.i := R.i + T.val} R1 {R.s := R1.s}
R → − T {R1.i := R.i − T.val} R1 {R.s := R1.s}
R → ε {R.s := R.i}
T → ( E ) {T.val := E.val}
T → num {T.val := num.val}
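This scheme maps directly onto a recursive-descent evaluator: the inherited attribute R.i becomes a parameter and the synthesized attribute R.s becomes the return value. The Python sketch below assumes an already tokenized input (a list of integers and operator strings); the function names translate, peek and advance are hypothetical.

```python
def translate(tokens):
    """Evaluate a token list such as [9, "-", 5, "+", 2]."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def E():
        return R(T())              # R.i := T.val; E.val := R.s

    def R(i):
        if peek() == "+":
            advance()
            return R(i + T())      # R1.i := R.i + T.val; R.s := R1.s
        if peek() == "-":
            advance()
            return R(i - T())      # R1.i := R.i - T.val; R.s := R1.s
        return i                   # R -> ε: R.s := R.i

    def T():
        tok = advance()
        if tok == "(":
            val = E()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return val             # T.val := E.val
        if isinstance(tok, int):
            return tok             # T.val := num.val
        raise SyntaxError(f"unexpected token {tok!r}")

    result = E()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return result

print(translate([9, "-", 5, "+", 2]))
```

Passing the running result down through R.i keeps the operators left-associative even though the grammar is now right-recursive.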
Evaluation of 9-5+2
E
    T.val = 9
        num.val = 9
    R.i = 9
        -
        T.val = 5
            num.val = 5
        R.i = 4
            +
            T.val = 2
                num.val = 2
            R.i = 6
                ε
Summary transformation
Given translation scheme:
    A → A1 Y {A.a := g(A1.a, Y.y)}
    A → X {A.a := f(X.x)}
After removal of left recursion:
    A → X R
    R → Y R | ε
Transformed scheme:
    A → X {R.i := f(X.x)} R {A.a := R.s}
    R → Y {R1.i := g(R.i, Y.y)} R1 {R.s := R1.s}
    R → ε {R.s := R.i}
Predictive parsing with schemes
Input: syntax-directed translation scheme; Output: syntax-directed translator
    1 For each nonterminal A, construct a function that has a formal parameter for each inherited attribute of A and that returns the values of the synthesized attributes of A. This function has a local variable for each attribute of each grammar symbol that appears in a production for A.
    2 As previously described (see predictive parsing), the code for nonterminal A decides what production to use based on the current input symbol.
    3 The code for each production does the following (evaluation from left to right):
        1 Token X with synthesized attribute x: Save the value of x in a variable X.x. Generate a call to match token X.
        2 Nonterminal B: Generate c := B(b1, . . . , bk); b1, . . . , bk are the variables for the inherited attributes of B; c is the variable for the synthesized attribute of B.
        3 For an action, copy the code into the parser, replacing each reference to an attribute by the variable for that attribute.
Example - predictive parsing
Grammar:
    E → T {R.i := T.nptr} R {E.nptr := R.s}
    R → op T {R1.i := mknode(op.lexeme, R.i, T.nptr)} R1 {R.s := R1.s}
    R → ε {R.s := R.i}
    T → ( E ) {T.nptr := E.nptr}
    T → num {T.nptr := mkleaf(num, num.val)}
Functions:
    function E : node
    function R(i : node) : node
    function T : node
Parsing procedure R
Procedure without translation scheme
procedure R()
begin
    if lookahead = op then begin
        match(op);
        T();
        R()
    end else begin
        return
    end
end
Parsing function R
function R(i : node) : node
var nptr, i1, s1, s : node; oplexeme : char;
begin
    if lookahead = op then begin
        oplexeme := lexval;
        match(op);
        nptr := T();
        i1 := mknode(oplexeme, i, nptr);
        s1 := R(i1);
        s := s1
    end else
        s := i;
    return s
end
Bottom-up with inherited attribute
Implementation of L-attributed grammars in bottom-up parsers
For LL(1) grammars and many LR(1) grammars
Removal of embedded actions from translation schemes:
    Actions have to be placed at the end of the right side of a production
    Ensured by new marker nonterminals
Example:
    E → T R
    R → + T {print('+')} R | − T {print('−')} R | ε
    T → num {print(num.val)}
is transformed into:
    E → T R
    R → + T M R | − T N R | ε
    T → num {print(num.val)}
    M → ε {print('+')}
    N → ε {print('−')}
Inherited attributes on the stack
Idea: Production A → X Y, synthesized attribute X.x and inherited attribute Y.y
Before a reduction (of X Y), X.x is on the stack
In the case of Y.y = X.x (copy action), the value of X.x can be used whenever the value of Y.y is required
Example: Parser for variable declarations such as real p,q,r
Variable declaration - example
D → T {L.in := T.type} L
T → int {T.type := integer}
T → real {T.type := real}
L → {L1.in := L.in} L1 , id {addtype(id.entry, L.in)}
L → id {addtype(id.entry, L.in)}

Parse tree for real p,q,r (T carries the attribute type, every L the attribute in):
D
    T
        real
    L
        L
            L
                p
            ,
            q
        ,
        r
Calculation using the stack
Input           State       Production used
real p,q,r
p,q,r           real
p,q,r           T           T → real
,q,r            T p
,q,r            T L         L → id
q,r             T L ,
,r              T L , q
,r              T L         L → L , id
r               T L ,
                T L , r
                T L         L → L , id
                D           D → T L

Implementation:
Production      Code Fragment
D → T L
T → int         val[top] := integer
T → real        val[top] := real
L → L , id      addtype(val[top], val[top − 3])
L → id          addtype(val[top], val[top − 1])
Problems
Positions of attributes on the stack need to be known
Production      Semantic Rule
S → a A C       C.i := A.s
S → a A B C     C.i := A.s
C → c           C.s := g(C.i)

When the reduction C → c is performed, it is unknown whether the value of C.i is located in val[top − 1] or in val[top − 2]! It depends on whether a B is located on the stack.

Solution: Introduction of a marker M:
Production      Semantic Rule
S → a A C       C.i := A.s
S → a A B M C   M.i := A.s; C.i := M.s
C → c           C.s := g(C.i)
M → ε           M.s := M.i
Problems (cont.)
Simulation of semantic rules which are not copy actions
Usage of a marker!

Production      Semantic Rule
S → a A C       C.i := f(A.s)
is replaced by:
S → a A N C     N.i := A.s; C.i := N.s
N → ε           N.s := f(N.i)
Bottom-up parsing . . .
. . . with calculation of inherited attributes
Input: L-attributed definition (and LL(1) grammar)
Output: Parser which calculates attribute values on the stack
    1 Assumptions: Each nonterminal A has an inherited attribute A.i, each grammar symbol X has a synthesized attribute X.s. If X is a terminal, then X.s is the lexical value of X (supplied by the lexical analyser). The values are stored on the stack in the form of an array val.
    2 For each production A → X1 . . . Xn create n new markers (nonterminals) M1, . . . , Mn and replace the production with A → M1 X1 . . . Mn Xn. Note: synthesized values of Xi are stored in the val array entry which belongs to Xi. Inherited values Xi.i are stored in the entries which are associated with Mi.
    3 Invariant: The inherited attribute A.i (if existing) is always directly beneath the position of M1 within the val array.
Simplifications
Reduction of markers:
    1 If Xj has no inherited attribute, then no marker Mj is required ⇒ positions of attributes on the stack are shifting!
    2 If X1.i exists and is calculated by X1.i = A.i, then M1 is not required
Removal of inherited attributes
Replacement of inherited attributes by synthesized ones
Not always possible
Requires modification of the grammar!
Example: Declarations in Pascal
    D → L : T
    T → integer | char
    L → L , id | id
convert to:
    D → id L
    L → , id L | : T
    T → integer | char
Difficult syntax directed definition
The following definition cannot be processed by bottom-up parsers using the approaches presented so far:

Production      Semantic Rule
S → L           L.count := 0
L → L1 1        L1.count := L.count + 1
L → ε           print(L.count)

Reason: L → ε receives the number of 1s by means of inheritance
However, as L → ε is the first production used in a reduction, no value is available yet!
Recursive evaluators
Evaluation of attributes
Based on parse tree
Not possible in conjunction with parsing
Order of nodes which are visited during evaluation is arbitrary
For each nonterminal a translation function exists
Extensions may visit nodes more than once
Order of node visits needs to respect the following:
    1 Each inherited attribute of a node has to be calculated before the node is visited
    2 Synthesized attributes are calculated before the node is left (for the last time)
Order is determined by dependencies
Example – Recursive evaluators
Production      Semantic Rules
A → L M         L.i := l(A.i)
                M.i := m(L.s)
                A.s := f(M.s)
A → Q R         R.i := r(A.i)
                Q.i := q(R.s)
                A.s := f(Q.s)

[Figure: dependency graphs for the productions A → L M and A → Q R, showing the attributes i and s of each node.]

function A(n, ai)
    if production(n) = 'A → LM' then
        li := l(ai)
        ls := L(child(n, 1), li)
        mi := m(ls)
        ms := M(child(n, 2), mi)
        return f(ms)
    if production(n) = 'A → QR' then
        ri := r(ai)
        rs := R(child(n, 2), ri)
        qi := q(rs)
        qs := Q(child(n, 1), qi)
        return f(qs)
PART 5 - TYPE CHECKING
Static Program Checking
Type Checks: Check of the used type. Error if operands are incompatible with the used operator. Example: 1.2 + 2 (real + int).
Flow-of-Control Checks: Check if the transfer of the program execution is possible. Example: break needs an enclosing loop; goto label needs a defined label.
Uniqueness Checks: Check if an object has been defined exactly once. Example: In Pascal each identifier must be unique.
Name-related Checks: In some languages, names (e.g. of procedures) must occur again at a different location (e.g. at the end of a procedure).
Tasks
Check if the type system of the language is satisfied.
Separate type checker is not always necessary.
token stream → parser → parse tree → type checker → syntax tree → intermediate code generator → intermediate representation

Type systems (Examples):
    “If both operands of the arithmetic operators of addition, subtraction and multiplication are of type integer, then the result is of type integer”
    “The result of the unary & operator is a pointer to the object referred to by the operand. If the type of the operand is ’...’, the type of the result is ’pointer of ...’.”
Type Expressions
A type expression is:
    1 a basic type: integer, boolean, char and real, as well as the special basic types type error and void
    2 a type name
    3 a composite type in the form of:
        1 Arrays. array(I, T); set of indexes I, type T
        2 Products. T1 × T2
        3 Records. record((N1 × T1) × . . . × (Nk × Tk)); names Ni, types Ti
        4 Pointers. pointer(T)
        5 Functions. T1 → T2
    4 a type variable
Types – Examples
type row = record
    address: integer;
    lexeme: array[1..15] of char
end;
var table: array[1..101] of row;

row can be represented as
    record((address × integer) × (lexeme × array(1..15, char))).
function f(a,b: char): ↑ integer;
is represented as: char × char → pointer(integer).
Graphical Representation of Types
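Type expression: char × char → pointer(integer)

as DAG (Directed Acyclic Graph), with a single char node shared by both children of ×:
→
    ×
        char
    pointer
        integer

or as a tree:
→
    ×
        char
        char
    pointer
        integer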
Typesystems
Set of Rules
specified using attributed grammars (or verbally)
Static vs. Dynamic Checking of Types
Sound Typesystem = static type checking is sufficient
Language is strongly typed = the compiler guarantees that an acceptedprogram runs without type errors.
But some checks can only occur dynamically
    table: array[0..255] of char;
    i: integer;
The correctness of the access table[i] cannot be checked by the compiler, because the value of i is only known at run time.
Error Recovery is important (even for type errors)
Type Checker Spec
Language Definition:
P → D ; E
D → D ; D | id : T
T → char | integer | array [ num ] of T | ↑ T
E → literal | num | id | E mod E | E [ E ] | E ↑

Example:
    key: integer ;
    key mod 1999

Type expression for array [256] of char:
    array(1 . . . 256, char)
1. Secure Type Info
Production                      Semantic Rule
P → D ; E
D → D ; D
D → id : T                      {addtype(id.entry, T.type)}
T → char                        {T.type := char}
T → integer                     {T.type := integer}
T → array [ num ] of T1         {T.type := array(1 . . . num.val, T1.type)}
T → ↑ T1                        {T.type := pointer(T1.type)}
2. Type Checking – Expressions
Production          Semantic Rule
E → literal         E.type := char
E → num             E.type := integer
E → id              E.type := lookup(id.entry)
E → E1 mod E2       E.type := if E1.type = integer and E2.type = integer then integer
                              else type error
E → E1 [ E2 ]       E.type := if E1.type = array(s, t) and E2.type = integer then t
                              else type error
E → E1 ↑            E.type := if E1.type = pointer(t) then t
                              else type error
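A Python sketch of these rules as a recursive function. The tuple encodings of expressions and type expressions, the dictionary used in place of lookup, and the choice to report undeclared identifiers as type_error are assumptions for illustration.

```python
TYPE_ERROR = "type_error"

def check(expr, symbols):
    """Return E.type for an expression tree; symbols maps names to types."""
    kind = expr[0]
    if kind == "literal":
        return "char"
    if kind == "num":
        return "integer"
    if kind == "id":
        return symbols.get(expr[1], TYPE_ERROR)      # lookup(id.entry)
    if kind == "mod":
        left, right = check(expr[1], symbols), check(expr[2], symbols)
        return "integer" if left == right == "integer" else TYPE_ERROR
    if kind == "index":                              # E1 [ E2 ]
        arr, idx = check(expr[1], symbols), check(expr[2], symbols)
        if isinstance(arr, tuple) and arr[0] == "array" and idx == "integer":
            return arr[2]                            # array(s, t) yields t
        return TYPE_ERROR
    if kind == "deref":                              # E1 ↑
        ptr = check(expr[1], symbols)
        if isinstance(ptr, tuple) and ptr[0] == "pointer":
            return ptr[1]                            # pointer(t) yields t
        return TYPE_ERROR
    raise ValueError(f"unknown expression kind: {kind}")

symbols = {"key": "integer",
           "buf": ("array", 256, "char"),
           "p": ("pointer", "integer")}
print(check(("mod", ("id", "key"), ("num", 1999)), symbols))
```

Returning type_error as an ordinary value lets the error propagate upward through enclosing expressions without any special handling.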
3. Type Checking – Statements
Production              Semantic Rule
S → id := E             S.type := if id.type = E.type then void
                                  else type error
S → if E then S1        S.type := if E.type = boolean then S1.type
                                  else type error
S → while E do S1       S.type := if E.type = boolean then S1.type
                                  else type error
S → S1 ; S2             S.type := if S1.type = void and S2.type = void then void
                                  else type error
4. Type Checking – Functions
Syntax extension:
    T → T '→' T         (definition)
    E → E ( E )         (function call)

Type extraction + type checking:
Production          Semantic Rule
T → T1 '→' T2       T.type := T1.type → T2.type
E → E1 ( E2 )       E.type := if E1.type = s → t and E2.type = s then t
                              else type error

Example:
    root : ((real → real) × real) → real
Type Equivalence
When are types equivalent?
    structural equivalence
    name equivalence
Structural Equivalence
function sequiv(s, t) : boolean;
begin
    if s and t are the same basic type then
        return true
    else if s = array(s1, s2) and t = array(t1, t2) then
        return s1 = t1 and sequiv(s2, t2)
    else if s = s1 × s2 and t = t1 × t2 then
        return sequiv(s1, t1) and sequiv(s2, t2)
    else if s = pointer(s1) and t = pointer(t1) then
        return sequiv(s1, t1)
    else if s = s1 → s2 and t = t1 → t2 then
        return sequiv(s1, t1) and sequiv(s2, t2)
    else return false
end
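The same test as a Python sketch. Type expressions are encoded as nested tuples such as ("array", index_set, element_type); this encoding and the constructor names "product" and "function" are assumptions for illustration.

```python
BASIC = {"integer", "char", "boolean", "real"}

def sequiv(s, t):
    """Structural equivalence of two type expressions."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t and s in BASIC          # same basic type
    if s[0] != t[0]:
        return False                          # different constructors
    if s[0] == "array":
        return s[1] == t[1] and sequiv(s[2], t[2])
    if s[0] in ("product", "function"):
        return sequiv(s[1], t[1]) and sequiv(s[2], t[2])
    if s[0] == "pointer":
        return sequiv(s[1], t[1])
    return False

f1 = ("function", ("product", "char", "char"), ("pointer", "integer"))
f2 = ("function", ("product", "char", "char"), ("pointer", "integer"))
print(sequiv(f1, f2))
```

Note that this recursion does not terminate on cyclic type graphs; handling recursive types needs extra bookkeeping that the sketch leaves out.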
Encoding of Type Expressions
Expression as bit vector (efficient storage and comparison)
Example:

Type Constructor    Encoding
pointer             01
array               10
freturns            11

Basic Type          Encoding
boolean             0000
char                0001
integer             0010
real                0011

Type expression                     Encoding
char                                00 00 00 0001
freturns(char)                      00 00 11 0001
pointer(freturns(char))             00 01 11 0001
array(pointer(freturns(char)))      10 01 11 0001
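A Python sketch of this encoding, assuming three 2-bit constructor fields followed by a 4-bit basic type as in the table. The function name encode and its argument convention (constructors listed from outermost to innermost) are assumptions for illustration.

```python
CONSTRUCTOR = {"pointer": 0b01, "array": 0b10, "freturns": 0b11}
BASIC = {"boolean": 0b0000, "char": 0b0001, "integer": 0b0010, "real": 0b0011}

def encode(constructors, basic):
    """Encode a chain of unary constructors applied to a basic type.

    constructors lists the constructors from outermost to innermost;
    at most three fit into the 6-bit constructor field.
    """
    if len(constructors) > 3:
        raise ValueError("only three constructor fields are available")
    field = 0
    for name in constructors:
        field = (field << 2) | CONSTRUCTOR[name]
    return (field << 4) | BASIC[basic]

code = encode(["array", "pointer", "freturns"], "char")
print(format(code, "010b"))
```

Two type expressions built this way are structurally equivalent exactly when their encodings are equal integers, which makes the comparison a single machine operation.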
Name vs. structural Equivalence
Example (Pascal program):
type link = ↑ cell;
var next : link;
    last : link;
    p : ↑ cell;
    q,r : ↑ cell;

Do all variables have the same type?
Depends on the type system (and, in Pascal, on the compiler!)
An implementation of the above example creates implicit type names (e.g. type np = ↑ cell for variable p).
Cyclic Typedefinition (Example)
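type link = ↑ cell;
     cell = record
         info: integer;
         next: link;
     end;

[Figure: two type graphs for cell = record with the fields info (integer) and next (pointer). In the first graph the pointer node refers to the type name cell; in the second graph the name is not used and the pointer edge leads back to the record node, forming a cycle.]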
Type conversion / Coercions
Statement of the problem: x+i with x as a real- and i as an integervariable.
There exist only operators for (real + real) or (int + int)
Type conversion necessary! x = int2real(i)
Implicit (by the compiler) or explicit (by the programmer) possible
Implicit = Coercion
Loss of information should be prevented (int→ real but not real→ int).
Performance!!!
    for I := 1 to N do X[I] := int2real(1) (Pascal; X is an array of reals) needs 48.4 µs
    for I := 1 to N do X[I] := 1.0 needs only 5.4 µs
Type conversion - Semantic Rules (1)
Production      Semantic Rule
E → id          E.type := lookup(id.entry)
                E.txt := id.entry
E → E1 op E2    E.type := if E1.type = integer and E2.type = integer then integer
                          else if E1.type = integer and E2.type = real then real
                          else if E1.type = real and E2.type = integer then real
                          else if E1.type = real and E2.type = real then real
                          else type error
Type conversion - Semantic Rules (2)
Production      Semantic Rule
E → E1 op E2    E.txt := if E1.type = integer and E2.type = integer then E1.txt ◦ E2.txt
                         else if E1.type = integer and E2.type = real then int2real(E1.txt) ◦ E2.txt
                         else if E1.type = real and E2.type = integer then E1.txt ◦ int2real(E2.txt)
                         else if E1.type = real and E2.type = real then E1.txt ◦ E2.txt
                         else type error
E → num         E.type := integer
                E.txt := val
E → num.num     E.type := real
                E.txt := val
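A Python sketch that computes E.type and E.txt together for E → E1 op E2. Representing an expression as a (type, txt) pair and writing the operator into the generated text are assumptions for illustration; the slides only show the concatenation ◦.

```python
def coerce(e1, op, e2):
    """e1 and e2 are (type, txt) pairs; return the pair for E1 op E2."""
    (t1, x1), (t2, x2) = e1, e2
    if t1 == "integer" and t2 == "integer":
        return ("integer", f"{x1} {op} {x2}")
    if t1 == "integer" and t2 == "real":
        return ("real", f"int2real({x1}) {op} {x2}")   # convert left operand
    if t1 == "real" and t2 == "integer":
        return ("real", f"{x1} {op} int2real({x2})")   # convert right operand
    if t1 == "real" and t2 == "real":
        return ("real", f"{x1} {op} {x2}")
    return ("type_error", "")

print(coerce(("real", "x"), "+", ("integer", "i")))
```

The conversion is only ever inserted from integer to real, which matches the requirement that no information should be lost.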
Overloading
Symbols with different meanings (dependent on the application context)
    Mathematics: + operator (integers, reals, complex numbers)
    ADA: () expression for array access AND function calls
Overloading is resolved when the meaning is clear (operator identification)
Overloading can often be resolved by the types of the operands.
Overloading - Possible Types
Example (ADA):
    function "*"(i,j: integer) return complex;
    function "*"(i,j: complex) return complex;
Possible types for * are:
    integer × integer → integer
    integer × integer → complex
    complex × complex → complex
Assumption: 2, 3, 5 are integers
    3*5 is either integer or complex.
    In 2*(3*5), the subexpression 3*5 must be of type integer, because * accepts either two integers or two complex numbers.
    (3*5)*z is of type complex if z is of type complex.
Handling Overloading
Instead of a single type, the set of all possible types must be stored in an attribute.
Attribute types!

Production      Semantic Rule
E′ → E          E′.types = E.types
E → id          E.types = {lookup(id.entry)}
E → E1 ( E2 )   E.types = {t | ∃s ∈ E2.types ∧ s → t ∈ E1.types}

Example: 3*5 (i = integer, c = complex)
E: {i, c}
    E: {i}
        3: {i}
    *: {i × i → i, i × i → c, c × c → c}
    E: {i}
        5: {i}
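The rule for E → E1 ( E2 ) is a set comprehension and can be written almost literally in Python. In the sketch below a function type s → t is encoded as the pair (s, t) and an argument pair as a tuple; the names apply_types and product are assumptions for illustration.

```python
def apply_types(fun_types, arg_types):
    """E.types = { t | there is an s in E2.types with s -> t in E1.types }."""
    return {t for (s, t) in fun_types if s in arg_types}

def product(a_types, b_types):
    """All possible types of an argument pair."""
    return {(a, b) for a in a_types for b in b_types}

I, C = "integer", "complex"
star = {((I, I), I), ((I, I), C), ((C, C), C)}    # possible types of *

types_3_5 = apply_types(star, product({I}, {I}))        # 3*5
types_z = apply_types(star, product(types_3_5, {C}))    # (3*5)*z with z complex
print(types_3_5, types_z)
```

The set for 3*5 keeps both candidates; the ambiguity only disappears once the surrounding context narrows the choice, as in (3*5)*z.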
Uniqueness of Types
Expressions may only have one type (otherwise type error)

Production      Semantic Rule
E′ → E          E′.types := E.types
                E.unique := if E′.types = {t} then t else type error
E → id          E.types := {lookup(id.entry)}
E → E1 ( E2 )   E.types := {s′ | ∃s ∈ E2.types ∧ (s → s′) ∈ E1.types}
                t := E.unique
                S := {s | s ∈ E2.types ∧ s → t ∈ E1.types}
                E2.unique := if S = {s} then s else type error
                E1.unique := if S = {s} then s → t else type error
Polymorphic Functions
Polymorphic function = function whose argument may have an arbitrary type
Polymorphic refers to functions and operators
Examples: Built-in operators for array access, pointer manipulation
Reason for polymorphism:
    Code can be used for various data structures
    Example: finding the length of lists (e.g. ML)
        fun length(lptr) =
            if null(lptr) then 0
            else length(tl(lptr)) + 1
        length([sun,mon,tue]), length([1,2,3,4])
    not possible in PASCAL!
Type Variables
Variables that allow us to talk about unknown types
Note: Type variables are written as Greek letters α, β, . . .
Type inference = problem of determining the type of an expression from the way it is used.
Example:
type link = ↑ cell;
procedure mlist(lptr : link; procedure p);
begin
    while lptr <> nil do begin
        p(lptr);
        lptr := lptr↑.next
    end
end;

mlist: link × procedure → void
p: link → void
Example - Type Inference
Program:
function deref(p);
begin
    return p↑
end;

Derivation:
    1 Type of p is β (assumption)
    2 From p↑ it follows that p must be a pointer. Therefore it holds: β = pointer(α).
    3 Furthermore, we know that the type of p↑ must be α.
    4 Therefore, it follows: ∀α : pointer(α) → α is the type of the function deref.
Language for Polymorphism
A type expression of the form ∀α.E(α) denotes a “polymorphic type”.
Language definition:
P → D ; E
D → D ; D | id : Q
Q → ∀ type variable . Q | T
T → T '→' T | T × T | ( T )
    | unary constructor ( T ) | basic type | type variable
E → E ( E ) | E , E | id
Example
deref : ∀α.pointer(α) → α ;
q : pointer(pointer(integer)) ;
deref(deref(q))

Labeled syntax tree for deref(deref(q)):
apply : α0
    deref0 : pointer(α0) → α0
    apply : αi
        derefi : pointer(αi) → αi
        q : pointer(pointer(integer))
Differences in Type Handling
In contrast to the former type handling (without polymorphism):
    1 Arguments of polymorphic functions in an expression may have different types.
    2 The concept of type equivalence is different. pointer(α) = pointer(pointer(integer)) ???
    3 Calculated types must be used consistently afterwards. The effect of the unification of two expressions must be preserved. If α is assigned the type t and α is referenced elsewhere, t must be used!
Terms: Substitution, Instances, Unification
Substitution, Instances
Substitution = function that maps type variables to type expressions. S : type variables ↦ type expressions
Example: α ↦ pointer(integer)
Application of a substitution:
function subst(t : type expression) : type expression
begin
    if t is a basic type then return t
    else if t is a variable then return S(t)
    else if t is t1 → t2 then return subst(t1) → subst(t2)
end
S(t) . . . instance of t. We write s < t ⇔ s is an instance of t.
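A Python sketch of applying a substitution. Type expressions are encoded as strings (basic types and variables) or tuples of the form (constructor, arguments...), and the substitution S is a dictionary; both encodings are assumptions for illustration. The sketch also recurses into constructors other than →, which the pseudocode leaves out.

```python
def subst(t, S):
    """Apply substitution S (dict: variable -> type expression) to t."""
    if isinstance(t, tuple):
        # constructor with arguments, e.g. ("->", t1, t2) or ("pointer", t1)
        return (t[0],) + tuple(subst(arg, S) for arg in t[1:])
    # a variable is replaced if S maps it; basic types stay unchanged
    return S.get(t, t)

deref_type = ("->", ("pointer", "α"), "α")
instance = subst(deref_type, {"α": "integer"})
print(instance)
```

Because every occurrence of a variable is looked up in the same dictionary, the replacement is consistent by construction.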
Examples
Instances:
    pointer(integer) < pointer(α)
    pointer(real) < pointer(α)
    integer → integer < α → α
    pointer(α) < β
    α < β
No instances (the left type is not an instance of the right type):
    integer             real        substitution on basic types is not possible
    integer → real      α → α       inconsistent replacement of α
    integer → α         α → α       all occurrences must be replaced
Unification
2 types t1, t2 are unifiable if there exists a substitution S such that S(t1) = S(t2) holds.
In practice, we are interested in the Most General Unifier (MGU):
    1 S(t1) = S(t2)
    2 Every substitution S′ with S′(t1) = S′(t2) must be an instance of S.
Checking Polymorphic Functions
2 functions:
    1 fresh(t) replaces all variables in the type expression t with new variables. A pointer to the node representing the new expression is returned.
    2 unify(m, n) unifies the two expressions m and n. As a side effect the substitution is performed.
Translation scheme:
Production      Semantic Rule
E → E1 ( E2 )   p := mkleaf(newtypevar);
                unify(E1.type, mknode('→', E2.type, p));
                E.type := p
E → E1 , E2     E.type := mknode('×', E1.type, E2.type)
E → id          E.type := fresh(id.type)
Example Type Checking
Labeled syntax tree for deref(deref(q)):
apply : α0
    deref0 : pointer(α0) → α0
    apply : αi
        derefi : pointer(αi) → αi
        q : pointer(pointer(integer))
pointer(pointer(integer)) → β is the type expression that is unified with the type of derefi

Summary (bottom-up type detection):
Expression : Type                           Substitution
q : pointer(pointer(integer))
derefi : pointer(αi) → αi
derefi(q) : pointer(integer)                αi = pointer(integer)
deref0 : pointer(α0) → α0
deref0(derefi(q)) : integer                 α0 = integer
F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2015 184 / 264
Unification Algorithm
Input. A graph and a pair of nodes m and n, which should be unified.
Output. True if the nodes can be unified, False otherwise.
Method. A node is represented by the record [constructor, left, right, set], where set is the set of equivalent nodes. One node of each set is chosen as the representative of this set. In the beginning, each set contains only the node itself.
find(n) returns the representative node of the set that contains n.
union(m, n) merges the equivalence sets of m and n. The new representative is a node that does not correspond to a variable. If there is no such node, one of the former representatives is chosen as the new one.
Algorithm - Pseudocode
function unify(m, n : node) : boolean
begin
    s := find(m);
    t := find(n);
    if s = t then return true
    else if s and t are nodes that represent the same basic type then return true
    else if s is an op-node with children s1, s2 and
            t is an op-node with children t1, t2 then begin
        union(s, t);
        return unify(s1, t1) and unify(s2, t2)
    end
    else if s or t represents a variable then begin
        union(s, t);
        return true
    end
    else return false
end
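The pseudocode can be tried out with a small Python sketch. The Node class and its field names are assumptions made for this illustration; unlike the pseudocode, the sketch explicitly checks that two op-nodes carry the same constructor, and it links each node directly to a representative instead of storing a set. The graph built at the end encodes the two type expressions ((α1 → α2) × list(α3)) → list(α2) and ((α3 → α4) × list(α3)) → α5.

```python
class Node:
    def __init__(self, label, children=(), is_var=False):
        self.label = label              # constructor, basic type or variable name
        self.children = list(children)
        self.is_var = is_var
        self.rep = self                 # representative link; initially the node itself


def find(n):
    while n.rep is not n:
        n = n.rep
    return n


def union(m, n):
    s, t = find(m), find(n)
    if s.is_var and not t.is_var:
        s.rep = t                       # prefer a non-variable representative
    else:
        t.rep = s


def unify(m, n):
    s, t = find(m), find(n)
    if s is t:
        return True
    if not (s.children or t.children or s.is_var or t.is_var):
        return s.label == t.label       # two basic types
    if s.children and t.children:
        if s.label != t.label or len(s.children) != len(t.children):
            return False
        union(s, t)
        return all(unify(a, b) for a, b in zip(s.children, t.children))
    if s.is_var or t.is_var:
        union(s, t)
        return True
    return False


a1, a2, a3, a4, a5 = (Node(f"α{i}", is_var=True) for i in range(1, 6))
e1 = Node("→", [Node("×", [Node("→", [a1, a2]), Node("list", [a3])]),
                Node("list", [a2])])
e2 = Node("→", [Node("×", [Node("→", [a3, a4]), Node("list", [a3])]), a5])
result = unify(e1, e2)
```

After the call, α1 and α3 share a representative, as do α2 and α4, and the representative of α5 is the list node.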
Example – Unification
Type expressions:
((α1 → α2) × list(α3)) → list(α2)
((α3 → α4) × list(α3)) → α5
[Figure: graph of the two type expressions with numbered nodes 1: →, 2: ×, 3: →, 4: α1, 5: α2, 6: list, 7: α3, 8: list, 9: →, 10: ×, 11: →, 12: α4, 13: list, 14: α5]
Question: unify(1, 9) = ?
PART 6 - RUN-TIME ENVIRONMENT
Objectives/Tasks
Relate static source code to actions at program run time.
Names in the source code relate to (not necessarily the same) data objects on the target machine.
Allocation and deallocation of data objects need to be managed (run-time support packages).
Procedure activation
Store data objects according to their data type.
Definitions
Procedure definition = name + body
Procedures with return value = functions
Program = procedure (e.g. main)
Occurrence of a procedure name at a location in the code = procedure call
Variables (identifiers) in a procedure definition = formal parameters
Arguments of a procedure call = actual parameters (they replace the formal parameters at the call)
Example Program
program sort(input, output);
var a: array[0..10] of integer;
procedure readarray;
    var i: integer;
    begin .... end;
function partition(y, z: integer) : integer;
    var i, j, x, v : integer;
    begin .... end;
procedure quicksort(m, n : integer);
    var i: integer;
    begin
        if (n > m) then begin
            i := partition(m, n); quicksort(m, i-1); quicksort(i+1, n);
        end
    end;
begin
    a[0] := -9999; a[10] := 9999;
    readarray; quicksort(1, 9);
end.
Activation Trees
Assumptions:
1 Sequential control flow
2 Procedure activation starts at the beginning of the body. After the procedure has finished, the statement following the procedure call is executed.
Activation = execution of the body
Lifetime of an activation = sequence of steps (time) from the first to the last step of the procedure execution.
Recursive procedure calls are possible (and need not be direct): P → Q → . . . → P
Activation Trees / Definition
1 each node represents a procedure activation
2 the root node represents the activation of the main program
3 node a is the parent of node b ⇔ control flows from a to b: b is called (activated) in a
4 node a is to the left of node b ⇔ the lifetime of a occurs before the lifetime of b
Example Activation Tree
s
    r
    q(1,9)
        p(1,9)
        q(1,3)
            p(1,3)
            q(1,0)
            q(2,3)
                p(2,3)
                q(2,1)
                q(3,3)
        q(5,9)
            p(5,9)
            q(5,5)
            q(7,9)
                p(7,9)
                q(7,7)
                q(9,9)
Control Stack
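Control flow = depth-first traversal (left to right) of the activation tree
Control stack = stack that holds all procedure activations that are currently alive
Example: when control enters q(2,3), the control stack contains s, q(1,9), q(1,3), q(2,3); the activations r, p(1,9), p(1,3) and q(1,0) have already terminated.
[Figure: activation tree with nodes s, r, q(1,9), p(1,9), q(1,3), p(1,3), q(1,0), q(2,3)]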
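The relation between activation tree and control stack can be reproduced with a small Python trace of the sort program. The PIVOT table is a hypothetical stand-in for partition: its values are chosen so that the calls match the activations named in the example, whereas real values depend on the array contents.

```python
# Hypothetical results of partition(m, n); real values depend on the input array.
PIVOT = {(1, 9): 4, (1, 3): 1, (2, 3): 2, (5, 9): 6, (7, 9): 8}

trace = []        # (depth, activation) in the order in which activations begin
stack = []        # control stack: activations that are currently alive
deepest = []      # copy of the deepest control stack observed


def enter(name):
    stack.append(name)
    trace.append((len(stack) - 1, name))
    if len(stack) > len(deepest):
        deepest[:] = stack


def leave():
    stack.pop()


def readarray():
    enter("r")
    leave()


def partition(m, n):
    enter(f"p({m},{n})")
    leave()
    return PIVOT[(m, n)]


def quicksort(m, n):
    enter(f"q({m},{n})")
    if n > m:
        i = partition(m, n)
        quicksort(m, i - 1)
        quicksort(i + 1, n)
    leave()


def sort():
    enter("s")
    readarray()
    quicksort(1, 9)
    leave()


sort()
for depth, name in trace:
    print("    " * depth + name)   # indented activation tree
```

Each push corresponds to entering a node of the activation tree and each pop to leaving it, so the stack always holds the path from the root to the current activation.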
Declarations / Scopes
Explicit or implicit: var i : integer;
the scope of a variable is given by the scope rules
variables can be used within their scope
global vs. local variables
variables with the same name may denote different objects (depending on their scope)
order of variable lookup (first the local, then the global variable if the names are identical, ...)
Name Binding
data object: storage location, which can store a value
A variable (variable name) can reference different data objects duringruntime.
Programming language semantics:
Environment: function that maps names to storage locations
State: function that maps storage locations to values
Environment and state are different!
Example: pi is associated with the address 100, which stores the value 0. After pi := 3.14 the storage location 100 holds the value 3.14; pi, however, still refers to location 100.
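The pi example can be written down directly as two mappings. This is only an illustrative sketch; the dictionary and function names are assumptions.

```python
environment = {"pi": 100}     # name -> storage location
state = {100: 0}              # storage location -> value


def assign(name, value):
    # An assignment changes the state, not the environment.
    state[environment[name]] = value


def lookup(name):
    return state[environment[name]]


assign("pi", 3.14)
```

After the assignment, environment["pi"] is still 100, while the value stored at location 100 has changed to 3.14.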
Binding
A variable x is bound to storage location s if the storage location is associated with x.
A location does not always have to be a (real) storage location (in the main memory of the computer); e.g. complex data types
Correlation between STATIC and DYNAMIC notions:
STATIC NOTION                  DYNAMIC COUNTERPART
definition of a procedure      activation of a procedure
declaration of a name          bindings of the name
scope of a declaration         lifetime of a binding
Important Questions
. . . regarding organization of memory management and name binding.
1 Are there recursive procedures?
2 What happens to the values of local variables after the procedure execution has finished?
3 Can a procedure reference non-local variables?
4 How are parameters passed to a procedure?
5 Can procedures be passed as parameters?
6 Can procedures be returned as a return value?
7 Is there dynamic memory allocation?
8 Does memory have to be deallocated explicitly, or does this happen implicitly (garbage collection)?
Memory management
Run-time memory for:
1 generated target code
2 data objects
3 control stack for procedure activation
typical layout:
Code
Static Data
Stack
↓ (free memory; stack and heap grow towards each other) ↑
Heap
Activation Record
. . . for information storage at a procedure call:
1 temporary values (evaluation of expressions, ...)
2 local data (local variables, ...)
3 machine status to save (program counter, registers, ...)
4 access link (link to non-local data)
5 control link (link to the activation record of the calling procedure)
6 actual parameters (usually passed in registers)
7 return value of the called procedure
Compile-Time Layout
Memory = blocks of bytes, byte = smallest addressable unit
usually: 1 byte = 8 bits; n bytes = word
The memory needed for a variable (or parameter) depends on its type.
Example: basic types (int, real, boolean, ...) = n bytes
The storage layout depends on the addressing constraints of the target machine.
Example:
Alignment: integers may only reside at certain addresses (e.g. addresses that are divisible by 4).
Padding: 10 characters are needed to store a string, but 12 bytes must be allocated.
Arrays and records are written to a memory range of sufficient size.
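Alignment and padding can be illustrated with a small layout computation. The function names and the (name, size, alignment) encoding of fields are assumptions made for this sketch.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(fields):
    """fields: list of (name, size, alignment). Returns ({name: offset}, total size)."""
    offsets, offset = {}, 0
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)   # insert padding if necessary
        offsets[name] = offset
        offset += size
    return offsets, offset


# A 10-character string followed by a 4-byte integer aligned to 4 bytes.
offsets, total = layout([("s", 10, 1), ("n", 4, 4)])
```

The string occupies bytes 0 to 9, but the integer can only start at address 12, so 12 bytes are effectively reserved for the string.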
Memory Allocation Strategies
1 static allocation (at compile time)
2 stack allocation
3 heap allocation
Static Allocation
The memory mapping is determined at compile time.
Local values remain stored even after procedure termination.
No run-time support package is necessary.
Limitations:
1 The size of the data structures must be known at compile time.
2 Recursive procedures are only possible with restrictions (all recursive calls share the same memory!).
3 Data structures cannot be created dynamically.
Stack Allocation
Idea: control stack
procedure is activated → activation record is pushed onto the stack
procedure activation terminates → activation record is popped from the stack
local values are deleted after termination of the activation
Example
Activation tree                    Activation records on the stack         Remarks
s                                  s (a : array)                           Frame for s
s with child r                     s (a : array), r (i : integer)          r is activated
s with children r, q(1,9)          s (a : array), q(1,9) (i : integer)     Frame for r has been popped and q(1,9) pushed
Example...
Activation tree: s with children r, q(1,9); q(1,9) with children p(1,9), q(1,3); q(1,3) with children p(1,3), q(1,0)
Activation records on the stack: s (a : array), q(1,9) (i : integer), q(1,3) (i : integer)
Remarks: Control has just returned to q(1,3)
Calling/Return Sequences
Calling sequence: allocates the activation record and fills in its fields
Return sequence: restores the machine state
Calling sequences and activation records can differ, even between implementations of the same language.
A possible calling sequence:
1 The caller evaluates the actual parameters.
2 The return address and the old stack top are stored in the activation record of the called procedure (callee).
3 The callee saves the register values and other status information.
4 The callee initializes its local data and starts execution.
Sequences...
possible return sequence:
1 the callee stores the return value
2 the stack, register information, ... are restored
3 the caller copies the return value into its own activation record
task division:
[Figure: the caller's and the callee's activation records on the stack. Each record consists of parameters and return value, control link, links and saved status, and temporaries and local data. Brackets mark which parts are the caller's responsibility and which are the callee's responsibility.]
Data with variable length
the data is not stored directly in the activation record
instead, a pointer to the data is stored
Dangling References
Reference to memory is used but memory has already been deallocated.
logical programming error
cause of mysterious bugs
Example:
main() {
    int *p;
    p = dangle();
}

int *dangle() {
    int i = 23;
    return &i;    /* i is deallocated when dangle returns */
}
Heap Allocation
Necessary if:
1 the value of a local variable needs to be retained (after the activation ends)
2 the activation of the called procedure outlives the activation of the calling procedure
Memory is allocated and deallocated on request
Memory management is necessary:
1 linked list for storing free blocks
2 free sections should be filled in an optimal way
Access to nonlocal names
lexical or static scope rules (the declaration that a name refers to is determined at compile time)
static scope with most closely nested scopes
dynamic scope rules (the declaration that a name refers to is determined at run time; activations are considered)
Blocks
most closely nested:
1 The scope of a declaration in block B contains B.
2 If the name x is not declared in B, an occurrence of x in B is in the scope of a declaration of x in a surrounding block B′ such that:
    1 x is declared in B′.
    2 B′ is the closest surrounding block of B that declares x.
Example – Scopes
main() {
    int a = 0;                /* B0 */
    int b = 0;
    {
        int b = 1;            /* B1 */
        {
            int a = 2;        /* B2 */
        }
        {
            int b = 3;        /* B3 */
        }
    }
}
Memory Handling
1 Memory is provided via a stack. When a block is entered, memory for its local names is allocated. This memory is deallocated when the block terminates.
2 Alternatively, the memory for all blocks of a procedure can be provided at the same time. The layout can be determined at compile time (unless there is data of variable length).
Global Data
memory space can be allocated statically
procedures can be passed as parameter (C: pointer)
Example:
program pass(input, output);
var m : integer;
function f(n: integer) : integer;
    begin f := m + n end { f };
function g(n: integer) : integer;
    begin g := m * n end { g };
procedure b(function h(n: integer) : integer);
    begin write(h(2)) end { b };
begin
    m := 0;
    b(f); b(g); writeln
end.
Nested Procedures
procedure definitions inside procedures
Example:
program sort(input, output);
    . . .
    procedure exchange(i, j: integer);
    . . .
    procedure quicksort(m, n: integer);
        var k, v: integer;
        begin
            . . .
            exchange(i, j);
            . . .
Nesting Depth, Access Lists
Nesting depth: depth of the nesting of procedures/blocks . . . (program: 1, procedure in program: 2, . . . )
Access list: implementation of the access to nested procedures
access link: field in the activation record of a procedure
If P is declared in Q, the access link of P points to the access link of Q (in the most recent activation of Q)
Example – Access Lists
[Figure: control stacks with access links at successive points during the execution of sort. Activation records and their local data: s (a, x), q(1,9) (k, v), q(1,3) (k, v), p(1,3) (i, j), e(1,3); the records are connected by access links.]
Search for non-local names
Procedure P has nesting depth nP and accesses a non-local name a with nesting depth na ≤ nP.
1 P is being executed. The activation record of P is located at the top of the stack. Follow nP − na access links.
2 This leads to the activation record that contains a. An offset gives the actual position of a within that record.
(nP − na, offset) defines the address of a. This pair can be computed at compile time.
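The two steps can be sketched in Python. The Frame class, the slot contents and the concrete nesting depths are assumptions made for this illustration: s has depth 1, quicksort depth 2, and partition is assumed to be declared inside quicksort (depth 3).

```python
class Frame:
    """Activation record with an access link (sketch)."""

    def __init__(self, name, depth, access_link, slots):
        self.name = name
        self.depth = depth                # nesting depth of the procedure
        self.access_link = access_link    # frame of the enclosing procedure, or None
        self.slots = slots                # local data, indexed by offset


def compile_address(n_p, n_a, offset):
    """Compile-time address of a non-local name: (number of hops, offset)."""
    return (n_p - n_a, offset)


def load(frame, address):
    """Run-time access: follow the access links, then index with the offset."""
    hops, offset = address
    for _ in range(hops):
        frame = frame.access_link
    return frame.slots[offset]


s = Frame("s", 1, None, ["array a", "x"])
q19 = Frame("q(1,9)", 2, s, ["k", "v"])
q13 = Frame("q(1,3)", 2, s, ["k", "v"])
p13 = Frame("p(1,3)", 3, q13, ["i", "j"])

addr_a = compile_address(n_p=3, n_a=1, offset=0)   # a is declared in s
```

Only the pair (hops, offset) is needed at run time; no name lookup takes place while the program executes.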
Procedure Access (nested)
P calls procedure X.
1 nP < nX: X is nested more deeply than P. Therefore, X must be declared in P (otherwise X cannot be accessed from P). The access link of X points to P.
2 nP ≥ nX: Follow nP − nX + 1 access links starting from P. The activation record reached belongs to the procedure that encloses both P and X.
Procedure Parameter
passing a procedure as a parameter
not allowed in all languages
handling of links (similar to the method already described):
assume that c calls b and passes f as a parameter
an access link for f is computed in c
this link is used as the access link when f is actually called.
Dynamic Scope
A new activation uses the existing bindings of non-local names. A non-local name a in the called activation references the same memory as in the calling activation. New bindings are created for the local names of the called procedure.
The semantics of static scope and dynamic scope are different!
Example – Dynamic Scope (1)
program dynamic(input, output);
var r: real;
procedure show;
    begin write(r : 5:3) end;
procedure small;
    var r: real;
    begin r := 0.125; show end;
begin
    r := 0.25;
    show; small; writeln;
    show; small; writeln
end.
Example – Dynamic Scope (2)
[Figure: activation records of dynamic, small and show]
Example – Dynamic Scope (3)
Example:
Static scope: Output = 0.250 0.250 nl 0.250 0.250 nl
Dynamic scope: Output = 0.250 0.125 nl 0.250 0.125 nl
Deep access: Control links are used as access links. Search the stack (from top to bottom) for the first activation record that contains an entry for the non-local name.
Shallow access: The current value of each name is kept in a (statically) allocated location. When P is activated, a local name n of P takes over that location. The old value of the location is saved in the activation record of P and can therefore be restored.
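The two outputs can be reproduced with a small simulation of the program dynamic. The sketch models activation records as dictionaries and implements dynamic scope via deep access; all names are assumptions made for this illustration.

```python
def run(dynamic_scope):
    out = []
    globals_ = {"r": 0.25}           # r := 0.25 in the main program
    stack = [globals_]               # control stack of activation records

    def lookup(name):
        if dynamic_scope:
            # Deep access: search the stack from top to bottom.
            for frame in reversed(stack):
                if name in frame:
                    return frame[name]
            raise NameError(name)
        # Static scope: show is declared at the outermost level,
        # so r in show always denotes the global r.
        return globals_[name]

    def show():
        stack.append({})             # show has no local names
        out.append(f"{lookup('r'):5.3f}")
        stack.pop()

    def small():
        stack.append({"r": 0.125})   # local r of small
        show()
        stack.pop()

    for _ in range(2):
        show()
        small()
        out.append("nl")
    return " ".join(out)


static_output = run(dynamic_scope=False)
dynamic_output = run(dynamic_scope=True)
```

Under dynamic scope, the call of show from small finds the local r of small first, which explains the value 0.125.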
Parameter Passing
How are parameters passed at a call?
procedure exchange(i, j: integer);
    var x: integer;
    begin
        x := a[i]; a[i] := a[j]; a[j] := x
    end
Call-by-Value
Call-by-Reference
Copy-Restore
Call-by-Name
Call-by-Value
Formal parameters are treated as local names.
The caller evaluates the actual parameters and passes their values to the associated formal parameters.
Pointers can also be passed as values.
Call-by-Reference
Instead of the value (as in call-by-value), a pointer to the memory location of the actual parameter is passed.
var parameters in PASCAL are references.
Arrays are usually passed by reference.
Copy-Restore
Hybrid between call-by-value and call-by-reference.
Method:
1 Before the procedure is executed, the actual parameters are evaluated. The R-values (values) are passed to the corresponding formal parameters (as in call-by-value). Additionally, the L-values (locations) of the actual parameters are computed.
2 After the procedure terminates, the current R-values of the formal parameters are copied back to the L-values of the actual parameters (if available).
Call-by-Name
Procedures are treated as macros, i.e. the procedure call is replaced by the body of the procedure; all formal parameters in the body are replaced by the actual parameters (macro expansion).
The local names of the called procedure must differ from the names of the calling procedure (variable renaming is sometimes necessary).
The actual parameters are put in parentheses to avoid problems.
Problem: with the body temp := x; x := y; y := temp, the call swap(i, a[i]) is expanded to:
temp := i; i := a[i]; a[i] := temp
Instead of a[i] := i, the last statement effectively performs a[a[i]] := i (in terms of the original values), because i has already been changed.
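The swap problem can be reproduced in Python by passing each actual parameter as a pair of getter and setter functions, which are re-evaluated on every use. This thunk encoding is an assumption of the sketch (it mimics the effect of macro expansion, it is not the expansion itself), and the initial values are chosen only for illustration.

```python
def swap_by_name(get_x, set_x, get_y, set_y):
    # Body of swap: temp := x; x := y; y := temp
    temp = get_x()
    set_x(get_y())
    set_y(temp)        # y is re-evaluated here, after x has changed


env = {"i": 1, "a": [0, 2, 5]}

# swap(i, a[i]): the second parameter depends on the current value of i.
swap_by_name(
    lambda: env["i"],
    lambda v: env.__setitem__("i", v),
    lambda: env["a"][env["i"]],
    lambda v: env["a"].__setitem__(env["i"], v),
)
```

With i = 1 and a[1] = 2 initially, i becomes 2 as intended, but the old value of i is written to a[2] instead of a[1].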
Symbol Table
entries correspond to the declarations of names
stores binding and scope information
stores storage allocation information (that is needed at run time)
storage of the symbol table in a list
storage of the symbol table in a hash table
PART 7 - INTERMEDIATE CODE GENERATION
Objectives/Tasks
provide a target-machine-independent format
advantages:
easy adaptation to different target machines
machine-independent code optimization can be realized
attributed grammars can be used
Languages
graphical representation (syntax tree)
Three-address code: x := y op z
x, y, z are constants (numbers), names (variables), or temporary variables
op is an operator
Three-Address Code Statements
1 assignments x := y op z
2 assignments with unary operator x := op y
3 copy statements x := y
4 unconditional jumps goto L
5 conditional jumps if x relop y goto L
6 param x and call p, n: call procedure p with n parameters; return y, where y is optional.
7 assignments with indices: x := y[i] and x[i] := y.
8 address and pointer assignments: x := &y, x := *y and *x := y
Generating the Three-Address Code (simplified)
S-attributed grammar
S.code: the three-address code of the statement
E.place: the name that holds the value of the non-terminal E
E.code: the sequence of three-address statements that evaluate E
Assignments
Production         Semantic rules
S → id := E        S.code := E.code || gen(id.place ':=' E.place)
E → E1 + E2        E.place := newtemp;
                   E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
E → E1 * E2        E.place := newtemp;
                   E.code := E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)
E → - E1           E.place := newtemp;
                   E.code := E1.code || gen(E.place ':=' 'uminus' E1.place)
E → ( E1 )         E.place := E1.place; E.code := E1.code
E → id             E.place := id.place; E.code := ''
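The semantic rules can be mirrored by a recursive Python function that returns the pair (place, code) for an expression tree. The tuple encoding of the tree and the function names are assumptions of this sketch; temporaries are numbered after the operands have been translated, so the numbering follows the order in which statements are generated.

```python
import itertools

_temps = itertools.count(1)


def newtemp():
    return f"t{next(_temps)}"


def gen_expr(e):
    """Return (place, code) for an expression tree."""
    if isinstance(e, str):                      # E -> id
        return e, []
    op = e[0]
    if op == "uminus":                          # E -> - E1
        p1, c1 = gen_expr(e[1])
        place = newtemp()
        return place, c1 + [f"{place} := uminus {p1}"]
    p1, c1 = gen_expr(e[1])                     # E -> E1 + E2 | E1 * E2
    p2, c2 = gen_expr(e[2])
    place = newtemp()
    return place, c1 + c2 + [f"{place} := {p1} {op} {p2}"]


def gen_assign(target, e):                      # S -> id := E
    place, code = gen_expr(e)
    return code + [f"{target} := {place}"]


# a := b * -c + b * -c
product = ("*", "b", ("uminus", "c"))
code = gen_assign("a", ("+", product, product))
```

Parenthesized expressions need no case of their own, because the tree already encodes the grouping.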
If-then-else
Statement: S → if E then S1 else S2
Generated code:
S.else := newlabel;
S.after := newlabel;
S.code :=
    E.code ||
    gen('if' E.place '=' '0' 'goto' S.else) ||
    S1.code ||
    gen('goto' S.after) ||
    gen(S.else ':') ||
    S2.code ||
    gen(S.after ':')
While-loops
Statement: S → while E do S1
Generated code:
S.begin := newlabel;
S.after := newlabel;
S.code :=
    gen(S.begin ':') ||
    E.code ||
    gen('if' E.place '=' '0' 'goto' S.after) ||
    S1.code ||
    gen('goto' S.begin) ||
    gen(S.after ':')
Implementation of the Three-Address Code
Quadruple = record with 4 fields:
op        operator
arg1      1st argument
arg2      2nd argument
result    temporary variable
arguments and temporary variables are usually pointers to symbol table entries
Triple = "quadruple" without the result field. Instead, an argument stores the position of the triple that calculates the value.
Examples
a := b * -c + b * -c
Quadruples:
       op        arg1    arg2    result
(0)    uminus    c               t1
(1)    *         b       t1      t2
(2)    uminus    c               t3
(3)    *         b       t3      t4
(4)    +         t2      t4      t5
(5)    :=        t5              a
Triples:
       op        arg1    arg2
(0)    uminus    c
(1)    *         b       (0)
(2)    uminus    c
(3)    *         b       (2)
(4)    +         (1)     (3)
(5)    :=        a       (4)
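The relation between the two tables can be made explicit with a small conversion routine. The tuple encoding and the ("ref", k) marker for a reference to triple k are assumptions of this sketch.

```python
quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]


def to_triples(quads):
    position = {}     # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # Replace a temporary by a reference to the triple computing it.
        return ("ref", position[arg]) if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == ":=":
            # Assignment: the target becomes arg1, the value becomes arg2.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples


triples = to_triples(quads)
```

Triples save the result field, but statements can no longer be reordered freely because arguments refer to positions.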
Declarations
provision of memory space for the local names of a procedure (relative addresses within the activation record or within the static data area)
procedure declarations:
1 offset . . . next free relative address
2 initialization: offset = 0
3 offset is used as the address of the current data object
4 then, offset is increased by the size of the current data object
enter(name, type, offset) creates an entry in the symbol table for name and assigns it the type type and the relative address offset.
Attributed Grammar for Declarations
P → {offset := 0} D
D → D ; D
D → id : T {enter(id.name, T.type, offset); offset := offset + T.width}
T → integer {T.type := integer; T.width := 4}
T → real {T.type := real; T.width := 8}
T → array [ num ] of T1 {T.type := array(num.val, T1.type); T.width := num.val × T1.width}
T → ↑ T1 {T.type := pointer(T1.type); T.width := 4}
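The offset bookkeeping can be sketched in Python. Types are encoded as tuples and the widths follow the grammar above; the function names and the example declarations are assumptions made for this illustration.

```python
def width(t):
    """Width of a type in bytes, following the semantic rules above."""
    kind = t[0]
    if kind == "integer":
        return 4
    if kind == "real":
        return 8
    if kind == "pointer":
        return 4
    if kind == "array":
        _, count, element = t
        return count * width(element)
    raise ValueError(f"unknown type: {t!r}")


def declare(decls):
    """Process declarations id : T in order. Returns (symbol table, final offset)."""
    symtab, offset = {}, 0
    for name, t in decls:
        symtab[name] = (t, offset)     # enter(name, type, offset)
        offset += width(t)             # offset := offset + T.width
    return symtab, offset


symtab, total = declare([
    ("i", ("integer",)),
    ("x", ("real",)),
    ("a", ("array", 10, ("integer",))),
    ("p", ("pointer", ("real",))),
])
```

The final offset equals the total size of the local data area.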
Scope Information
each procedure has its own symbol table
for every procedure declaration a symbol table is created, plus a link to the symbol table of the enclosing procedure
offset is now local!
example grammar:
P → D
D → D ; D | id : T | proc id ; D ; S
grammar definition . . . (Exercise)
Records, Field Names
create a symbol table for the fields (marker L).
field names are stored in the new symbol table
grammar:
T → record L D end {T.type := record(top(tblptr)); T.width := top(offset); pop(tblptr); pop(offset)}
L → ε {t := maketable(nil); push(t, tblptr); push(0, offset)}
Assignments
assumption so far: names stand for themselves
this is correct if a name stands for a pointer to its symbol table entry
generalization via an attribute: id.name is the name of the identifier
lookup(id.name) returns the symbol table entry
advantage: usable even if the name is declared in an enclosing procedure
The definition of lookup determines the scope rules of the language for identifiers.
Grammar
Production         Semantic rules
S → id := E        p := lookup(id.name);
                   if p ≠ nil then emit(p ':=' E.place) else error
E → E1 + E2        E.place := newtemp;
                   emit(E.place ':=' E1.place '+' E2.place)
E → E1 * E2        E.place := newtemp;
                   emit(E.place ':=' E1.place '*' E2.place)
E → - E1           E.place := newtemp;
                   emit(E.place ':=' 'uminus' E1.place)
E → ( E1 )         E.place := E1.place
E → id             p := lookup(id.name);
                   if p ≠ nil then E.place := p else error
Addressing Array Elements
Array access is fast if the elements are stored in one contiguous block.
access to the element at position i (w . . . element size, low . . . lower bound of the index):
base + (i − low) × w
can be rewritten to:
i × w + (base − low × w)
advantage: c = base − low × w can be calculated at compile time!
two-dimensional arrays (A(i1, i2)):
row major (row-by-row): base + ((i1 − low1) × n2 + i2 − low2) × w, where n2 = high2 − low2 + 1.
column major (column-by-column)
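The two formulas can be checked with a few lines of Python. The function names and the concrete numbers in the usage are assumptions chosen for illustration.

```python
def address_1d(base, low, w, i):
    c = base - low * w          # constant part, known at compile time
    return i * w + c            # equals base + (i - low) * w


def address_2d_row_major(base, low1, low2, high2, w, i1, i2):
    n2 = high2 - low2 + 1       # number of elements per row
    return base + ((i1 - low1) * n2 + (i2 - low2)) * w


# A[1..10] of 4-byte integers starting at address 1000: address of A[3]
addr = address_1d(base=1000, low=1, w=4, i=3)
```

Only the multiplication i × w and one addition remain to be performed at run time in the one-dimensional case.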
Translation Schema for Array Access (1)
1 S → L := E
    if L.offset = null then
        emit(L.place ':=' E.place)
    else
        emit(L.place '[' L.offset ']' ':=' E.place)
2 E → E1 + E2
    E.place := newtemp;
    emit(E.place ':=' E1.place '+' E2.place)
3 E → ( E1 )
    E.place := E1.place
Translation Schema (2)
1 E → L
    if L.offset = null then
        E.place := L.place
    else begin
        E.place := newtemp;
        emit(E.place ':=' L.place '[' L.offset ']')
    end
2 L → Elist ]
    L.place := newtemp;
    L.offset := newtemp;
    emit(L.place ':=' c(Elist.array));
    emit(L.offset ':=' Elist.place '*' width(Elist.array))
3 L → id
    L.place := id.place;
    L.offset := null
Translation Schema (3)
1 Elist → Elist1 , E
    t := newtemp;
    m := Elist1.ndim + 1;
    emit(t ':=' Elist1.place '*' limit(Elist1.array, m));
    emit(t ':=' t '+' E.place);
    Elist.array := Elist1.array;
    Elist.place := t;
    Elist.ndim := m
2 Elist → id [ E
    Elist.array := id.place;
    Elist.place := E.place;
    Elist.ndim := 1
Boolean Expressions
2 main tasks:
1 calculating logical values
2 changing the control flow of the program
grammar:
E → E or E | E and E | not E | ( E ) | id relop id | true | false
2 methods to represent Boolean values:
1 true and false are encoded as numbers (e.g. true = 1, false = 0).
2 Flow of control: values are represented by positions in the code
Numeric Representation
Example 1: a or (b and (not c))
t1 := not c
t2 := b and t1
t3 := a or t2
Example 2: a < b
100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:
Translation Schema (Bool. Expr.) I
Production             Semantic rules
E → E1 or E2           E.place := newtemp; emit(E.place ':=' E1.place 'or' E2.place)
E → E1 and E2          E.place := newtemp; emit(E.place ':=' E1.place 'and' E2.place)
E → not E1             E.place := newtemp; emit(E.place ':=' 'not' E1.place)
E → id1 relop id2      E.place := newtemp;
                       emit('if' id1.place relop.op id2.place 'goto' nextstat + 3);
                       emit(E.place ':=' '0');
                       emit('goto' nextstat + 2);
                       emit(E.place ':=' '1')
E → true               E.place := newtemp; emit(E.place ':=' '1')
E → false              E.place := newtemp; emit(E.place ':=' '0')
Short-Circuit Code
Representation of Boolean expressions without generating code for the operators and, or, not
values are represented by positions in the code
Jumping code
Example: a < b or c < d and e < f
100: if a < b goto 103
101*: t1 := 0
102: goto 104
103*: t1 := 1
104: if c < d goto 107
105*: t2 := 0
106: goto 108
107*: t2 := 1
108: if e < f goto 111
109*: t3 := 0
110: goto 112
111*: t3 := 1
112*: t4 := t2 and t3
113*: t5 := t1 or t4
Flow-of-Control Statements
Statements:
S → if E then S1
  | if E then S1 else S2
  | while E do S1
use labels to represent true and false
branch depending on the evaluation of E
attributed grammar (see above)
Control-Flow Translation
Jump to label E.true (E.false) if E evaluates to true (false).
Example: E1 or E2 is true if E1 is true; E2 then need not be evaluated.
Not all subexpressions are evaluated (as e.g. in C).
Example: a < b or (c < d and e < f)
    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse
Translation Schema (Bool. Expr.) II
Production             Semantic rules
E → E1 or E2           E1.true := E.true; E1.false := newlabel;
                       E2.true := E.true; E2.false := E.false;
                       E.code := E1.code || gen(E1.false ':') || E2.code
E → E1 and E2          E1.true := newlabel; E1.false := E.false;
                       E2.true := E.true; E2.false := E.false;
                       E.code := E1.code || gen(E1.true ':') || E2.code
E → not E1             E1.true := E.false; E1.false := E.true;
                       E.code := E1.code
E → id1 relop id2      E.code := gen('if' id1.place relop.op id2.place 'goto' E.true) ||
                                 gen('goto' E.false)
E → true               E.code := gen('goto' E.true)
E → false              E.code := gen('goto' E.false)
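The inherited attributes E.true and E.false map naturally to function parameters. The following Python sketch, with an assumed tuple encoding of Boolean expressions, generates jumping code for the example a < b or (c < d and e < f); the label names differ from those on the slide.

```python
import itertools

_labels = itertools.count(1)


def newlabel():
    return f"L{next(_labels)}"


def gen_bool(e, true, false):
    """Return jumping code for e; true/false are the inherited target labels."""
    op = e[0]
    if op == "or":
        e1_false = newlabel()
        return (gen_bool(e[1], true, e1_false)
                + [f"{e1_false}:"]
                + gen_bool(e[2], true, false))
    if op == "and":
        e1_true = newlabel()
        return (gen_bool(e[1], e1_true, false)
                + [f"{e1_true}:"]
                + gen_bool(e[2], true, false))
    if op == "not":
        return gen_bool(e[1], false, true)     # swap the targets
    if op == "true":
        return [f"goto {true}"]
    if op == "false":
        return [f"goto {false}"]
    relop, left, right = e                     # id1 relop id2
    return [f"if {left} {relop} {right} goto {true}", f"goto {false}"]


expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
code = gen_bool(expr, "Ltrue", "Lfalse")
```

No code is generated for the operators themselves; or, and and not only decide which labels are passed down.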
Mixed Mode
the treatment so far was simplified
in practice, mixed expressions are possible
Example 1: (a + b) < c
Example 2: (a < b) + (b < a)
introduce a synthesized attribute E.type:
E.type = arith for an arithmetic expression, bool for a Boolean expression
Code generation for E + E, E ∗ E, . . . needs to be changed.
Back patching
easiest implementation of attributed grammars:
1 generate a syntax tree
2 generate the translation depth-first
problem with single-pass translation: the labels for control flow are not yet known when the jumps are generated
solution (backpatching):
1 jump statements are generated with empty target labels
2 these statements are saved in a list
3 the target labels are filled in once they are known
Procedure Calls
use of run-time routines for handling the parameters, the call itself and the return of values.
grammar:
S → call id ( Elist )
Elist → Elist , E | E
The calling sequence must be reproduced.
Attributed Grammar (simplified)
Call-by-reference
Memory is statically allocated
grammar:
1 S → call id ( Elist )
    for each item p on queue do emit('param' p);
    emit('call' id.place)
2 Elist → Elist , E
    append E.place to the end of queue
3 Elist → E
    initialize queue to contain only E.place