
PART 4 - SYNTAX DIRECTED TRANSLATION

F. Wotawa (IST @ TU Graz) Compiler Construction Summer term 2015 109 / 264

Setting

Translation of context-free languages
Information ↔ attributes of grammar symbols
Values of attributes are defined by “semantic rules”
2 possibilities:

    Syntax directed definitions (high-level spec)
    Translation schemes (implementation details)

Evaluation: (1) Parse input, (2) Generate parse tree, (3) Evaluate parse tree


Syntax directed definitions

Generalization of context-free grammars

Each grammar symbol has a set of attributes

Synthesized vs. inherited attributes

Attribute: string, number, type, memory location, . . .

Value of attribute is defined by semantic rules

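Synthesized: value is computed from the attributes of the child nodes in the parse tree
Inherited: value is computed from the attributes of the parent node (and sibling nodes) in the parse tree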

Semantic rules define dependencies between attributes

Dependency graph defines calculation order of semantic rules

Semantic rules can have side effects


Form of a syntax directed definition

Grammar production: A→ α

Associated semantic rule: b := f(c1, . . . , ck)

f is a function
Synthesized: b is a synthesized attribute of A and c1, . . . , ck are attributes of grammar symbols of the production
Inherited: b is an inherited attribute of a grammar symbol on the right side of the production and c1, . . . , ck are attributes of grammar symbols of the production
b depends on c1, . . . , ck


Example

“Calculator” program: val is a synthesized attribute for nonterminals E, T and F

    Production      Semantic Rule
    L → E n         print(E.val)
    E → E1 + T      E.val := E1.val + T.val
    E → T           E.val := T.val
    T → T1 * F      T.val := T1.val ∗ F.val
    T → F           T.val := F.val
    F → ( E )       F.val := E.val
    F → digit       F.val := digit.lexval


S-attributed grammar

Attributed grammar exclusively using synthesized attributes
Example evaluation: 3*5+4n (annotated parse tree)

    L
      E.val=19
        E.val=15
          T.val=15
            T.val=3
              F.val=3
                digit.lexval=3
            *
            F.val=5
              digit.lexval=5
        +
        T.val=4
          F.val=4
            digit.lexval=4
      n
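The bottom-up evaluation of the synthesized attribute val can be sketched in a few lines of Python. This is only an illustration: the tuple representation of parse-tree nodes and the helper names val and digit are assumptions of the sketch and do not come from the slides.

```python
# Parse-tree nodes are (production, children) pairs; a digit leaf is ("digit", lexval).
# This representation is an assumption made for the sketch.

def val(node):
    """Compute the synthesized attribute val in post-order (children first)."""
    prod, kids = node
    if prod == "digit":                      # leaf: digit.lexval
        return kids
    if prod == "E -> E + T":                 # E.val := E1.val + T.val
        return val(kids[0]) + val(kids[1])
    if prod == "T -> T * F":                 # T.val := T1.val * F.val
        return val(kids[0]) * val(kids[1])
    if prod in ("E -> T", "T -> F", "F -> ( E )", "F -> digit"):
        return val(kids[0])                  # copy rules
    raise ValueError(f"unknown production: {prod}")

def digit(n):
    """Build the subtree F -> digit with digit.lexval = n."""
    return ("F -> digit", [("digit", n)])

# Parse tree of 3*5+4 (the E node below L in the annotated tree)
t_3x5 = ("T -> T * F", [("T -> F", [digit(3)]), digit(5)])
tree = ("E -> E + T", [("E -> T", [t_3x5]), ("T -> F", [digit(4)])])

print(val(tree))  # 19
```

Because every attribute is synthesized, a single post-order walk is enough; no attribute value ever has to be passed downwards.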


Inherited attributes

Definition of dependencies of programming language constructs and their context
Example (type checking):

    Production      Semantic Rule
    D → T L         L.in := T.type
    T → int         T.type := integer
    T → real        T.type := real
    L → L1 , id     L1.in := L.in
                    addtype(id.entry, L.in)
    L → id          addtype(id.entry, L.in)


Inherited attributes – Annotated parse tree

real id1, id2, id3

    D
      T.type=real
        real
      L.in=real
        L.in=real
          L.in=real
            id1
          ,
          id2
        ,
        id3


Dependency graphs

Show dependencies between attributes
Each rule is represented in the form b := f(c1, . . . , ck)
Nodes correspond to attributes; edges to dependencies
Definition:

    for each node n in the parse tree do
        for each attribute a of the grammar symbol at n do
            construct a node in the dependency graph for a
    for each node n in the parse tree do
        for each semantic rule b := f(c1, . . . , ck) associated with the production used at n do
            for i := 1 to k do
                construct an edge from the node for ci to the node for b


Dependency graph – Example

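[Figure: dependency graph laid over the parse tree of real id1, id2, id3. The graph nodes are numbered 1 to 10 and belong to the attributes entry (of the identifiers), type (of T) and in (of the L nodes).]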


Topological sort

Arrangement m1, . . . , mk of the nodes of a directed acyclic graph in which every edge points from an earlier node to a later node
If mi → mj is an edge, then mi appears before mj in the arrangement

Important for the order in which the attributes are calculated

Example (cont.):

    1 a4 := real
    2 a5 := a4
    3 addtype(id3.entry, a5)
    4 a7 := a5
    5 addtype(id2.entry, a7)
    6 a9 := a7
    7 addtype(id1.entry, a9)
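A possible way to obtain such an evaluation order is a library topological sort. The sketch below uses graphlib from the Python standard library (Python 3.9 or later). The edge list is an assumption reconstructed from the evaluation sequence above: nodes 1 to 3 stand for the entry attributes, 4 for T.type, 5, 7 and 9 for the in attributes, and 6, 8 and 10 for the addtype calls.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# node -> set of nodes it depends on (assumed numbering, see lead-in)
deps = {
    5: {4},        # a5 := a4
    6: {3, 5},     # addtype(id3.entry, a5)
    7: {5},        # a7 := a5
    8: {2, 7},     # addtype(id2.entry, a7)
    9: {7},        # a9 := a7
    10: {1, 9},    # addtype(id1.entry, a9)
}

order = list(TopologicalSorter(deps).static_order())
position = {node: index for index, node in enumerate(order)}

# Every attribute is evaluated after all attributes it depends on.
assert all(position[p] < position[n] for n, preds in deps.items() for p in preds)
print(order)
```

A topological order is usually not unique, so the printed sequence may differ from the one on the slide while still respecting every dependency.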


Example - syntax trees

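Abstract syntax tree = simplified form of a parse tree
Operators and keywords do not appear as leaf nodes; they label the interior nodes
Productions with only one element can collapse
Examples:

    if-then-else with the children B, S1, S2
    + with the children * and 4, where * has the children 3 and 5 (syntax tree of 3*5+4)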


Syntax trees – Expressions

Functions (return value: pointer to new node):

    mknode(op, left, right): node with label op and 2 child nodes left, right
    mkleaf(id, entry): leaf id with symbol table entry entry
    mkleaf(num, val): leaf num with value val

Syntax directed definition:

    Production      Semantic Rule
    E → E1 + T      E.nptr := mknode('+', E1.nptr, T.nptr)
    E → E1 − T      E.nptr := mknode('−', E1.nptr, T.nptr)
    E → T           E.nptr := T.nptr
    T → ( E )       T.nptr := E.nptr
    T → id          T.nptr := mkleaf(id, id.entry)
    T → num         T.nptr := mkleaf(num, num.val)
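The constructor functions can be sketched as follows. The Node class, its field names and the helper show are assumptions made for this illustration; the slides only fix the names mknode and mkleaf and their arguments.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str                       # operator, "id" or "num"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: object = None             # symbol-table entry for id, number for num

def mknode(op, left, right):
    """Interior node labelled op with two children."""
    return Node(op, left, right)

def mkleaf(kind, value):
    """Leaf of kind "id" (value = symbol-table entry) or "num" (value = number)."""
    return Node(kind, value=value)

def show(n):
    """Render a tree as a fully parenthesized string (helper for this sketch)."""
    if n.left is None and n.right is None:
        return f"{n.label}({n.value})"
    return f"({show(n.left)} {n.label} {show(n.right)})"

# Syntax tree for a-4+c, built bottom-up in the order the semantic rules fire
p1 = mkleaf("id", "a")       # the string "a" stands in for the symbol-table entry
p2 = mkleaf("num", 4)
p3 = mknode("-", p1, p2)
p4 = mkleaf("id", "c")
p5 = mknode("+", p3, p4)

print(show(p5))  # ((id(a) - num(4)) + id(c))
```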


Syntax trees – Expressions (ex.)

Syntax tree for a-4+c

[Figure: parse tree of a-4+c in which every E and T node carries an nptr attribute that points into the syntax tree below.]

    +
      -
        id (to entry for a)
        num 4
      id (to entry for c)


Evaluation of S-attributed definitions

Attributed definition exclusively using synthesized attributes
Evaluation using bottom-up parser (LR-parser)
Idea: store attribute information on stack

              State   Val
              . . .   . . .
              X       X.x
              Y       Y.y
    top →     Z       Z.z
              . . .   . . .

Production: A → X Y Z
Semantic rule: A.a := f(X.x, Y.y, Z.z)
Before X Y Z is reduced to A, the value of Z.z is stored in val[top], Y.y in val[top − 1], and X.x in val[top − 2]


Example - S-attributed evaluation

“Calculator”-example:

    Production      Code Fragment
    L → E n         print(val[top − 1])
    E → E1 + T      val[ntop] := val[top − 2] + val[top]
    E → T
    T → T1 ∗ F      val[ntop] := val[top − 2] ∗ val[top]
    T → F
    F → ( E )       val[ntop] := val[top − 1]
    F → digit

Code is executed before the reduction

ntop = top − r + 1, where r is the number of symbols on the right side of the production; after the reduction: top := ntop


Result for 3*5+4n

    Input      state      val       Production used
    3*5+4n
    *5+4n      3          3
    *5+4n      F          3         F → digit
    *5+4n      T          3         T → F
    5+4n       T *        3
    +4n        T * 5      3 5
    +4n        T * F      3 5       F → digit
    +4n        T          15        T → T ∗ F
    +4n        E          15        E → T
    4n         E +        15
    n          E + 4      15 4
    n          E + F      15 4      F → digit
    n          E + T      15 4      T → F
    n          E          19        E → E + T
               E n        19
               L          19        L → E n


L-attributed definitions

Definition: A syntax directed definition is L-attributed if each inherited attribute of Xj , 1 ≤ j ≤ n, on the right side of A → X1 . . . Xn depends only on:

    1 the attributes of the symbols X1, . . . , Xj−1 to the left of Xj and
    2 the inherited attributes of A

Each S-attributed grammar is an L-attributed grammar

Evaluation using depth-first order:

    procedure dfvisit(n : node)
        for each child m of n, from left to right do
            evaluate inherited attributes of m
            dfvisit(m)
        end
        evaluate synthesized attributes of n
    end
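As an illustration of dfvisit, the following sketch evaluates the declaration grammar D → T L from the inherited-attributes example on the parse tree of real id1, id2, id3. The class N, the dictionary symtab (standing in for addtype and the symbol table) and the helper ident are assumptions of the sketch.

```python
class N:
    """Parse-tree node with a grammar symbol, children and an attribute dictionary."""
    def __init__(self, symbol, children=(), name=None):
        self.symbol = symbol
        self.children = list(children)
        self.name = name          # identifier name, only used for id leaves
        self.attr = {}

symtab = {}   # hypothetical symbol table: identifier name -> type

def dfvisit(n):
    for m in n.children:                                   # left to right
        # evaluate inherited attributes of m
        if m.symbol == "L":
            if n.symbol == "D":
                m.attr["in"] = n.children[0].attr["type"]  # L.in := T.type
            else:
                m.attr["in"] = n.attr["in"]                # L1.in := L.in
        dfvisit(m)
    # evaluate synthesized attributes (and side effects) of n
    if n.symbol == "T":
        n.attr["type"] = n.children[0].symbol              # T.type := real / int
    elif n.symbol == "L":
        symtab[n.children[-1].name] = n.attr["in"]         # addtype(id.entry, L.in)

def ident(name):
    return N("id", name=name)

tree = N("D", [
    N("T", [N("real")]),
    N("L", [N("L", [N("L", [ident("id1")]), N(","), ident("id2")]),
            N(","), ident("id3")]),
])

dfvisit(tree)
print(symtab)  # {'id1': 'real', 'id2': 'real', 'id3': 'real'}
```

The inherited attribute in of each L node only uses values from its parent or from the sibling T to its left, which is exactly what the L-attributed condition demands.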


Translation schemes

Translation scheme = context-free grammar with attributes for grammar symbols and semantic actions, which are placed on the right side of a production between grammar symbols and are enclosed in {}
Example:

    T → T1 ∗ F {T.val := T1.val ∗ F.val}

If only synthesized attributes are used, the action is always placed at the end of the right side of a production
Note: Actions may not access attributes which have not been calculated yet (this limits the positions of semantic actions)


Translation schemes (cont.)

If both inherited and synthesized attributes are used, the following needs to be taken into consideration:

    1 An inherited attribute of a symbol on the right side of a production has to be calculated in an action which is positioned to the left of the symbol

    2 An action may not reference a synthesized attribute belonging to a symbol which is positioned to the right of the action

    3 A synthesized attribute of a nonterminal on the left side can only be calculated if all referenced attributes have already been calculated ⇒ actions like these are usually placed at the end of the right side


Example translation scheme

    S → A1 A2 {A1.in := 1; A2.in := 2}
    A → a {print(A.in)}

The above grammar does not fulfill the three conditions for translation schemes
The inherited attribute A.in is not yet defined at the point in time when it should be printed
But: For each L-attributed grammar a translation scheme can be found which fulfills the three conditions, e.g.:

    S → {A1.in := 1} A1 {A2.in := 2} A2
    A → a {print(A.in)}


Top-down translation

Removal of left recursions in translation scheme is necessary

Example:

    E → E1 + T {E.val := E1.val + T.val}
    E → E1 − T {E.val := E1.val − T.val}
    E → T {E.val := T.val}
    T → ( E ) {T.val := E.val}
    T → num {T.val := num.val}


Example top-down translation

    E → T {R.i := T.val} R {E.val := R.s}
    R → + T {R1.i := R.i + T.val} R1 {R.s := R1.s}
    R → − T {R1.i := R.i − T.val} R1 {R.s := R1.s}
    R → ε {R.s := R.i}
    T → ( E ) {T.val := E.val}
    T → num {T.val := num.val}


Evaluation of 9-5+2

    E
      T.val = 9
        num.val = 9
      R.i = 9
        -
        T.val = 5
          num.val = 5
        R.i = 4
          +
          T.val = 2
            num.val = 2
          R.i = 6
            ε
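The translation scheme without left recursion maps directly onto a recursive-descent evaluator: the inherited attribute R.i becomes a parameter and the synthesized attribute R.s the return value. The sketch below is an illustration only; the tokenizer, the end marker "$" and the function name evaluate are assumptions.

```python
import re

def evaluate(text):
    tokens = re.findall(r"\d+|[-+()]", text) + ["$"]   # "$" marks the end of input
    pos = 0

    def lookahead():
        return tokens[pos]

    def match(tok):
        nonlocal pos
        if tokens[pos] != tok:
            raise SyntaxError(f"expected {tok!r}, got {tokens[pos]!r}")
        pos += 1

    def E():                       # E -> T {R.i := T.val} R {E.val := R.s}
        return R(T())

    def R(i):                      # i is the inherited attribute R.i
        if lookahead() == "+":
            match("+")
            return R(i + T())      # R1.i := R.i + T.val; R.s := R1.s
        if lookahead() == "-":
            match("-")
            return R(i - T())      # R1.i := R.i - T.val; R.s := R1.s
        return i                   # R -> ε {R.s := R.i}

    def T():
        nonlocal pos
        if lookahead() == "(":     # T -> ( E ) {T.val := E.val}
            match("(")
            v = E()
            match(")")
            return v
        tok = lookahead()          # T -> num {T.val := num.val}
        if not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        pos += 1
        return int(tok)

    result = E()
    match("$")
    return result

print(evaluate("9-5+2"))  # 6
```

The intermediate values passed to R are 9, 4 and 6, matching the R.i annotations in the tree above, and the left-to-right accumulation keeps subtraction left-associative.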


Summary transformation

Given translation scheme:

    A → A1 Y {A.a := g(A1.a, Y.y)}
    A → X {A.a := f(X.x)}

After removal of left recursions:

    A → X R
    R → Y R | ε

Transformed scheme:

    A → X {R.i := f(X.x)} R {A.a := R.s}
    R → Y {R1.i := g(R.i, Y.y)} R1 {R.s := R1.s}
    R → ε {R.s := R.i}


Predictive parsing with schemes
Input: syntax-directed translation scheme; Output: syntax-directed translator

1 For each nonterminal A, construct a function that has a formal parameter for each inherited attribute of A and that returns the values of the synthesized attributes of A. This function has a local variable for each attribute of each grammar symbol that appears in a production for A.

2 As previously described (see predictive parsing), the code for nonterminal A decides what production to use based on the current input symbol.

3 The code for each production does the following (evaluation from left to right):

    1 Token X with synthesized attribute x: Save the value of x in a variable X.x. Generate a call to match token X.

    2 Nonterminal B: Generate c := B(b1, . . . , bk); b1, . . . , bk variables for inherited attributes of B; c variable for synthesized attribute of B.

    3 For an action, copy the code into the parser, replacing each reference to an attribute by the variable for that attribute.


Example - predictive parsing

Grammar:

    E → T {R.i := T.val} R {E.val := R.s}
    R → op T {R1.i := mknode(op.lexeme, R.i, T.nptr)} R1 {R.s := R1.s}
    R → ε {R.s := R.i}
    T → ( E ) {T.val := E.val}
    T → num {T.val := num.val}

Functions:

    function E : node
    function R(i : node) : node
    function T : node


Parsing procedure R

Procedure without translation scheme

    procedure R()
    begin
        if lookahead = op then begin
            match(op);
            T();
            return R()
        end else begin
            return;
        end
    end


Parsing function R

    function R (i: node) : node
    var nptr, i1, s1, s: node; oplexeme : char;
    begin
        if lookahead = op then begin
            oplexeme := lexval;
            match(op);
            nptr := T();
            i1 := mknode(oplexeme, i, nptr);
            s1 := R(i1);
            s := s1
        end else
            s := i;
        return s
    end


Bottom-up with inherited attribute

Implementation of L-attributed grammars in bottom-up parsers
For LL(1)-grammars and many LR(1)-grammars
Removal of embedded actions from translation schemes:

    Actions have to be placed at the end of the right side of a production
    Ensured by new marker nonterminals

Example:

    E → T R
    R → + T {print('+')} R | − T {print('−')} R | ε
    T → num {print(num.val)}

is transformed to:

    E → T R
    R → + T M R | − T N R | ε
    T → num {print(num.val)}
    M → ε {print('+')}
    N → ε {print('−')}


Inherited attributes on the stack

Idea: Production A → X Y, synthesized attribute X.x and inherited attribute Y.y

Before a reduction (of X Y), X.x is on the stack
In the case of Y.y = X.x (copy action), the value of X.x can be used whenever the value of Y.y is required

Example: Parser for variable declarations real p,q,r


Variable declaration - example

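    D → T {L.in := T.type} L
    T → int {T.type := integer}
    T → real {T.type := real}
    L → {L1.in := L.in} L1 , id {addtype(id.entry, L.in)}
    L → id {addtype(id.entry, L.in)}

[Figure: parse tree for real p,q,r. T.type = real is copied into the in attribute of the top L node and passed down the chain of L nodes to p, q and r.]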


Calculation using the stack

    Input        state      Production used
    real p,q,r
    p,q,r        real
    p,q,r        T          T → real
    ,q,r         T p
    ,q,r         T L        L → id
    q,r          T L ,
    ,r           T L , q
    ,r           T L        L → L , id
    r            T L ,
                 T L , r
                 T L        L → L , id
                 D          D → T L

Implementation:

    Production      Code Fragment
    D → T L
    T → int         val[top] := integer
    T → real        val[top] := real
    L → L , id      addtype(val[top], val[top − 3])
    L → id          addtype(val[top], val[top − 1])


Problems

Positions of attributes on the stack need to be known

    Production      Semantic Rule
    S → a A C       C.i := A.s
    S → a A B C     C.i := A.s
    C → c           C.s := g(C.i)

When the reduction C → c is performed, it is unknown whether the value of C.i is located in val[top − 1] or in val[top − 2]! It depends on whether a B is located on the stack.

Solution: Introduction of a marker M:

    Production      Semantic Rule
    S → a A C       C.i := A.s
    S → a A B M C   M.i := A.s; C.i := M.s
    C → c           C.s := g(C.i)
    M → ε           M.s := M.i


Problems (cont.)

Simulation of semantic rules which are not copy actions
Usage of a marker!

    S → a A C       C.i := f(A.s)

becomes

    S → a A N C     N.i := A.s; C.i := N.s
    N → ε           N.s := f(N.i)


Bottom-up parsing . . .

. . . with calculation of inherited attributes
Input: L-attributed definition (and LL(1)-grammar)
Output: Parser, which calculates attribute values on stack

1 Assumptions: Each nonterminal A has an inherited attribute A.i, each grammar symbol X has a synthesized attribute X.s. If X is a terminal, then X.s is the lexical value of X (supplied by the lexical analyser). The values are stored on the stack in the form of an array val.

2 For each production A → X1 . . . Xn create n new markers (nonterminals) M1, . . . , Mn and replace the production with A → M1 X1 . . . Mn Xn. Note: synthesized values for Xi are stored in the val array entry which belongs to Xi. Inherited values Xi.i are stored in entries which are associated with Mi.

3 Invariant: The new inherited attribute A.i (if existing) is always directly beneath the position of M1 within the val array.


Simplifications

Reduction of markers:

    1 If Xj has no inherited attribute, then no marker Mj is required ⇒ positions of attributes on the stack are shifting!

    2 If X1.i exists and is calculated by X1.i = A.i, then M1 is not required


Removal of inherited attributes

Replacement of inherited attributes by synthesized ones
Not always possible
Requires modification of grammar!
Example: Declarations in Pascal

    D → L : T
    T → integer | char
    L → L , id | id

convert to:

    D → id L
    L → , id L | : T
    T → integer | char


Difficult syntax directed definition

The following definition cannot be processed by bottom-up parsers using current approaches

    Production      Semantic Rule
    S → L           L.count := 0
    L → L1 1        L1.count := L.count + 1
    L → ε           print(L.count)

Reason: L → ε receives the number of 1s by means of inheritance
However, as L → ε is used in the reduction first, no value is specified yet!


Recursive evaluators

Evaluation of attributes
Based on parse tree
Not possible in conjunction with parsing
Order of nodes which are visited during evaluation is arbitrary
For each nonterminal a translation function exists
Extensions may visit nodes more than once
Order of node visits needs to regard the following:

    1 Each inherited attribute of a node has to be calculated before the node is visited

    2 Synthesized attributes are calculated before the node is left (for the last time)

Order is determined by dependencies


Example – Recursive evaluators

    Production      Semantic Rules
    A → L M         L.i := l(A.i)
                    M.i := m(L.s)
                    A.s := f(M.s)
    A → Q R         R.i := r(A.i)
                    Q.i := q(R.s)
                    A.s := f(Q.s)

[Figure: dependency graphs between the attributes i and s for the two productions A → L M and A → Q R.]

    function A(n, ai)
        if production(n) = 'A → LM' then
            li := l(ai)
            ls := L(child(n, 1), li)
            mi := m(ls)
            ms := M(child(n, 2), mi)
            return f(ms)
        if production(n) = 'A → QR' then
            ri := r(ai)
            rs := R(child(n, 2), ri)
            qi := q(rs)
            qs := Q(child(n, 1), qi)
            return f(qs)


PART 5 - TYPE CHECKING


Static Program Checking

Type Checks: Check of the used type. Error if operands are incompatible with the used operator. Example: 1.2 + 2 (real + int).
Flow-of-Control Checks: Check if the transfer of the program execution is possible. Example: break needs an enclosing loop. goto label needs a defined label.
Uniqueness Checks: Check if an object has been defined exactly once. Example: In Pascal each identifier must be unique.
Name-related Checks: In some languages, names (e.g. for procedures) are used which need to occur at a different location (e.g. at the end of a procedure).


Tasks

Check if the type system of the language is satisfied.

Separate type checker is not always necessary.

    token stream → parser → parse tree → type checker → syntax tree → intermediate code generator → intermediate representation

Type systems (examples):

    “If both operands of the arithmetic operators of addition, subtraction and multiplication are of type integer, then the result is of type integer”
    “The result of the unary & operator is a pointer to the object referred to by the operand. If the type of the operand is ’...’, the type of the result is ’pointer of ...’.”


Type Expressions

A type expression is:

1 a basic type: integer, boolean, char, and real, as well as the special basic types type error and void.

2 a type name

3 a composite type in the form of:

    1 Arrays. array(I, T); set of indexes I, type T
    2 Products. T1 × T2
    3 Records. record((N1 × T1) × . . . × (Nk × Tk)); names Ni, types Ti
    4 Pointers. pointer(T)
    5 Functions. T1 → T2

4 a type variable.


Types – Examples

    type row = record
        address: integer;
        lexeme: array[1..15] of char
    end;
    var table: array[1..101] of row;

row can be represented as record((address × integer) × (lexeme × array(1..15, char))).

    function f(a,b: char): ↑ integer;

is represented as: char × char → pointer(integer).


Graphical Representation of Types

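as DAG (Directed Acyclic Graph) or as a tree

[Figure: the type char × char → pointer(integer). In the DAG the × node refers twice to one shared char leaf; in the tree the × node has two separate char leaves.]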


Typesystems

Set of Rules

specified using attributed grammars (or verbally)

Static vs. Dynamic Checking of Types

Sound Typesystem = static type checking is sufficient

Language is strongly typed = the compiler guarantees that an accepted program runs without type errors.

But some checks can only occur dynamically

    table: array[0..255] of char;
    i: integer;

The correctness of the access table[i] in the program cannot be checked by the compiler.

Error Recovery is important (even for type errors)


1. Secure Type Info

    Production                  Semantic Rule
    P → D ; E
    D → D ; D
    D → id : T                  {addtype(id.entry, T.type)}
    T → char                    {T.type := char}
    T → integer                 {T.type := integer}
    T → array [ num ] of T1     {T.type := array(1 . . . num.val, T1.type)}
    T → ↑ T1                    {T.type := pointer(T1.type)}


2. Type Checking – Expressions

    Production          Semantic Rule
    E → literal         E.type := char
    E → num             E.type := integer
    E → id              E.type := lookup(id.entry)
    E → E1 mod E2       E.type := if E1.type = integer and E2.type = integer then integer
                                  else type error
    E → E1 [ E2 ]       E.type := if E1.type = array(s, t) and E2.type = integer then t
                                  else type error
    E → E1 ↑            E.type := if E1.type = pointer(t) then t
                                  else type error


3. Type Checking – Statements

    Production              Semantic Rule
    S → id := E             S.type := if id.type = E.type then void
                                      else type error
    S → if E then S1        S.type := if E.type = boolean then S1.type
                                      else type error
    S → while E do S1       S.type := if E.type = boolean then S1.type
                                      else type error
    S → S1 ; S2             S.type := if S1.type = void and S2.type = void then void
                                      else type error


4. Type Checking – Functions

Syntax extension:

    T → T '→' T         Definition
    E → E ( E )         Function call

Type extraction + type checking:

    Production          Semantic Rule
    T → T1 '→' T2       T.type := T1.type → T2.type
    E → E1 ( E2 )       E.type := if E1.type = s → t and E2.type = s then t
                                  else type error

Example:

    root : ((real → real) × real) → real


Type Equivalence

When are types equivalent???

    structural equivalence
    name equivalence


Structural Equivalence

    function sequiv(s, t) : boolean;
    begin
        if s and t are the same basic type then
            return true
        else if s = array(s1, s2) and t = array(t1, t2) then
            return s1 = t1 and sequiv(s2, t2)
        else if s = s1 × s2 and t = t1 × t2 then
            return sequiv(s1, t1) and sequiv(s2, t2)
        else if s = pointer(s1) and t = pointer(t1) then
            return sequiv(s1, t1)
        else if s = s1 → s2 and t = t1 → t2 then
            return sequiv(s1, t1) and sequiv(s2, t2)
        else
            return false
    end
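The structural equivalence test can be written almost literally in Python. In this sketch the representation of type expressions is an assumption: basic types are strings, and constructed types are tuples whose first element names the constructor ("array", "x" for products, "pointer", "->").

```python
def sequiv(s, t):
    """Structural equivalence of two type expressions (representation assumed, see lead-in)."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t                                # same basic type
    if s[0] != t[0] or len(s) != len(t):
        return False                                 # different constructors
    if s[0] == "array":
        # equal index sets and structurally equivalent element types
        return s[1] == t[1] and sequiv(s[2], t[2])
    if s[0] in ("x", "pointer", "->"):
        return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))
    return False

# char × char → pointer(integer), built twice as separate objects
f1 = ("->", ("x", "char", "char"), ("pointer", "integer"))
f2 = ("->", ("x", "char", "char"), ("pointer", "integer"))

print(sequiv(f1, f2))                                          # True
print(sequiv(("pointer", "integer"), ("pointer", "char")))     # False
```

Like the pseudocode, this version compares array index sets for plain equality and would not terminate on cyclic type graphs.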


Encoding of Type Expressions

Expression as bit vector (efficient storage and comparison)
Example:

    Type Constructor    Encoding
    pointer             01
    array               10
    freturns            11

    Basic Type          Encoding
    boolean             0000
    char                0001
    integer             0010
    real                0011

    Type expression                     Encoding
    char                                00 00 00 0001
    freturns(char)                      00 00 11 0001
    pointer(freturns(char))             00 01 11 0001
    array(pointer(freturns(char)))      10 01 11 0001
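The encoding can be reproduced with a few shift operations. The sketch below assumes what the table suggests: 4 bits for the basic type, 2 bits per type constructor, at most three constructors, and the outermost constructor in the leftmost position. The function names encode and show are hypothetical.

```python
BASIC = {"boolean": 0b0000, "char": 0b0001, "integer": 0b0010, "real": 0b0011}
CONSTRUCTOR = {"pointer": 0b01, "array": 0b10, "freturns": 0b11}

def encode(constructors, basic):
    """constructors lists the type constructors from outermost to innermost."""
    code = 0
    for c in constructors:
        code = (code << 2) | CONSTRUCTOR[c]
    return (code << 4) | BASIC[basic]

def show(code, slots=3):
    """Format a code as `slots` two-bit groups followed by the four-bit basic type."""
    bits = format(code, f"0{2 * slots + 4}b")
    groups = [bits[i:i + 2] for i in range(0, 2 * slots, 2)]
    return " ".join(groups + [bits[2 * slots:]])

print(show(encode([], "char")))                                  # 00 00 00 0001
print(show(encode(["freturns"], "char")))                        # 00 00 11 0001
print(show(encode(["array", "pointer", "freturns"], "char")))    # 10 01 11 0001
```

Two type expressions are then structurally equivalent exactly when their integer codes are equal, which is why the comparison is cheap.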


Name vs. structural Equivalence

Example (Pascal program)

    type link = ↑ cell;
    var next : link;
        last : link;
        p : ↑ cell;
        q,r : ↑ cell;

Do all variables have the same type?
Depends on the type system (and, in Pascal, on the compiler!)
An implementation of the above example creates implicit types (e.g. type np = ↑ cell for variable p).


Cyclic Typedefinition (Example)

    type link = ↑ cell;
         cell = record
             info: integer;
             next: link;
         end;

[Figure: two graph representations of cell = record with the fields info (integer) and next (pointer); in one of them the pointer node refers to the type name cell.]


Type conversion / Coercions

Statement of the problem: x+i with x as a real and i as an integer variable.

There exist only operators for (real + real) or (int + int)

Type conversion necessary! x = int2real(i)

Implicit (by the compiler) or explicit (by the programmer) possible

Implicit = Coercion

Loss of information should be prevented (int→ real but not real→ int).

Performance!!!

    for I := 1 to N do X[I] := int2real(1) (PASCAL; X is an array of reals) needs 48.4 µs
    for I := 1 to N do X[I] := 1.0 needs only 5.4 µs.


Type conversion - Semantic Rules (1)

    Production      Semantic Rule
    E → id          E.type := lookup(id.entry)
                    E.txt := id.entry
    E → E1 op E2    E.type := if E1.type = integer and E2.type = integer then integer
                              else if E1.type = integer and E2.type = real then real
                              else if E1.type = real and E2.type = integer then real
                              else if E1.type = real and E2.type = real then real
                              else type error


Type conversion - Semantic Rules (2)

    Production      Semantic Rule
    E → E1 op E2    E.txt := if E1.type = integer and E2.type = integer then E1.txt ◦ E2.txt
                             else if E1.type = integer and E2.type = real then int2real(E1.txt) ◦ E2.txt
                             else if E1.type = real and E2.type = integer then E1.txt ◦ int2real(E2.txt)
                             else if E1.type = real and E2.type = real then E1.txt ◦ E2.txt
                             else type error
    E → num         E.type := integer
                    E.txt := val
    E → num.num     E.type := real
                    E.txt := val


Overloading

Symbols with different meaning (dependent on application context)

    mathematics: + operator (integers, reals, complex numbers)
    ADA: ()-expression for array access AND function calls

Overloading is resolved when the meaning is clear (operator identification)
Overloading can often be resolved by the types of operands.


Overloading - Possible Types

Example (ADA):

    function "*"(i,j: integer) return complex;
    function "*"(i,j: complex) return complex;

Possible types for * are:

    integer × integer → integer
    integer × integer → complex
    complex × complex → complex

Assumption: 2, 3, 5 are integer

    3*5 is either integer or complex.
    In 2 * (3 * 5), the subexpression 3*5 must be of type integer.
    (3*5)*z is of type complex, if z is of type complex.


Handling Overloading

Instead of a single type, the set of all possible types must be stored in an attribute.
Attribute types!

    Production          Semantic Rule
    E′ → E              E′.types = E.types
    E → id              E.types = {lookup(id.entry)}
    E → E1 ( E2 )       E.types = {t | ∃s ∈ E2.types ∧ s → t ∈ E1.types}

Example: 3*5

    E: {i, c}
      E: {i}
        3: {i}
      *: {i × i → i, i × i → c, c × c → c}
      E: {i}
        5: {i}
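The rule for E.types is a plain set comprehension. The following sketch computes the type sets for the * example; the encoding of function types as (argument, result) pairs and the helper names apply_types and times are assumptions of the sketch.

```python
def apply_types(fun_types, arg_types):
    """E.types = { t | there is an s in E2.types with s -> t in E1.types }."""
    return {t for (s, t) in fun_types if s in arg_types}

i, c = "integer", "complex"

# Possible types of "*": each entry is (argument type, result type)
star = {((i, i), i), ((i, i), c), ((c, c), c)}

def times(left_types, right_types):
    """Possible types of `left * right`, given the possible types of both operands."""
    arg_types = {(l, r) for l in left_types for r in right_types}
    return apply_types(star, arg_types)

types_3x5 = times({i}, {i})
print(sorted(types_3x5))               # ['complex', 'integer']
print(sorted(times(types_3x5, {c})))   # ['complex']   (3*5)*z with z complex
```

The set for 3*5 only narrows to a single type once the context is known, which is what the uniqueness attribute in the next step takes care of.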


Uniqueness of Types

Expressions may only have one type (otherwise type error)

    Production          Semantic Rule
    E′ → E              E′.types := E.types
                        E.unique := if E′.types = {t} then t else type error
    E → id              E.types := {lookup(id.entry)}
    E → E1 ( E2 )       E.types := {s′ | ∃s ∈ E2.types ∧ (s → s′) ∈ E1.types}
                        t := E.unique
                        S := {s | s ∈ E2.types ∧ s → t ∈ E1.types}
                        E2.unique := if S = {s} then s else type error
                        E1.unique := if S = {s} then s → t else type error


Polymorphic Functions

Polymorphic function = function whose argument may have an arbitrary type

Polymorphic refers to functions and operators

Examples: Built-in operators for array-access, pointer manipulation

Reason for polymorphism:

Code can be used for various data structures
Example: finding the length of lists (e.g. ML)

    fun length(lptr) =
        if null(lptr) then 0
        else length(tl(lptr)) + 1

length([sun,mon,tue]), length([1,2,3,4])
not possible in PASCAL!


Type Variables

Variables, that allow us to talk about unknown types

Note: Type Variables as greek letters α, β, . . ..

Type Inference = problem of deciding the type of an expression taking into account the application (of the expression).

Example

    type link = ↑ cell;
    procedure mlist ( lptr : link; procedure p );
    begin
        while lptr <> nil do begin
            p(lptr);
            lptr := lptr↑.next
        end
    end;

    mlist: link × procedure → void
    p: link → void


Example - Type Inference

Program:

    function deref(p);
    begin
        return p↑
    end;

Derivation:

    1 Type of p is β (assumption)
    2 From p↑ it follows that p must be a pointer. Therefore it holds: β = pointer(α).
    3 Furthermore, we know that the type of p↑ must be α.
    4 Therefore, it follows: ∀α : pointer(α) → α is the type of the function deref.


Example

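    deref : ∀α.pointer(α) → α ;
    q : pointer(pointer(integer)) ;
    deref(deref(q))

[Figure: labelled syntax tree of deref(deref(q)). The outer apply node has type α0 and the children deref0 : pointer(α0) → α0 and the inner apply node; the inner apply node has type αi and the children derefi : pointer(αi) → αi and q : pointer(pointer(integer)).]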


Differences in Type Handling

In contrast to the previous type handling (without polymorphism):

    1 Arguments of polymorphic functions in an expression may have different types.
    2 The concept of type equivalence is different. pointer(α) = pointer(pointer(integer)) ???
    3 Calculated types must be carried forward. The effect of the unification of two expressions must be preserved: if α is assigned the type t and α is referenced elsewhere, t must be used!

Terms: Substitution, Instances, Unification


Substitution, Instances

Substitution = function that maps type variables to type expressions. S : type variables ↦ type expressions

Example: α ↦ pointer(integer)

Application of a substitution:

    function subst(t : type expression) : type expression
    begin
        if t is a basic type then return t
        else if t is a variable then return S(t)
        else if t is t1 → t2 then return subst(t1) → subst(t2)
    end

S(t) . . . Instance. We write s < t ⇔ s is an instance of t.
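A sketch of subst in Python, generalized from the → case to any constructor. The representation is an assumption: basic types and type variables are strings, constructed types are tuples with the constructor name first, and the substitution S is a dictionary. Variables that S does not mention are left unchanged.

```python
def subst(t, S):
    """Apply the substitution S (dict: variable name -> type expression) to t."""
    if isinstance(t, tuple):                       # constructor applied to arguments
        return (t[0],) + tuple(subst(arg, S) for arg in t[1:])
    return S.get(t, t)                             # mapped variable, or basic type unchanged

S = {"α": ("pointer", "integer")}

print(subst(("->", "α", "α"), S))
# ('->', ('pointer', 'integer'), ('pointer', 'integer'))
```

Because every occurrence of α is replaced by the same expression, a result such as integer → real can never be produced from α → α, which is the "inconsistent replacement" case among the non-instances.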


Examples

Instances:

    pointer(integer) < pointer(α)
    pointer(real) < pointer(α)
    integer → integer < α → α
    pointer(α) < β
    α < β

No instances:

    integer is not an instance of real              substitution on basic types not possible
    integer → real is not an instance of α → α      inconsistent replacement of α
    integer → α is not an instance of α → α         all occurrences must be replaced


Unification

2 types t1, t2 are unifiable if there exists a substitution S such that S(t1) = S(t2) holds.
In practice, we are interested in the Most General Unifier (MGU):

    1 S(t1) = S(t2)
    2 Every substitution S′ with S′(t1) = S′(t2) must be an instance of S (that is, S′(t) is an instance of S(t) for every t).


Checking Polymorphic Functions

2 functions:

    1 fresh(t) replaces all variables in the type expression t with new variables. A pointer to the node representing the new expression is returned.

    2 unify(m, n) unifies the two expressions m and n. As a side effect the substitution is performed.

Translation scheme:

    Production          Semantic Rule
    E → E1 ( E2 )       p := mkleaf(newtypevar);
                        unify(E1.type, mknode('→', E2.type, p));
                        E.type := p
    E → E1 , E2         E.type := mknode('×', E1.type, E2.type)
    E → id              E.type := fresh(id.type)


Example Type Checking

[Figure: labelled syntax tree of deref0(derefi(q)) with the node labels deref0 : pointer(α0) → α0, derefi : pointer(αi) → αi, q : pointer(pointer(integer)), apply : αi and apply : α0; the type of derefi is unified with pointer(pointer(integer)) → β.]

Summary (bottom-up type detection):

    Expression : Type                       Substitution
    q : pointer(pointer(integer))
    derefi : pointer(αi) → αi
    derefi(q) : pointer(integer)            αi = pointer(integer)
    deref0 : pointer(α0) → α0
    deref0(derefi(q)) : integer             α0 = integer


Unification Algorithm

Input. A graph and a pair of nodes m and n, which should be unified.
Output. True, if the nodes can be unified, False otherwise.
Method. A node is represented by the record [constructor, left, right, set], where set is the set of equivalent nodes. One node of each set is chosen as the representative of this set. In the beginning, each set contains only the node itself.

    find(n) returns the representative node of the set that contains n

    union(m, n) merges the equivalence sets of m and n. The new representative is a node that does not correspond to a variable. If no such node exists, one of the former representatives is chosen as the new one.


Algorithm - Pseudocode

    function unify(m, n : node) : boolean
    begin
        s := find(m);
        t := find(n);
        if s = t then return true
        else if s and t are nodes that represent the same basic type then return true
        else if s is an op-node with children s1, s2 and
                t is an op-node with children t1, t2 then begin
            union(s, t);
            return unify(s1, t1) and unify(s2, t2)
        end
        else if s or t represents a variable then begin
            union(s, t);
            return true
        end
        else return false
    end
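The algorithm can be sketched in Python with a simple representative pointer per node. The class TNode, the is_var flag and the helpers var, fn, prod and lst are assumptions of this sketch. It additionally checks that two op-nodes carry the same constructor, and, like the pseudocode, it performs no occurs check.

```python
class TNode:
    def __init__(self, label, *children, is_var=False):
        self.label = label            # constructor, basic type name or variable name
        self.children = children
        self.is_var = is_var
        self.rep = self               # representative of the equivalence set

def find(n):
    while n.rep is not n:
        n = n.rep
    return n

def union(m, n):
    m, n = find(m), find(n)
    if m is n:
        return
    if m.is_var and not n.is_var:
        m.rep = n                     # prefer a non-variable representative
    else:
        n.rep = m

def unify(m, n):
    s, t = find(m), find(n)
    if s is t:
        return True
    if s.is_var or t.is_var:
        union(s, t)
        return True
    if s.label != t.label or len(s.children) != len(t.children):
        return False                  # different constructors or basic types
    union(s, t)
    return all(unify(a, b) for a, b in zip(s.children, t.children))

def var(name): return TNode(name, is_var=True)
def fn(a, b): return TNode("->", a, b)
def prod(a, b): return TNode("x", a, b)
def lst(a): return TNode("list", a)

a1, a2, a3, a4, a5 = (var(f"α{k}") for k in range(1, 6))
e = fn(prod(fn(a1, a2), lst(a3)), lst(a2))   # ((α1 → α2) × list(α3)) → list(α2)
f = fn(prod(fn(a3, a4), lst(a3)), a5)        # ((α3 → α4) × list(α3)) → α5

print(unify(e, f))        # True
print(find(a5).label)     # list
```

After the call, α1 and α3 share one representative, α2 and α4 share another, and α5 is represented by the list(α2) node. Shared variables have to be shared node objects, as in the graph representation.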


Example – Unification

Type expressions:

    ((α1 → α2) × list(α3)) → list(α2)
    ((α3 → α4) × list(α3)) → α5

[Figure: graph of both type expressions with numbered nodes: → : 1, × : 2, → : 3, α1 : 4, α2 : 5, list : 6, α3 : 7, list : 8, → : 9, × : 10, → : 11, α4 : 12, list : 13, α5 : 14.]

Question: unify(1, 9) =?


PART 6 - RUN-TIME ENVIRONMENT


Objectives/Tasks

Relate static source code to actions at program runtime.
Names in the source code relate to (not necessarily the same) data objects on the target machine.
Allocation and deallocation of data objects need to be managed (run-time support packages).
Procedure activation
Store data objects according to their data type.


Definitions

Procedure definition = name + body
Procedures with return value = functions
Program = procedure (e.g. main)
Procedure name in one location in the code = procedure call
Variables (identifiers) in procedure definition = formal parameters
Arguments of a procedure call = actual parameters (substitute formal parameters after call)


Example Program

    program sort(input,output);
    var a: array[0..10] of integer;

    procedure readarray;
        var i: integer;
        begin .... end;

    function partition(y,z: integer) : integer;
        var i,j,x,v : integer;
        begin .... end;

    procedure quicksort(m,n : integer);
        var i: integer;
        begin
            if (n>m) then begin
                i := partition(m,n); quicksort(m,i-1); quicksort(i+1,n);
            end
        end;

    begin
        a[0]:=-9999; a[10]:=9999;
        readarray; quicksort(1,9);
    end.


Activation Trees

Assumptions:

    1 Sequential control flow
    2 Procedure activation starts at the beginning of the body. After finishing the procedure, the statement located after the procedure call is executed.

Activation = execution of the body
Lifetime of a procedure activation = sequence of steps (time) from the first to the last step during the procedure execution.
Recursive procedure calls are possible (they do not have to occur directly): P → Q → . . . → P


Activation Trees / Definition

    1 each node represents a procedure activation
    2 the root node represents the activation of the main program
    3 node a is the parent node of b ↔ control flows from a to b: in a, b is called (activated)
    4 node a is left of node b ↔ the lifetime of a precedes the lifetime of b


Example Activation Tree

(s = sort, r = readarray, q = quicksort, p = partition)

    s
      r
      q(1,9)
        p(1,9)
        q(1,3)
          p(1,3)
          q(1,0)
          q(2,3)
            p(2,3)
            q(2,1)
            q(3,3)
        q(5,9)
          p(5,9)
          q(5,5)
          q(7,9)
            p(7,9)
            q(7,7)
            q(9,9)

Control Stack

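Control flow = depth-first traversal (left to right) of the activation tree
Control stack = stack that holds the procedure activations that are currently alive
Example:

[Figure: part of the activation tree with the nodes s, r, q(1,9), p(1,9), q(1,3), p(1,3), q(1,0) and q(2,3).] While q(2,3) is active, the control stack contains s, q(1,9), q(1,3), q(2,3).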


Declarations / Scopes

Explicit or implicit

    var i : integer;

Scope of the variable is given through scope rules
Variables can be used within the scope
Global vs. local variables
Variables with the same name may denote different objects (due to their scope)
Sequence of variable access (first local, then global variable if the name is identical, ...)


Name Binding

data object: storage location, which can store a value

A variable (variable name) can reference different data objects during runtime.

Programming language semantics:

    Environment: function that maps names to storage locations
    State: function that maps the storage locations to values

Environment and State are different!
Example: pi is associated with the address 100, which stores the value 0. After pi := 3.14 the storage location 100 has the value 3.14; pi, however, still points to 100.


Binding

A variable x is bound to storage location s, if the storage location is associated with x.
Location does not always have to be a (real) storage location (in the main memory of the computer); e.g. complex data types
Correlation between STATIC and DYNAMIC notions:

    STATIC NOTION               DYNAMIC COUNTERPART
    definition of a procedure   activation of a procedure
    declaration of a name       bindings of the name
    scope of a declaration      lifetime of a binding


Important Questions

. . . regarding organization of memory management and name binding.

1 Are there recursive procedures?

2 What happens with the values of local variables after finishing the procedure execution?

3 Can a procedure reference non-local variables?

4 How are parameters passed to a procedure?

5 Can procedures be passed as parameters?

6 Can procedures be returned as a return value?

7 Is there dynamic memory allocation?

8 Does memory have to be deallocated explicitly, or does this happen implicitly (garbage collection)?


Memory management

Run-time memory for:
1 generated target code
2 data objects
3 control stack for procedure activations

typical layout:

Code
Static Data
Stack
↓
↑
Heap


Activation Record

. . . stores the information needed for a procedure call:
1 temporary values (evaluation of expressions, ...)
2 local data (local variables, ...)
3 saved machine state (program counter, registers, ...)
4 access link (link to non-local data)
5 control link (link to the activation record of the calling procedure)
6 actual parameters (usually passed in registers)
7 return value of the called procedure


Compile-Time Layout

Memory = blocks of bytes; byte = smallest addressable unit

usually: 1 byte = 8 bits; n bytes = 1 word

The memory needed for a variable (or parameter) depends on its type.
Example: basic types (int, real, boolean, ...) = n bytes

The storage layout depends on the addressing constraints of the target machine.
Example:

Alignment: integers may only reside at certain addresses (e.g. addresses that are divisible by 4)

Padding: 10 characters are needed to store a string, but 12 bytes must be allocated.

Arrays and records are stored in a memory range of sufficient size.
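
Alignment and padding can be computed with a simple rounding step. The sketch below assumes a 4-byte word and uses hypothetical helper names (align_up, layout); real compilers take these constraints from the target machine description.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def layout(fields, word=4):
    """Assign offsets to fields given as (name, size, alignment) tuples.

    Returns the offsets and the total size, padded to a multiple of `word`.
    """
    offset = 0
    offsets = {}
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)  # insert padding if needed
        offsets[name] = offset
        offset += size
    return offsets, align_up(offset, word)
```

With these assumptions, align_up(10, 4) yields 12, which is the padded size of the 10-character string mentioned above.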


Memory Allocation Strategies

1 static allocation (at compile time)
2 stack allocation
3 heap allocation


Static Allocation

Memory mapping is determined at compile time.
Local values remain stored even after procedure termination.
No run-time support package is necessary.
Limitations:

1 The size of the data structures must be known at compile time.
2 Recursive procedures are only possible with restrictions (all recursive calls share the same memory!).
3 Data structures cannot be created dynamically.


Stack Allocation

Idea: control stack
procedure is activated → activation record is pushed onto the stack
procedure activation terminates → activation record is popped from the stack

local values are deleted after termination of the activation


Example

Activation tree    Activation records on the stack    Remarks

s                  s:       a : array                 Frame for s

s                  s:       a : array                 r is activated
  r                r:       i : integer

s                  s:       a : array                 Frame for r has been popped
  r                q(1,9):  i : integer               and q(1,9) pushed
  q(1,9)


Example...

Activation tree    Activation records on the stack    Remarks

s                  s:       a : array                 Control has just returned
  r                q(1,9):  i : integer               to q(1,3)
  q(1,9)           q(1,3):  i : integer
    p(1,9)
    q(1,3)
      p(1,3)
      q(1,0)


Calling/Return Sequences

Calling sequence: allocates the activation record and fills in its fields
Return sequence: restores the machine state
Calling sequences and activation records can differ from one implementation to another

possible calling sequence:
1 the caller evaluates the actual parameters
2 the return address and the old stack top are stored in the activation record of the called procedure (callee)
3 the callee saves the register values and other status information
4 the callee initializes its local data and starts execution


Sequences...

possible return sequence:
1 the callee stores the return value
2 the stack top, register values, ... are restored
3 the caller copies the return value into its own activation record

task division:

[Figure: activation records of caller and callee on the stack. Each record holds parameters and return value, links and saved status (including the control link), and temporaries and local data. Brackets mark the caller's responsibility and the callee's responsibility.]


Data with variable length

the data is not stored directly in the activation record
a pointer to the data is stored instead


Dangling References

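A reference to memory is used although the memory has already been deallocated.

logic error in the program

cause of mysterious bugs

Example:

int *dangle(void);

int main(void) {
    int *p;
    p = dangle();    /* p refers to storage of a finished activation */
    return 0;
}

int *dangle(void) {
    int i = 23;
    return &i;       /* address of a local variable */
}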


Heap Allocation

Necessary if:
1 the value of a local variable needs to be retained after the activation ends
2 the called activation outlives the calling activation

Memory is allocated and deallocated on request
Memory management is necessary:

1 linked list for storing free blocks
2 free blocks should be filled in an optimal way
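
A free list can be sketched in a few lines. The version below is an illustration under stated assumptions: it keeps the free blocks in a sorted Python list instead of a real linked list, and it uses a first-fit strategy, which is simple but not necessarily the "optimal" filling mentioned above. The class name FreeList is hypothetical.

```python
class FreeList:
    """Free blocks as (start, length) pairs, kept sorted by start address."""

    def __init__(self, size):
        self.free = [(0, size)]

    def allocate(self, n):
        """First fit: take the first free block that is large enough."""
        for idx, (start, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    del self.free[idx]
                else:
                    self.free[idx] = (start + n, length - n)
                return start
        return None  # no block is large enough

    def release(self, start, n):
        """Return a block to the list and merge adjacent free blocks."""
        self.free.append((start, n))
        self.free.sort()
        merged = []
        for s, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((s, length))
        self.free = merged
```

Merging neighbours on release keeps fragmentation low; without it, a request can fail although enough adjacent free memory exists.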


Access to nonlocal names

lexical or static scope rules (the declaration that applies to a name is decided at compile time)
static scope with the most-closely-nested rule
dynamic scope rules (the declaration that applies to a name is decided at run time; activations are considered)


Blocks

most closely nested:
1 The scope of a declaration in block B includes B.
2 If the name x is not declared in B, an occurrence of x in B is in the scope of a declaration of x in a surrounding block B′ such that:
    1 x is declared in B′.
    2 B′ is the most closely nested block surrounding B that declares x.


Example – Scopes

main() {
    int a = 0;              /* B0 */
    int b = 0;
    {
        int b = 1;          /* B1 */
        {
            int a = 2;      /* B2 */
        }
        {
            int b = 3;      /* B3 */
        }
    }
}


Memory Handling

1 Memory is provided via a stack. When a block is entered, memory for its local names is allocated. This memory is deallocated when the block terminates.

2 Alternatively, the memory for all blocks of a procedure can be provided at once. The memory layout can be decided at compile time (except for data of variable size).


Global Data

memory space can be allocated statically

procedures can be passed as parameter (C: pointer)

Example:

program pass(input, output);
var m : integer;

function f(n: integer) : integer;
begin f := m + n end { f };

function g(n: integer) : integer;
begin g := m * n end { g };

procedure b(function h(n: integer) : integer);
begin write(h(2)) end { b };

begin
    m := 0;
    b(f); b(g); writeln
end.


Nested Procedures

procedure definitions inside procedures
Example:

program sort(input,output)
    . . .
    procedure exchange(i,j: integer);
    . . .
    procedure quicksort(m,n: integer);
        var k,v: integer;
    begin
        . . .
        exchange(i,j);
        . . .


Nesting Depth, Access Lists

Nesting Depth: depth of the nesting of procedures/blocks . . . (program: 1, procedure in program: 2, . . . )
Access List: implementation of the access to nested procedures

access link: field in the activation record of a procedure
If P is declared in Q, the access link of P points to the access link of Q (in the most recent activation of Q)


Example – Access Lists

[Figure: control stack snapshots with activation records for s (a, x), q(1,9) (k, v), q(1,3) (k, v), p(1,3) (i, j) and e(1,3), together with their access links]


Search for non-local names

Procedure P is at nesting depth nP and accesses a, which is declared at nesting depth na ≤ nP.
1 P is being executed. The activation record of P is located at the top of the stack. Follow nP − na access links.
2 This reaches the activation record that contains a. An offset within that record gives the actual position of a.

(nP − na, offset) defines the address of a. It can be computed at compile time.
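
The address pair (nP − na, offset) can be interpreted directly as "number of hops, then slot". The sketch below is a minimal illustration; the Frame class, the load function and the concrete slot values are assumptions made for this example.

```python
class Frame:
    """Hypothetical activation record with an access link and local slots."""

    def __init__(self, name, depth, access_link, slots):
        self.name = name
        self.depth = depth              # nesting depth of the procedure
        self.access_link = access_link  # frame of the enclosing procedure
        self.slots = slots              # local data, indexed by offset

def load(frame, hops, offset):
    """Follow `hops` access links starting at `frame`, then read slot `offset`."""
    for _ in range(hops):
        frame = frame.access_link
    return frame.slots[offset]

# s (depth 1) declares a, x; q (depth 2) declares k, v; p (depth 3) declares i, j.
s = Frame("s", 1, None, [10, 20])
q = Frame("q(1,9)", 2, s, [5, 7])
p = Frame("p(1,9)", 3, q, [1, 2])
```

Executing p and reading a (declared at depth 1, offset 0) corresponds to load(p, 3 - 1, 0); both numbers are constants that a compiler could emit.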


Procedure Access (nested)

P calls procedure X.
1 nP < nX: X is nested more deeply than P. Therefore, X must be declared in P (otherwise X cannot be accessed from P). The access link of X points to P.

2 nP ≥ nX: Follow nP − nX + 1 access links from P. The activation record reached belongs to the procedure that encloses both P and X; the access link of X points to it.


Procedure Parameter

passing a procedure as a parameter
not allowed in all languages
handling of access links (similar to the method already described):

assume c calls b and passes f as a parameter
c computes the access link for f, as if c were calling f itself
this link is used as the access link when f is actually called


Dynamic Scope

A new activation uses the existing bindings of non-local names to storage. A non-local name a in the called activation refers to the same storage as in the calling activation. New bindings are created for the local names of the called procedure.
The semantics of static scope and dynamic scope are different!


Example – Dynamic Scope (1)

program dynamic(input, output);
var r: real;

procedure show;
begin write(r : 5:3) end;

procedure small;
var r : real;
begin r := 0.125; show end;

begin
    r := 0.25;
    show; small; writeln;
    show; small; writeln
end.



[Figure: procedures dynamic, small and show]



Example:
Static scope: output = 0.250 0.250 nl 0.250 0.250 nl
Dynamic scope: output = 0.250 0.125 nl 0.250 0.125 nl

Deep access: control links are used as access links. The stack is searched (from top to bottom) for the first entry of a non-local name.
Shallow access: the current value of a name is kept in a (statically) allocated location. When P is activated, a local name n of P takes over that location. The old value of the location is saved in the activation record of P and can therefore be restored.
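
The two outputs of program dynamic can be reproduced with a small simulation. This is a sketch under assumptions: the control stack is modelled as a list of dictionaries, dynamic scope is implemented by deep access, and the function name run is hypothetical.

```python
def run(dynamic_scope):
    """Simulate program `dynamic` and return its output as one string."""
    out = []
    # Bottom frame: the main program with its global variable r.
    stack = [{"proc": "dynamic", "locals": {"r": 0.25}}]

    def lookup_r():
        if dynamic_scope:
            # Deep access: search the control stack from top to bottom.
            for frame in reversed(stack):
                if "r" in frame["locals"]:
                    return frame["locals"]["r"]
        # Static scope: show is declared in the main program, so r is global.
        return stack[0]["locals"]["r"]

    def show():
        stack.append({"proc": "show", "locals": {}})
        out.append(f"{lookup_r():.3f}")
        stack.pop()

    def small():
        stack.append({"proc": "small", "locals": {"r": 0.125}})
        show()
        stack.pop()

    for _ in range(2):
        show()
        small()
        out.append("nl")
    return " ".join(out)
```

Under dynamic scope, the call of show from small finds the r of small first, because that frame lies closer to the top of the stack.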


Parameter Passing

How are parameters passed at a call?

procedure exchange(i,j: integer);
var x: integer;
begin
    x := a[i]; a[i] := a[j]; a[j] := x
end

Call-by-Value
Call-by-Reference
Copy-Restore
Call-by-Name


Call-by-Value

Formal parameters are treated as local names.
The caller evaluates the actual parameters and passes their values to the associated formal parameters.
Pointers can also be passed as values.


Call-by-Reference

Instead of the value (as in Call-by-Value), a pointer to the memory location of the actual parameter is passed.
var parameters in Pascal are references.
Arrays are usually passed by reference.


Copy-Restore

Hybrid between Call-by-Value and Call-by-Reference.
Method:

1 Before the procedure is executed, the actual parameters are evaluated. The R-values (values) are passed to the respective formal parameters (as in Call-by-Value). Additionally, the L-values (locations) of the actual parameters are computed.

2 After procedure termination, the current R-values of the formal parameters are copied back to the L-values of the actual parameters (if available).


Call-by-Name

Procedures are treated as macros, i.e. the procedure call is replaced by the body of the procedure; all formal parameters in the body are replaced by the actual parameters (macro expansion).
The local names of the called procedure must be kept distinct from the names of the calling procedure (this may require renaming variables).
The actual parameters are put in parentheses to avoid problems.
Problem: the body of swap(x,y) is
    temp := x; x := y; y := temp
and the call swap(i,a[i]) is expanded to:
    temp := i; i := a[i]; a[i] := temp
The last assignment already uses the new value of i: with i0 denoting the original value of i, it performs a[a[i0]] := i0 instead of a[i0] := i0.
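
The swap(i,a[i]) problem can be made visible by modelling parameters explicitly. In the sketch below, a location is a (container, key) pair and a call-by-name parameter is a thunk that recomputes the location on every use. All names (load, store, swap_by_reference, swap_by_name, demo) are assumptions for illustration.

```python
def load(loc):
    container, key = loc
    return container[key]

def store(loc, value):
    container, key = loc
    container[key] = value

def swap_by_reference(x, y):
    # x and y are locations, evaluated once by the caller.
    temp = load(x)
    store(x, load(y))
    store(y, temp)

def swap_by_name(x, y):
    # x and y are thunks: every use re-evaluates the actual parameter.
    temp = load(x())
    store(x(), load(y()))
    store(y(), temp)

def demo(by_name):
    """Run swap(i, a[i]) with i = 1 and a = [0, 2, 5]."""
    env = {"i": 1, "a": [0, 2, 5]}
    if by_name:
        swap_by_name(lambda: (env, "i"), lambda: (env["a"], env["i"]))
    else:
        swap_by_reference((env, "i"), (env["a"], env["i"]))
    return env["i"], env["a"]
```

With call-by-reference the location of a[1] is fixed before the call, so i and a[1] are exchanged. With call-by-name the final store goes to a[2], because i has already been changed to 2.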


Symbol Table

entries correspond to the declarations of names
store binding and scope information
storage allocation information (that is needed at run time)
storage of the symbol table in a list
storage of the symbol table in a hash table


PART 7 - INTERMEDIATE CODE GENERATION


Objectives/Tasks

provide a target-machine-independent format
advantages:

easy adaptation to different target machines
machine-independent code optimization can be realized

attributed grammars can be used


Languages

graphical representation (syntax tree)
Three-Address Code
x := y op z

x, y, z are arbitrary numbers, constants, names (variables), or temporary variables
op is an operator


Three-Address Code Statements

1 assignments x := y op z

2 assignments with unary operator x := op y

3 copy statements x := y

4 unconditional jumps goto L

5 conditional jumps if x relop y goto L

6 procedure calls: param x and call p, n, where call p, n calls procedure p with n parameters; return y, where y is optional.

7 indexed assignments: x := y[i] and x[i] := y.
8 address and pointer assignments: x := &y, x := *y and *x := y


Generating the Three-Address Code (simplified)

S-attributed grammar
S.code represents the three-address code
E.place is the name that holds the value of the non-terminal E.
E.code is the sequence of three-address code statements that evaluate E.


Assignments

Production       Semantic rules
S → id := E      S.code := E.code || gen(id.place ':=' E.place)
E → E1 + E2      E.place := newtemp;
                 E.code := E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
E → E1 * E2      E.place := newtemp;
                 E.code := E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)
E → - E1         E.place := newtemp;
                 E.code := E1.code || gen(E.place ':=' 'uminus' E1.place)
E → ( E1 )       E.place := E1.place; E.code := E1.code
E → id           E.place := id.place; E.code := ''
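
The rules for place and code can be tried out on a small expression tree. The following is a sketch, not the lecture's implementation: expressions are assumed to be nested tuples such as ("+", left, right) or ("uminus", operand), identifiers are plain strings, and make_generator is a hypothetical helper that provides a fresh newtemp counter.

```python
import itertools

def make_generator():
    """Return a function that translates `target := expr` to three-address code."""
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def gen_expr(node):
        """Return (place, code) for an expression tree."""
        if isinstance(node, str):                 # E -> id
            return node, []
        if node[0] == "uminus":                   # E -> - E1
            p1, c1 = gen_expr(node[1])
            place = newtemp()
            return place, c1 + [f"{place} := uminus {p1}"]
        op, left, right = node                    # E -> E1 op E2
        p1, c1 = gen_expr(left)
        p2, c2 = gen_expr(right)
        place = newtemp()
        return place, c1 + c2 + [f"{place} := {p1} {op} {p2}"]

    def gen_assign(target, expr):                 # S -> id := E
        place, code = gen_expr(expr)
        return code + [f"{target} := {place}"]

    return gen_assign
```

Parenthesized expressions need no case of their own here, because the tuple structure already encodes the grouping.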


If-then-else

Statement: S → if E then S1 else S2
Generated code:
S.else := newlabel;
S.after := newlabel;

S.code :=
    E.code ||
    gen('if' E.place '=' '0' 'goto' S.else) ||
    S1.code ||
    gen('goto' S.after) ||
    gen(S.else ':') ||
    S2.code ||
    gen(S.after ':')


While-loops

Statement: S → while E do S1

Generated code:
S.begin := newlabel;
S.after := newlabel;

S.code :=
    gen(S.begin ':') ||
    E.code ||
    gen('if' E.place '=' '0' 'goto' S.after) ||
    S1.code ||
    gen('goto' S.begin) ||
    gen(S.after ':')
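
Both statement schemes amount to concatenating code lists around fresh labels. A minimal sketch, assuming that the code of E and of the sub-statements is already available as lists of strings; the function names gen_while and gen_if_else and the label format L1, L2, ... are illustrative assumptions.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    """Return a fresh label name."""
    return f"L{next(_labels)}"

def gen_while(cond_code, cond_place, body_code):
    """S -> while E do S1"""
    begin, after = newlabel(), newlabel()
    return ([f"{begin}:"] + cond_code +
            [f"if {cond_place} = 0 goto {after}"] +
            body_code +
            [f"goto {begin}", f"{after}:"])

def gen_if_else(cond_code, cond_place, then_code, else_code):
    """S -> if E then S1 else S2"""
    else_label, after = newlabel(), newlabel()
    return (cond_code +
            [f"if {cond_place} = 0 goto {else_label}"] +
            then_code +
            [f"goto {after}", f"{else_label}:"] +
            else_code +
            [f"{after}:"])
```

For example, gen_while([], "x", ["x := x - 1"]) produces a loop that counts x down to 0.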


Implementation of the Three-Address Code

Quadruple = record with 4 fields:

op        operator
arg1      first argument
arg2      second argument
result    temporary variable

arguments and temporary variables are usually pointers to symbol table entries
Triple = “quadruple” without the result field; instead, an argument refers to the position of the triple that computes the value.


Examples

a := b * -c + b * -c

Quadruples
       op        arg1    arg2    result
(0)    uminus    c               t1
(1)    *         b       t1      t2
(2)    uminus    c               t3
(3)    *         b       t3      t4
(4)    +         t2      t4      t5
(5)    :=        t5              a

Triples
       op        arg1    arg2
(0)    uminus    c
(1)    *         b       (0)
(2)    uminus    c
(3)    *         b       (2)
(4)    +         (1)     (3)
(5)    :=        a       (4)


Declarations

provision of memory space for the local names of a procedure (relative addresses within the activation record or within the static data area)
procedure declarations:

1 offset . . . next free relative address
2 initialization: offset = 0
3 offset is used as the address of the current data object
4 then, offset is increased by the size of the current data object

enter(name, type, offset) creates an entry in the symbol table for name, assigns it the type type and offset as its relative address.


Attributed Grammar for Declarations

P → {offset := 0} D
D → D ; D
D → id : T                 {enter(id.name, T.type, offset);
                            offset := offset + T.width}
T → integer                {T.type := integer; T.width := 4}
T → real                   {T.type := real; T.width := 8}
T → array [ num ] of T1    {T.type := array(num.val, T1.type);
                            T.width := num.val × T1.width}
T → ↑ T1                   {T.type := pointer(T1.type); T.width := 4}
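
The offset bookkeeping of these rules can be checked with a few declarations. The sketch assumes types are written as "integer", "real", ("array", n, element_type) or ("pointer", target_type); the function names width and declare are hypothetical, and the symbol table is a plain dictionary.

```python
def width(t):
    """Width in bytes of a type, using the widths from the rules above."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":            # ("array", num, element_type)
        return t[1] * width(t[2])
    if t[0] == "pointer":          # ("pointer", target_type)
        return 4
    raise ValueError(f"unknown type {t!r}")

def declare(decls):
    """Process (name, type) declarations in order.

    Returns the symbol table {name: (type, offset)} and the final offset.
    """
    table, offset = {}, 0
    for name, t in decls:
        table[name] = (t, offset)  # corresponds to enter(name, type, offset)
        offset += width(t)
    return table, offset
```

The final offset is the total amount of memory needed for the declared names.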


Scope Information

each procedure has its own symbol table
for every procedure declaration a symbol table is created, plus a link to the symbol table of the enclosing procedure
offset is now local!
example grammar:
P → D
D → D ; D | id : T | proc id ; D ; S

grammar definition . . . (Exercise)


Records, Field Names

create a symbol table for the fields (marker L).
field names are stored in the new symbol table
grammar:
T → record L D end    {T.type := record(top(tblptr));
                       T.width := top(offset);
                       pop(tblptr); pop(offset)}

L → ε                 {t := maketable(nil);
                       push(t, tblptr); push(0, offset)}


Assignments

assumption so far: names stand for themselves
this is correct if a name is taken as a pointer to its symbol table entry
generalization: the attribute id.name holds the name of the identifier
lookup(id.name) returns the symbol table entry
advantage: usable even if the name is declared in an enclosing procedure
The definition of lookup determines the scope rules of the language for identifiers.


Grammar

S → id := E      p := lookup(id.name);
                 if p ≠ nil then emit(p ':=' E.place)
                 else error

E → E1 + E2      E.place := newtemp;
                 emit(E.place ':=' E1.place '+' E2.place)

E → E1 * E2      E.place := newtemp;
                 emit(E.place ':=' E1.place '*' E2.place)

E → - E1         E.place := newtemp;
                 emit(E.place ':=' 'uminus' E1.place)

E → ( E1 )       E.place := E1.place

E → id           p := lookup(id.name);
                 if p ≠ nil then E.place := p
                 else error


Addressing Array Elements

array access is fast if the elements are stored in one block.

access to the element at position i (w . . . element size):

base + (i − low) × w

can be rewritten as:

i × w + (base − low × w)

advantage: c = base − low × w can be calculated at compile time!

two-dimensional arrays (A(i1, i2)):

row major (row-by-row): base + ((i1 − low1) × n2 + i2 − low2) × w, where n2 = high2 − low2 + 1.
column major (column-by-column)
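
The two address formulas translate directly into code. A minimal sketch with hypothetical function names; the split into a compile-time constant c and a run-time part follows the rewriting shown above.

```python
def address_1d(base, low, w, i):
    """Address of element i of a one-dimensional array."""
    c = base - low * w          # can be computed at compile time
    return i * w + c            # only this part remains for run time

def address_row_major(base, low1, low2, high2, w, i1, i2):
    """Address of A(i1, i2) for a row-major two-dimensional array."""
    n2 = high2 - low2 + 1       # number of elements per row
    return base + ((i1 - low1) * n2 + (i2 - low2)) * w
```

For an array starting at address 1000 with low = 1 and w = 4, element 3 is at 3 × 4 + (1000 − 4) = 1008, the same value as 1000 + (3 − 1) × 4.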


Translation Schema for Array Access (1)

1 S → L := E
      if L.offset = null then
          emit(L.place ':=' E.place)
      else
          emit(L.place '[' L.offset ']' ':=' E.place)

2 E → E1 + E2
      E.place := newtemp;
      emit(E.place ':=' E1.place '+' E2.place)

3 E → ( E1 )
      E.place := E1.place


Translation Schema (2)

1 E → L
      if L.offset = null then
          E.place := L.place
      else
          E.place := newtemp;
          emit(E.place ':=' L.place '[' L.offset ']')

2 L → Elist ]
      L.place := newtemp;
      L.offset := newtemp;
      emit(L.place ':=' c(Elist.array));
      emit(L.offset ':=' Elist.place '*' width(Elist.array))

3 L → id
      L.place := id.place;
      L.offset := null


Translation Schema (3)

1 Elist → Elist1 , E
      t := newtemp;
      m := Elist1.ndim + 1;
      emit(t ':=' Elist1.place '*' limit(Elist1.array, m));
      emit(t ':=' t '+' E.place);
      Elist.array := Elist1.array;
      Elist.place := t;
      Elist.ndim := m

2 Elist → id [ E
      Elist.array := id.place;
      Elist.place := E.place;
      Elist.ndim := 1


Boolean Expressions

2 main tasks:
1 calculating logical values
2 altering the flow of control

grammar:

E → E or E | E and E | not E | ( E ) | id relop id | true | false

2 methods to represent boolean values:
1 true and false are encoded as numbers (e.g. true = 1, false = 0).
2 Flow of control: values are represented by positions in the code


Numeric Representation

Example 1: a or (b and (not c))

t1 := not c
t2 := b and t1
t3 := a or t2

Example 2: a < b

100: if a < b goto 103
101: t := 0
102: goto 104
103: t := 1
104:


Translation Schema (Bool. Expr.) I

E → E1 or E2         E.place := newtemp; emit(E.place ':=' E1.place 'or' E2.place)
E → E1 and E2        E.place := newtemp; emit(E.place ':=' E1.place 'and' E2.place)
E → not E1           E.place := newtemp; emit(E.place ':=' 'not' E1.place)
E → id1 relop id2    E.place := newtemp;
                     emit('if' id1.place relop.op id2.place 'goto' nextstat + 3);
                     emit(E.place ':=' '0');
                     emit('goto' nextstat + 2);
                     emit(E.place ':=' '1')
E → true             E.place := newtemp; emit(E.place ':=' '1')
E → false            E.place := newtemp; emit(E.place ':=' '0')


Short-Circuit Code

Representation of boolean expressions without generating code for the operators and, or, not

values are represented by positions in the code

Jumping Code

Example: a < b or c < d and e < f

100:  if a < b goto 103
101*: t1 := 0
102:  goto 104
103*: t1 := 1
104:  if c < d goto 107
105*: t2 := 0
106:  goto 108
107*: t2 := 1
108:  if e < f goto 111
109*: t3 := 0
110:  goto 112
111*: t3 := 1
112*: t4 := t2 and t3
113*: t5 := t1 or t4


Flow-of-Control Statements

Statements:

S → if E then S1
  | if E then S1 else S2
  | while E do S1

use labels to represent true and false
branch depending on the evaluation of E.
attributed grammar (see above)


Control-Flow Translation

E.true (E.false) is the label to which control flows if E evaluates to true (false).
Example: E1 or E2 is true if E1 is true.
not all subexpressions are evaluated (as e.g. in C)
Example: a < b or (c < d and e < f)

    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse


Translation Schema (Bool. Expr.) II

E → E1 or E2         E1.true := E.true;
                     E1.false := newlabel;
                     E2.true := E.true;
                     E2.false := E.false;
                     E.code := E1.code || gen(E1.false ':') || E2.code

E → E1 and E2        E1.true := newlabel;
                     E1.false := E.false;
                     E2.true := E.true;
                     E2.false := E.false;
                     E.code := E1.code || gen(E1.true ':') || E2.code

E → not E1           E1.true := E.false;
                     E1.false := E.true;
                     E.code := E1.code

E → id1 relop id2    E.code := gen('if' id1.place relop.op id2.place 'goto' E.true) ||
                               gen('goto' E.false)

E → true             E.code := gen('goto' E.true)
E → false            E.code := gen('goto' E.false)
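
The inherited attributes E.true and E.false become ordinary function parameters in a recursive translator. The following is a sketch under assumptions: expressions are nested tuples such as ("or", e1, e2) or ("relop", "<", "a", "b"), labels are emitted on their own line, and the function name translate is hypothetical.

```python
import itertools

def translate(expr, true_label, false_label, labels=None):
    """Return jumping code for expr as a list of statements."""
    if labels is None:
        labels = (f"L{n}" for n in itertools.count(1))  # newlabel source
    kind = expr[0]
    if kind == "relop":                          # E -> id1 relop id2
        _, op, left, right = expr
        return [f"if {left} {op} {right} goto {true_label}",
                f"goto {false_label}"]
    if kind == "or":                             # E -> E1 or E2
        e1_false = next(labels)
        return (translate(expr[1], true_label, e1_false, labels)
                + [f"{e1_false}:"]
                + translate(expr[2], true_label, false_label, labels))
    if kind == "and":                            # E -> E1 and E2
        e1_true = next(labels)
        return (translate(expr[1], e1_true, false_label, labels)
                + [f"{e1_true}:"]
                + translate(expr[2], true_label, false_label, labels))
    if kind == "not":                            # E -> not E1
        return translate(expr[1], false_label, true_label, labels)
    if kind == "true":                           # E -> true
        return [f"goto {true_label}"]
    if kind == "false":                          # E -> false
        return [f"goto {false_label}"]
    raise ValueError(f"unknown expression kind {kind!r}")
```

Note that not generates no code at all: it only swaps the two target labels.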


Mixed Mode

the treatment so far has been simplified
in practice, mixed expressions are possible
Example 1: (a + b) < c

Example 2: (a < b) + (b < a)

introduce synthesized attribute E.type

E.type = arith    for an arithmetic expression
E.type = bool     for a Boolean expression

Code generation for E + E, E ∗ E, . . . needs to be changed.


Back patching

easiest implementation of attributed grammars:
1 generate a syntax tree
2 generate the translation in a depth-first traversal

problem with a single pass:
labels for control flow are not yet known when the jumps are generated

solution (backpatching):
1 jump statements are generated with empty target labels
2 these statements are kept in a list
3 the target labels are filled in once they are known
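
The three steps can be sketched with a list of statements whose jump targets are filled in later. This is an illustration only: the Emitter class and translate_if are hypothetical names, statement indices serve as jump targets, and an unfilled target is written as a trailing underscore.

```python
class Emitter:
    """Collects three-address statements; jump targets may be patched later."""

    def __init__(self):
        self.code = []

    def emit(self, text):
        """Append a statement and return its index."""
        self.code.append(text)
        return len(self.code) - 1

    def backpatch(self, jump_list, target):
        """Fill in `target` for every incomplete jump listed in jump_list."""
        for index in jump_list:
            # The placeholder "_" is always the last character of the statement.
            self.code[index] = self.code[index][:-1] + str(target)

def translate_if(emitter, cond, then_stmts):
    """Single-pass code for `if cond then ...`, where cond is e.g. "a < b"."""
    true_list = [emitter.emit(f"if {cond} goto _")]    # target not yet known
    false_list = [emitter.emit("goto _")]              # target not yet known
    emitter.backpatch(true_list, len(emitter.code))    # then-part starts here
    for stmt in then_stmts:
        emitter.emit(stmt)
    emitter.backpatch(false_list, len(emitter.code))   # first statement after the if
    return emitter.code
```

The lists hold only one jump each in this small case; for nested boolean expressions they would be merged before patching.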


Procedure Calls

use of run-time routines for handling the parameters, the call itself and the return of values.
grammar:

S → call id ( Elist )
Elist → Elist , E | E

The generated code must implement the calling sequence.


Attributed Grammar (simplified)

Call-by-Reference
Memory is statically allocated
grammar:

1 S → call id ( Elist )
      for each item p on queue do
          emit('param' p);
      emit('call' id.place)

2 Elist → Elist , E
      append E.place to the end of queue

3 Elist → E
      initialize queue to contain only E.place

