Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

43
CIS 341: COMPILERS Lecture 15

Transcript of Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Page 1: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

CIS 341: COMPILERSLecture 15

Page 2: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Announcements

• COVID19 Logistics– Changed due dates for remaining projects– Final exam still TBD: I will follow the university guidelines

• my preference would be to allow a “take home” complete in your own time “homework-style” exam.

– The university allows you to choose to take the class Pass/Fail• See Piazza post for ongoing discussion

• Zoom recordings uploaded in the afternoon after the lecture.

• HW4: OAT v. 1.0– Parsing & basic code generation– Due: Monday, March 30th

• HW5: now (tentatively) due: Friday, April 17th

• HW6: now (tentatively) due: Wednesday, April 29th

Zdancewic CIS 341: Compilers 2

Page 3: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Midterm Results

Zdancewic CIS 341: Compilers 3

(Out of 80 points)Median: 60Mean: 58.2Std.Dev: 11.3

Midterm ExamGrades Available on GradescopePlease ask for regrade requests no later than Tuesday, March 31st

Page 4: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

FIRST-CLASS FUNCTIONS

Zdancewic CIS 341: Compilers 4

Untyped lambda calculusSubstitutionEvaluation

Page 5: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

“Functional” languages• Languages like ML, Haskell, Scheme, Python, C#, Java 8, Swift• Functions can be passed as arguments (e.g. map or fold)• Functions can be returned as values (e.g. compose)• Functions nest: inner function can refer to variables bound in the outer

function

let add = fun x -> fun y -> x + ylet inc = add 1let dec = add -1

let compose = fun f -> fun g -> fun x -> f (g x)let id = compose inc dec

• How do we implement such functions?– in an interpreter? in a compiled language?

CIS 341: Compilers 5

Page 6: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

(Untyped) Lambda Calculus• The lambda calculus is a minimal programming language.

– Note: we’re writing (fun x -> e) lambda-calculus notation: l x. e

• It has variables, functions, and function application.– That’s it! – It’s Turing Complete.– It’s the foundation for a lot of research in programming languages.– Basis for “functional” languages like Scheme, ML, Haskell, etc.

Abstract syntax in OCaml:

Concrete syntax:

CIS 341: Compilers 6

type exp = | Var of var (* variables *)| Fun of var * exp (* functions: fun x → e *)| App of exp * exp (* function application *)

exp ::= | x variables| fun x → exp functions| exp1 exp2 function application| ( exp ) parentheses

Page 7: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Values and Substitution• The only values of the lambda calculus are (closed) functions:

• To substitute a (closed) value v for some variable x in an expression e– Replace all free occurrences of x in e by v.– In OCaml: written subst v x e– In Math: written e{v/x}

• Function application is interpreted by substitution:(fun x → fun y → x + y) 1

= subst 1 x (fun y → x + y)= (fun y → 1 + y)

CIS 341: Compilers 7

val ::= | fun x → exp functions are values

Note: for the sake of examples we mayadd integers andarithmetic operations tothe “pure” untyped lambda calculus.

Page 8: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Lambda Calculus Operational Semantics• Substitution function (in Math):

x{v/x} = v (replace the free x by v)y{v/x} = y (assuming y ≠ x)

(fun x → exp){v/x} = (fun x → exp) (x is bound in exp)(fun y → exp){v/x} = (fun y → exp{v/x}) (assuming y ≠ x)

(e1 e2){v/x} = (e1{v/x} e2{v/x}) (substitute everywhere)

• Examples:(x y) {(fun z → z z)/y} = x (fun z → z z)

(fun x → x y){(fun z → z z)/y} = fun x → x (fun z → z z)

(fun x → x){(fun z → z z)/x} = fun x → x // x is not free!

Zdancewic CIS 341: Compilers 8

Page 9: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Free Variables and Scopinglet add = fun x → fun y → x + ylet inc = add 1

• The result of add 1 is a function• After calling add, we can’t throw away its argument (or its local

variables) because those are needed in the function returned by add.• We say that the variable x is free in fun y → x + y

– Free variables are defined in an outer scope

• We say that the variable y is bound by “fun y” and its scope is the body “x + y” in the expression fun y → x + y

• A term with no free variables is called closed.• A term with one or more free variables is called open.

CIS 341: Compilers 9

Page 10: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Free Variable Calculation• An OCaml function to calculate the set of free variables in a lambda

expression:

• A lambda expression e is closed if free_vars e returns VarSet.empty

• In mathematical notation:

fv(x) = {x}fv(fun x -> exp) = fv(exp) \ {x} (‘x’ is a bound in exp)fv(exp1 exp2) = fv(exp1) ∪ fv(exp2)

Zdancewic CIS 341: Compilers 10

let rec free_vars (e:exp) : VarSet.t =begin match e with

| Var x -> VarSet.singleton x| Fun(x, body) -> VarSet.remove x (free_vars body)| App(e1, e2) -> VarSet.union (free_vars e1) (free_vars e2)

end

Page 11: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Variable Capture• Note that if we try to naively "substitute" an open term, a bound

variable might capture the free variables:

(fun x → (x y)){(fun z → x)/y}= fun x → (x (fun z -> x))

• Usually not the desired behavior– This property is sometimes called "dynamic scoping"

The meaning of "x" is determined by where it is bound dynamically,not where it is bound statically.

– Some languages (e.g. emacs lisp) are implemented with this as a "feature"– But: it leads to hard-to-debug scoping issues

Zdancewic CIS 341: Compilers 11

Note: x is freein (fun z → x)

free x is“captured”!!

Page 12: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Alpha Equivalence• Note that the names of bound variables don't matter to the semantics

– i.e. it doesn't matter which variable names you use, as long as you use them consistently:

(fun x → y x) is the "same" as (fun z → y z)the choice of "x" or "z" is arbitrary, so long as we consistently

rename them

• The names of free variables do matter:(fun x → y x) is not the "same" as (fun x → z x)

Intuitively: y an z can refer to different things from some outer scope

Zdancewic CIS 341: Compilers 12

Two terms that differ only by consistent renaming of bound variables are called alpha equivalent

Students who cheat by “renaming variables” are trying to exploit alpha equivalence…

Page 13: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Fixing Substitution• Consider the substitution operation:

{e2/x} e1

• To avoid capture, we define substitution to pick an alpha equivalent version of e1 such that the bound names of e1 don't mention the free names of e2.– Then do the "naïve" substitution.

For example: (fun x → (x y)){(fun z → x)/y} = (fun x’ → (x' (fun z → x))

Zdancewic CIS 341: Compilers 13

rename x to x'

Page 14: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Operational Semantics• Specified using just two inference rules with judgments of the form

exp ⇓ val– Read this notation a as “program exp evaluates to value val”– This is call-by-value semantics: function arguments are evaluated before

substitution

Zdancewic CIS 341: Compilers 14

v ⇓ v

exp1 ⇓ (fun x → exp3) exp2 ⇓ v exp3{v/x} ⇓ w

exp1 exp2 ⇓ w

“Values evaluate to themselves”

“To evaluate function application: Evaluate the function to a value, evaluate theargument to a value, and then substitute the argument for the function. ”

Page 15: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

IMPLEMENTING THE INTERPRETER

Zdancewic CIS 341: Compilers 15

See fun.ml

Page 16: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Adding Integers to Lambda Calculus

Zdancewic CIS 341: Compilers 16

exp1 ⇓ n1 exp2 ⇓ n2

exp1 + exp2 ⇓ (n1 ⟦+⟧ n2)

exp ::= | …| n constant integers| exp1 + exp2 binary arithmetic operation

val ::= | fun x → exp functions are values| n integers are values

n{v/x} = n constants have no free vars.(e1 + e2){v/x} = (e1{v/x} + e2{v/x}) substitute everywhere

Object-level ‘+’ Meta-level ‘+’

Page 17: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

STATIC ANALYSIS

Zdancewic CIS 341: Compilers 17

Scope, Types, and Context

Page 18: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Variable Scoping• Consider the problem of determining whether a programmer-declared

variable is in scope.

• Issues:– Which variables are available at a given point in the program?– Shadowing – is it permissible to re-use the same identifier, or is it an error?

• Example: The following program is syntactically correct but not well-formed. (y and q are used without being defined anywhere)

Zdancewic CIS 341: Compilers 18

int fact(int x) {var acc = 1;while (x > 0) {

acc = acc * y;x = q - 1;

}return acc;

}

Q: Can we solve this problem by changing the parser to ruleout such programs?

Page 19: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Contexts and Inference Rules• Need to keep track of contextual information.

– What variables are in scope?– What are their types?

• How do we describe this?– In the compiler there's a mapping from variables to information we know

about them.

Zdancewic CIS 341: Compilers 19

Page 20: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Why Inference Rules?• They are a compact, precise way of specifying language properties.

– E.g. ~20 pages for full Java vs. 100’s of pages of prose Java Language Spec.

• Inference rules correspond closely to the recursive AST traversal that implements them

• Type checking (and type inference) is nothing more than attempting to prove a different judgment ( G;L ⊢ e : t ) by searching backwards through the rules.

• Compiling in a context is nothing more than a collection of inference rules specifying yet a different judgment ( G ⊢ src ⇒ target )– Moreover, the compilation judgment is similar to the typechecking judgment

• Strong mathematical foundations– The “Curry-Howard correspondence”: Programming Language ~ Logic,

Program ~ Proof, Type ~ Proposition– See CIS 500 next Fall if you’re interested in type systems!

CIS 341: Compilers 20

Page 21: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Inference Rules• We can read a judgment G;L ⊢ e : t as

“the expression e is well typed and has type t”• For any environment G, expression e, and statements s1, s2.

G;L;rt ⊢ if (e) s1 else s2

holds if G ;L⊢ e : bool and G;L;rt ⊢ s1 and G;L;rt ⊢ s2all hold.

• More succinctly: we summarize these constraints as an inference rule:

• This rule can be used for any substitution of the syntactic metavariables G, e, s1 and s2.

CIS 341: Compilers 21

G;L ⊢ e : bool G;L;rt ⊢ s1 G;L;rt ⊢ s2

G;L;rt ⊢ if (e) s1 else s2

Premises

Conclusion

Page 22: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Checking Derivations• A derivation or proof tree has (instances of) judgments as its nodes and

edges that connect premises to a conclusion according to an inference rule.

• Leaves of the tree are axioms (i.e. rules with no premises)– Example: the INT rule is an axiom

• Goal of the type checker: verify that such a tree exists.• Example1: Find a tree for the following program using the inference

rules in oat0-defn.pdf:

Example2: There is no tree for this ill-scoped program:

CIS 341: Compilers 22

var x1 = 0;var x2 = x1 + x1;x1 = x1 – x2;return(x1);

var x2 = x1 + x1;return(x2);

Page 23: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Example Derivation

Zdancewic CIS 341: Compilers 23

var x1 = 0;var x2 = x1 + x1;x1 = x1 – x2;return(x1);

Page 24: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Example Derivation

Zdancewic CIS 341: Compilers 24

Page 25: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Example Derivation

Zdancewic CIS 341: Compilers 25

Oat v.0 Example Derivation

Steve Zdancewic

March 3, 2015

D1 D2 D3 D4

G0; · ;int ` int x1 = 0; int x2 = x1 + x1; x1 = x1 - x2; return x1; ) ·, x1 :int, x2 :int[stmts]

` int x1 = 0; int x2 = x1 + x1; x1 = x1 - x2; return x1;[prog]

D1 =

G0;· ` 0 : int[int]

G0;· ` 0 : int[const]

G0;· ` int x1 = 0 ) ·, x1 :int[decl]

G0; · ;int ` int x1 = 0; ) ·, x1 :int[sdecl]

D2 =

` + : (int, int) ! int[add]

x1 :int 2 ·, x1 :intG0;·, x1 :int ` x1 : int

[var]x1 :int 2 ·, x1 :int

G0;·, x1 :int ` x1 : int[var]

G0;·, x1 :int ` x1 + x1 : int[bop]

G0;·, x1 :int;int ` int x2 = x1 + x1; ) ·, x1 :int, x2 :int[decl]

G0;·, x1 :int;int ` int x2 = x1 + x1; ) ·, x1 :int, x2 :int[sdecl]

D3 =

x1 :int 2 ·, x1 :int, x2 :int` - : (int, int) ! int

[add]x1 :int 2 ·, x1 :int, x2 :int

G0;·, x1 :int, x2 :int ` x1 : int[var]

x2 :int 2 ·, x1 :int, x2 :intG0;·, x1 :int, x2 :int ` x2 : int

[var]

G0;·, x1 :int, x2 :int ` x1 - x2 : int[bop]

G0;·, x1 :int, x2 :int;int ` x1 = x1 - x2; ) ·, x1 :int, x2 :int[assn]

D4 =

x1 :int 2 ·, x1 :int, x2 :intG0;·, x1 :int, x2 :int ` x1 : int

[var]

G0;·, x1 :int, x2 :int;int ` return x1; ) ·, x1 :int, x2 :int[Ret]

1

Page 26: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Why Inference Rules?• They are a compact, precise way of specifying language properties.

– E.g. ~20 pages for full Java vs. 100’s of pages of prose Java Language Spec.

• Inference rules correspond closely to the recursive AST traversal that implements them

• Compiling in a context is nothing more an “interpretation” of the inference rules that specify typechecking*: ⟦C ⊢ e : t⟧– Compilation follows the typechecking judgment

• Strong mathematical foundations– The “Curry-Howard correspondence”: Programming Language ~ Logic,

Program ~ Proof, Type ~ Proposition– See CIS 500 next Fall if you’re interested in type systems!

CIS 341: Compilers 26*Here (and later) we’ll write context C for G;L, the combination of theglobal and local contexts.

Page 27: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Compilation As Translating Judgments• Consider the source typing judgment for source expressions:

C ⊢ e : t

• How do we interpret this information in the target language?⟦C ⊢ e : t⟧ = ?

• ⟦t⟧ is a target type• ⟦e⟧ translates to a (potentially empty) sequence of instructions, that,

when run, computes the result into some operand

• INVARIANT: if ⟦C ⊢ e : t ⟧ = ty, operand , stream then the type (at the target level) of the operand is ty=⟦t⟧

Zdancewic CIS 341: Compilers 27

Page 28: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Example• C ⊢ 341 + 5 : int what is ⟦ C ⊢ 341 + 5 : int⟧ ?

⟦ ⊢ 341 : int ⟧ = (i64, Const 341, []) ⟦⊢ 5 : int⟧ = (i64, Const 5, [])

---------------------------------------- ---------------------------------------⟦C ⊢ 341 : int⟧ = (i64, Const 341, []) ⟦C ⊢ 5 : int⟧ = (i64, Const 5, [])

------------------------------------------------------------------------------------------⟦C ⊢ 341 + 5 : int⟧ = (i64, %tmp, [%tmp = add i64 (Const 341) (Const 5)])

Zdancewic CIS 341: Compilers 28

Page 29: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

What about the Context?• What is ⟦C⟧?• Source level C has bindings like: x:int, y:bool

– We think of it as a finite map from identifiers to types

• What is the interpretation of C at the target level?

• ⟦C⟧ maps source identifiers, “x” to source types and ⟦x⟧

• What is the interpretation of a variable ⟦x⟧ at the target level?– How are the variables used in the type system?

Zdancewic CIS 341: Compilers 29

` bop : ft

` + : (int, int) ! inttyp_add

` ? : (int, int) ! inttyp_mul

` - : (int, int) ! inttyp_sub

G;L ` exp : t

` const : tG;L ` const : t

typ_const

x : t 2 LG;L ` x : t

typ_var

` bop : (t1, t2) ! t G;L ` exp1 : t1 G;L ` exp2 : t2

G;L ` exp1 bop exp2 : ttyp_bop

f : (t1, .. , ti) ! t 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L ` f (exp1, .. , expi) : ttyp_ecall

G;L1 ` decl ) L2

G;L ` exp : tG;L ` t x = exp ) L, x : t

typ_decl

G;L1;rt ` stmt ) L2

G;L1 ` decl ) L2

G;L1;rt ` decl; ) L2

typ_sdecl

x : t 2 L G;L ` exp : tG;L;rt ` x = exp; ) L

typ_assn

f : (t1, .. , ti) ! void 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L;rt ` f (exp1, .. , expi); ) Ltyp_scall

G;L ` exp : int G;L;rt ` block1 G;L;rt ` block2

G;L;rt ` if(exp) block1 else block2 ) Ltyp_if

G;L ` exp : int G;L;rt ` blockG;L;rt ` while(exp) block ) L

typ_while

G;L ` exp : tG;L;t ` return exp; ) L

typ_retT

G;L;void ` return ; ) Ltyp_retVoid

G;L;rt ` block

G;L0;rt ` stmt1 .. stmti ) Li

G;L0;rt ` {stmt1 .. stmti}typ_block

4

as expressions (which denote values)

` bop : ft

` + : (int, int) ! inttyp_add

` ? : (int, int) ! inttyp_mul

` - : (int, int) ! inttyp_sub

G;L ` exp : t

` const : tG;L ` const : t

typ_const

x : t 2 LG;L ` x : t

typ_var

` bop : (t1, t2) ! t G;L ` exp1 : t1 G;L ` exp2 : t2

G;L ` exp1 bop exp2 : ttyp_bop

f : (t1, .. , ti) ! t 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L ` f (exp1, .. , expi) : ttyp_ecall

G;L1 ` decl ) L2

G;L ` exp : tG;L ` t x = exp ) L, x : t

typ_decl

G;L1;rt ` stmt ) L2

G;L1 ` decl ) L2

G;L1;rt ` decl; ) L2

typ_sdecl

x : t 2 L G;L ` exp : tG;L;rt ` x = exp; ) L

typ_assn

f : (t1, .. , ti) ! void 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L;rt ` f (exp1, .. , expi); ) Ltyp_scall

G;L ` exp : int G;L;rt ` block1 G;L;rt ` block2

G;L;rt ` if(exp) block1 else block2 ) Ltyp_if

G;L ` exp : int G;L;rt ` blockG;L;rt ` while(exp) block ) L

typ_while

G;L ` exp : tG;L;t ` return exp; ) L

typ_retT

G;L;void ` return ; ) Ltyp_retVoid

G;L;rt ` block

G;L0;rt ` stmt1 .. stmti ) Li

G;L0;rt ` {stmt1 .. stmti}typ_block

4

as addresses (which can be assigned)

Page 30: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Interpretation of Contexts• ⟦C⟧ = a map from source identifiers to types and target identifiers

• INVARIANT:x:t ∈ C means that

(1) lookup ⟦C⟧ x = (t, %id_x) (2) the (target) type of %id_x is ⟦t⟧* (a pointer to ⟦t⟧)

Zdancewic CIS 341: Compilers 30

Page 31: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Interpretation of Variables• Establish invariant for expressions:

= (%tmp, [%tmp = load i64* %id_x])

where (i64, %id_x) = lookup ⟦L⟧ x

• What about statements?

= stream @ [store ⟦t⟧ opn, ⟦t⟧* %id_x]

where (t, %id_x) = lookup ⟦L⟧ xand ⟦G;L ⊢ exp : t⟧ = (⟦t⟧, opn, stream)

Zdancewic CIS 341: Compilers 31

` bop : ft

` + : (int, int) ! inttyp_add

` ? : (int, int) ! inttyp_mul

` - : (int, int) ! inttyp_sub

G;L ` exp : t

` const : tG;L ` const : t

typ_const

x : t 2 LG;L ` x : t

typ_var

` bop : (t1, t2) ! t G;L ` exp1 : t1 G;L ` exp2 : t2

G;L ` exp1 bop exp2 : ttyp_bop

f : (t1, .. , ti) ! t 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L ` f (exp1, .. , expi) : ttyp_ecall

G;L1 ` decl ) L2

G;L ` exp : tG;L ` t x = exp ) L, x : t

typ_decl

G;L1;rt ` stmt ) L2

G;L1 ` decl ) L2

G;L1;rt ` decl; ) L2

typ_sdecl

x : t 2 L G;L ` exp : tG;L;rt ` x = exp; ) L

typ_assn

f : (t1, .. , ti) ! void 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L;rt ` f (exp1, .. , expi); ) Ltyp_scall

G;L ` exp : int G;L;rt ` block1 G;L;rt ` block2

G;L;rt ` if(exp) block1 else block2 ) Ltyp_if

G;L ` exp : int G;L;rt ` blockG;L;rt ` while(exp) block ) L

typ_while

G;L ` exp : tG;L;t ` return exp; ) L

typ_retT

G;L;void ` return ; ) Ltyp_retVoid

G;L;rt ` block

G;L0;rt ` stmt1 .. stmti ) Li

G;L0;rt ` {stmt1 .. stmti}typ_block

4

as expressions (which denote values)

` bop : ft

` + : (int, int) ! inttyp_add

` ? : (int, int) ! inttyp_mul

` - : (int, int) ! inttyp_sub

G;L ` exp : t

` const : tG;L ` const : t

typ_const

x : t 2 LG;L ` x : t

typ_var

` bop : (t1, t2) ! t G;L ` exp1 : t1 G;L ` exp2 : t2

G;L ` exp1 bop exp2 : ttyp_bop

f : (t1, .. , ti) ! t 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L ` f (exp1, .. , expi) : ttyp_ecall

G;L1 ` decl ) L2

G;L ` exp : tG;L ` t x = exp ) L, x : t

typ_decl

G;L1;rt ` stmt ) L2

G;L1 ` decl ) L2

G;L1;rt ` decl; ) L2

typ_sdecl

x : t 2 L G;L ` exp : tG;L;rt ` x = exp; ) L

typ_assn

f : (t1, .. , ti) ! void 2 G G;L ` exp1 : t1 .. G;L ` expi : ti

G;L;rt ` f (exp1, .. , expi); ) Ltyp_scall

G;L ` exp : int G;L;rt ` block1 G;L;rt ` block2

G;L;rt ` if(exp) block1 else block2 ) Ltyp_if

G;L ` exp : int G;L;rt ` blockG;L;rt ` while(exp) block ) L

typ_while

G;L ` exp : tG;L;t ` return exp; ) L

typ_retT

G;L;void ` return ; ) Ltyp_retVoid

G;L;rt ` block

G;L0;rt ` stmt1 .. stmti ) Li

G;L0;rt ` {stmt1 .. stmti}typ_block

4

as addresses (which can be assigned)

Page 32: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Other Judgments?• Statement:

⟦C; rt ⊢ stmt ⇒ C’⟧ = ⟦C’⟧ , stream

• Declaration:⟦G;L ⊢ t x = exp ⇒ G;L,x:t ⟧ = ⟦G;L,x:t⟧, stream

INVARIANT: stream is of the form:stream’ @[ %id_x = alloca ⟦t⟧;

store ⟦t⟧ opn, ⟦t⟧* %id_x ]

and ⟦G;L ⊢ exp : t ⟧ = (⟦t⟧, opn, stream’)

• Rest follow similarly

Zdancewic CIS 341: Compilers 32

Page 33: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

COMPILING CONTROL

Zdancewic CIS 341: Compilers 33

Page 34: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Translating while• Consider translating “while(e) s”:

– Test the conditional, if true jump to the body, else jump to the label after the body.

⟦C;rt ⊢ while(e) s ⇒ C’⟧ = ⟦C’⟧,

• Note: writing opn = ⟦C ⊢ e : bool⟧ is pun– translating ⟦C ⊢ e : bool⟧ generates code that puts the result into opn– In this notation there is implicit collection of the code

CIS 341: Compilers 34

lpre:opn = ⟦C ⊢ e : bool⟧ %test = icmp eq i1 opn, 0br %test, label %lpost, label %lbody

lbody:⟦C;rt ⊢ s ⇒ C’⟧br %lpre

lpost:

Page 35: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Translating if-then-else• Similar to while except that code is slightly more complicated because

if-then-else must reach a merge and the else branch is optional.

⟦C;rt ⊢ if (e1) s1 else s2 ⇒ C’⟧ = ⟦C’⟧

CIS 341: Compilers 35

opn = ⟦C ⊢ e : bool⟧%test = icmp eq i1 opn, 0br %test, label %else, label %then

then:⟦C;rt ⊢ s1 ⇒ C’⟧br %merge

else:⟦C; rt s2 ⇒ C’⟧br %merge

merge:

Page 36: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Connecting this to Code• Instruction streams:

– Must include labels, terminators, and “hoisted” global constants

• Must post-process the stream into a control-flow-graph

• See frontend.ml from HW4

Zdancewic CIS 341: Compilers 36

Page 37: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

OPTIMIZING CONTROL

Zdancewic CIS 341: Compilers 37

Page 38: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Standard Evaluation• Consider compiling the following program fragment:

if (x & !y | !w) z = 3;

else z = 4;

return z;

CIS 341: Compilers 38

%tmp1 = icmp Eq ⟦y⟧, 0 ; !y%tmp2 = and ⟦x⟧ ⟦tmp1⟧%tmp3 = icmp Eq ⟦w⟧, 0%tmp4 = or %tmp2, %tmp3%tmp5 = icmp Eq %tmp4, 0br %tmp4, label %else, label %then

then:store ⟦z⟧, 3br %merge

else:store ⟦z⟧, 4br %merge

merge:%tmp5 = load ⟦z⟧ret %tmp5

Page 39: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Observation• Usually, we want the translation ⟦e⟧ to produce a value

– ⟦C ⊢ e : t⟧ = (ty, operand, stream)– e.g. ⟦C ⊢ e1 + e2 : int⟧ = (i64, %tmp, [%tmp = add ⟦e1⟧ ⟦e2⟧])

• But when the expression we’re compiling appears in a test, the program jumps to one label or another after the comparison but otherwise never uses the value.

• In many cases, we can avoid “materializing” the value (i.e. storing it in a temporary) and thus produce better code.– This idea also lets us implement different functionality too:

e.g. short-circuiting boolean expressions

CIS 341: Compilers 39

Page 40: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Idea: Use a different translation for testsUsual Expression translation:

⟦C ⊢ e : t⟧ = (ty, operand, stream)Conditional branch translation of booleans,

without materializing the value: ⟦C ⊢ e : bool@⟧ ltrue lfalse = stream

Notes:• takes two extra

arguments: a “true”branch label and a “false” branch label.

• Doesn’t “return a value”

• Aside: this is a form ofcontinuation-passingtranslation…

CIS 341: Compilers 40

where⟦C, rt ⊢ s1 ⇒ C’⟧ = ⟦C’⟧, insns1⟦C, rt ⊢ s2 ⇒ C’’⟧ = ⟦C’’⟧, insns2⟦C ⊢ e : bool@ ⟧ then else = insns3

⟦C, rt ⊢ if (e) then s1 else s2 ⇒ C’⟧ = ⟦C’⟧, insns3

then:⟦s1⟧br %merge

else:⟦s2⟧br %merge

merge:

Page 41: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Short Circuit Compilation: Expressions• ⟦C ⊢ e : bool@⟧ ltrue lfalse = insns

Zdancewic CIS 341: Compilers 41

⟦C ⊢ false : bool@⟧ ltrue lfalse = [br %lfalse]

⟦C ⊢ true : bool@⟧ ltrue lfalse = [br %ltrue]

⟦C ⊢ !e : bool@⟧ ltrue lfalse = insns

⟦C ⊢ e : bool@⟧ lfalse ltrue = insns

FALSE

TRUE

NOT

Page 42: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Short Circuit EvaluationIdea: build the logic into the translation

Zdancewic CIS 341: Compilers 42

insns1right:

insn2

where right is a fresh label

⟦C ⊢ e1|e2 : bool@⟧ ltrue lfalse =

⟦C ⊢ e1 : bool@⟧ ltrue right = insns1 ⟦C ⊢ e2 : bool@⟧ ltrue lfalse = insns2

insns1right:

insn2

⟦C ⊢ e1 : bool@⟧ right lfalse = insns1 ⟦C ⊢ e2 : bool@⟧ ltrue lfalse = insns2

⟦C ⊢ e1&e2 : bool@⟧ ltrue lfalse =

Page 43: Lecture 15 CIS 341: COMPILERS - seas.upenn.edu

Short-Circuit Evaluation• Consider compiling the following program fragment:

if (x & !y | !w) z = 3;

else z = 4;

return z;

CIS 341: Compilers 43

%tmp1 = icmp Eq ⟦x⟧, 0 br %tmp1, label %right2, label %right1

right1:%tmp2 = icmp Eq ⟦y⟧, 0br %tmp2, label %then, label %right2

right2:%tmp3 = icmp Eq ⟦w⟧, 0br %tmp3, label %then, label %else

then:store ⟦z⟧, 3br %merge

else:store ⟦z⟧, 4br %merge

merge:%tmp5 = load ⟦z⟧ret %tmp5