Program Analysis Mooly Sagiv http://www.math.tau.ac.il/~sagiv/courses/pa.html Tel Aviv University 640-6706 Sunday 18-21 Schreiber 8 Monday 10-12 Schreiber 317 Textbook: Principles of Program Analysis Chapter 1.1-5
Date posted: 22-Dec-2015


Program Analysis

Mooly Sagiv

http://www.math.tau.ac.il/~sagiv/courses/pa.html

Tel Aviv University

640-6706

Sunday 18-21 Schreiber 8

Monday 10-12 Schreiber 317

Textbook: Principles of Program Analysis

Chapter 1.1-5

Outline

The Nature of Program Analysis

Setting the Scene

– The While language

– Reaching Definitions

Program Analysis Techniques

– Data Flow Analysis - the equational approach

– The Constraint Based Approach

– Abstract Interpretation

– Type and Effect Systems

– Algorithms

– Transformations

The Nature of Program Analysis

Compile-time techniques for predicting safe and computable approximations to the behaviors arising at run-time when executing a program

Differences with operational semantics

– The input state is not usually known at compile-time

– The compiler must always terminate (fast)

– The compiler can generate suboptimal code

The Nature of Program Analysis

Erring on the Safe Side

universe: {d1, d2, …, dN}

true-answer: {d1, d2, …, dn}

safe-answer: {d1, d2, …, dn, dn+1, …, dn+m}

(true-answer ⊆ safe-answer ⊆ universe)

Example

void main()
{
    int y, z;
    read(x);
    if (x>0) then
        y = 1;
    else {
        y = 2;
        f();   /* f does not change y */
    }
    z = y;
}

Semantics Based Program Analysis

Information obtained can be proved safe (or correct) w.r.t. operational semantics

Earlier detection of conceptual compiler bugs

But not committing to semantics directed program analysis

– The structure of the program analysis algorithm need not reflect the structure of the semantics

The While Programming Language Revisited

Syntactic Categories

x, y ∈ Var    program variables
n ∈ Num       program numerals
a ∈ Aexp      arithmetic expressions
b ∈ Bexp      Boolean expressions
S ∈ Stm       program statements
l ∈ Lab       program labels
opa ∈ Opa     arithmetic operators
opb ∈ Opb     Boolean operators
opr ∈ Opr     relational operators

The While Programming Language Revisited

Abstract Syntax

a ::= x | n | a1 opa a2

b ::= true | false | not b | b1 opb b2 | a1 opr a2

S ::= [x := a]^l | [skip]^l | S1 ; S2 | if [b]^l then S1 else S2 | while [b]^l do S
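The abstract syntax above maps directly onto a small set of record types. The following is a minimal sketch in Python, not part of the original slides: the class names (`Var`, `Num`, `BinOp`, `Assign`, `Skip`, `Seq`, `If`, `While`) and the helper `labels` are hypothetical, and Boolean and relational expressions are folded into `BinOp` to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Union

# Expressions: a ::= x | n | a1 opa a2  (tests such as y > 1 reuse BinOp here)
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Var, Num, BinOp]

# Statements: every elementary block carries its label l
@dataclass(frozen=True)
class Assign:
    var: str
    expr: Expr
    label: int

@dataclass(frozen=True)
class Skip:
    label: int

@dataclass(frozen=True)
class Seq:
    first: "Stmt"
    second: "Stmt"

@dataclass(frozen=True)
class If:
    cond: Expr
    label: int
    then: "Stmt"
    orelse: "Stmt"

@dataclass(frozen=True)
class While:
    cond: Expr
    label: int
    body: "Stmt"

Stmt = Union[Assign, Skip, Seq, If, While]

def labels(s: Stmt) -> set:
    """Collect the labels of all elementary blocks of a statement."""
    if isinstance(s, (Assign, Skip)):
        return {s.label}
    if isinstance(s, Seq):
        return labels(s.first) | labels(s.second)
    if isinstance(s, If):
        return {s.label} | labels(s.then) | labels(s.orelse)
    if isinstance(s, While):
        return {s.label} | labels(s.body)
    raise TypeError(f"not a statement: {s!r}")

# while [x > 0]^1 do [x := x - 1]^2
countdown = While(BinOp(">", Var("x"), Num(0)), 1,
                  Assign("x", BinOp("-", Var("x"), Num(1)), 2))
print(labels(countdown))
```

Frozen dataclasses are used so that syntax trees are immutable and hashable, which is convenient when analyses store sub-terms in sets.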

The Factorial Program

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

Example Program Analysis Problem

Reaching Definitions

An assignment (definition) of the form [x := a]^l may reach an elementary block l' if there is an execution of the program that leads to l' in which x was last assigned at l

Reaching Definitions in Factorial

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

l   RDentry(l)                                   RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                     {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                     {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 4)}
5   {(x, ?), (y, 1), (y, 5), (z, 4)}             {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 6), (z, 2), (z, 4)}

Reaching Definitions in Factorial

l   RDentry(l)                                   RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                     {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                     {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 4)}
5   {(x, ?), (y, 1), (y, 5), (z, 4)}             {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 6), (z, 2), (z, 4)}

Usage of Reaching Definitions

Compiler optimizations

– An occurrence of a variable x in an elementary block l is the constant n if, in every reaching definition (x, l'), l' assigns n to x

– Loop invariant code motion

– Program dependence graphs

Software quality tools

– A usage of a variable x in an elementary block may be uninitialized if ...

– Program slicing
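The constant-detection rule above can be sketched in a few lines of Python. This is an illustration only: the function name `constant_at` and the `const_defs` map (labels whose block assigns a numeral) are assumptions introduced here, and the input set is the set of definitions reaching the point just after label 6 of the factorial program.

```python
def constant_at(var, rd, const_defs):
    """Return n if every definition of `var` in `rd` assigns the numeral n, else None.

    rd         -- set of (variable, label) pairs; the label '?' marks "no assignment yet"
    const_defs -- hypothetical map from a label l to n when the block at l is [x := n]
    """
    values = set()
    for (x, l) in rd:
        if x != var:
            continue
        if l not in const_defs:      # '?' or a non-constant right-hand side
            return None
        values.add(const_defs[l])
    return values.pop() if len(values) == 1 else None

# Definitions reaching the point after [y := 0]^6 in the factorial program
rd_after_6 = {("x", "?"), ("y", 6), ("z", 2), ("z", 4)}
const_defs = {2: 1, 6: 0}            # [z := 1]^2 and [y := 0]^6 assign numerals

print(constant_at("y", rd_after_6, const_defs))   # 0
print(constant_at("z", rd_after_6, const_defs))   # None, since (z, 4) assigns z * y
```

Because the analysis may report more definitions than actually reach a point, this check can miss constants but does not report a wrong one.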

Soundness in Reaching Definitions

Every reachable definition is detected

May include more definitions

– Fewer constants may be identified

– Not all the loop invariant code will be identified

– May warn against uninitialized variables that are in fact initialized

But never miss a reaching definition

– Every identified constant is indeed constant

– Never moves non-invariant code

– Never misses an error

Reaching Definitions in Factorial

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

l   RDentry(l)                                   RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                     {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                     {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 4)}
5   {(x, ?), (y, 1), (y, 5), (z, 4)}             {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 6), (z, 2), (z, 4)}

Unsound Reaching Definitions 1

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

l   RDentry(l)                                   RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                     {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                     {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (z, 2)}                     {(x, ?), (y, 1), (z, 4)}
5   {(x, ?), (y, 1), (z, 4)}                     {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (z, 2)}                     {(x, ?), (y, 6), (z, 2), (z, 4)}

Unsound: the entries of 4, 5 and 6 omit definitions that do reach them, for example (y, 5) at the entry of 4

Unsound Reaching Definitions 2

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

l   RDentry(l)                                   RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                     {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                     {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 5), (z, 4)}                     {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (z, 4)}
5   {(x, ?), (y, 1), (y, 5), (z, 4)}             {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (y, 5), (z, 2), (z, 4)}     {(x, ?), (y, 6), (z, 4)}

Unsound: the entry of 3 omits (y, 1) and (z, 2), and the exit of 6 omits (z, 2)

Suboptimal Reaching Definitions

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

l   RDentry(l)                                           RDexit(l)
1   {(x, ?), (y, ?), (z, ?)}                             {(x, ?), (y, 1), (z, ?)}
2   {(x, ?), (y, 1), (z, ?)}                             {(x, ?), (y, 1), (z, 2)}
3   {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)}
4   {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)}     {(x, ?), (y, 1), (y, 5), (y, 6), (z, 4)}
5   {(x, ?), (y, 1), (y, 5), (y, 6), (z, 4)}             {(x, ?), (y, 5), (z, 4)}
6   {(x, ?), (y, 1), (y, 5), (y, 6), (z, 2), (z, 4)}     {(x, ?), (y, 6), (z, 2), (z, 4)}

Sound but imprecise: (y, 6) is reported inside the loop although label 6 is executed only after the loop exits

Program Analysis Techniques

Find sound solutions

– Data Flow Analysis - the equational approach

– The Constraint Based Approach

– Abstract Interpretation

– Type and Effect Systems

The Dataflow Analysis Approach

Generate a system of equations

Find the least solution in one of the following ways

– Start with the minimum element and iterate until no more changes occur

– Eliminate equations until every value is expressed in terms of the initial dataflow value when the program begins (not studied in this course)

Equations Generated for Reaching Definitions

Equations for elementary statements

– [skip]^l

RDexit(l) = RDentry(l)

– [b]^l

RDexit(l) = RDentry(l)

– [x := a]^l

RDexit(l) = (RDentry(l) - {(x, l') | l' ∈ Lab ∪ {?}}) ∪ {(x, l)}

Equations for control flow constructs

RDentry(l) = ∪ {RDexit(l') | l' immediately precedes l in the control flow graph}

An equation for the entry

RDentry(1) = {(x, ?) | x is a variable in the program}
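The two kinds of equations can be read as two small set operations. The sketch below, with the hypothetical helper names `rd_exit_assign` and `rd_entry_join`, replays them on values for the factorial program; pairs are written `(variable, label)` with the string `"?"` for the unknown label.

```python
def rd_exit_assign(rd_entry, x, l):
    """RDexit(l) for [x := a]^l: drop every definition of x, then add (x, l)."""
    kill = {(v, lab) for (v, lab) in rd_entry if v == x}
    return (rd_entry - kill) | {(x, l)}

def rd_entry_join(rd_exits_of_predecessors):
    """RDentry(l): union of RDexit(l') over the immediate predecessors l' of l."""
    result = set()
    for rd_exit in rd_exits_of_predecessors:
        result |= rd_exit
    return result

# Label 3 of the factorial program has predecessors 2 and 5.
exit_2 = {("x", "?"), ("y", 1), ("z", 2)}
exit_5 = {("x", "?"), ("y", 5), ("z", 4)}
entry_3 = rd_entry_join([exit_2, exit_5])

# Label 4 is [z := z * y]^4 and its entry equals the exit of the test at 3.
exit_4 = rd_exit_assign(entry_3, "z", 4)
print(sorted(exit_4, key=str))
```

For `skip` and for tests the exit set is simply the entry set, so no helper is needed for those blocks.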

The Factorial Program

[y := x]^1; [z := 1]^2;
while [y>1]^3 do (
    [z := z * y]^4;
    [y := y - 1]^5
);
[y := 0]^6

The Least Solution

12 equations, one for each of the sets RDentry(1), …, RDexit(6)

Can be written in vectorial form: RD = F(RD)

Find the minimum solution

– Every component is minimal

– Since F is monotonic such a solution always exists

– Since the number of definitions is finite it is possible to compute the minimum solution iteratively
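One standard way to state the iterative computation, written here as a sketch in which $\vec{\emptyset}$ (the vector of twelve empty sets) and $\mathrm{lfp}$ are notation introduced for this note rather than taken from the slides:

```latex
\vec{RD} = F(\vec{RD}), \qquad
\vec{\emptyset} \subseteq F(\vec{\emptyset}) \subseteq F^{2}(\vec{\emptyset}) \subseteq \cdots, \qquad
\mathrm{lfp}(F) = \bigcup_{n \ge 0} F^{n}(\vec{\emptyset})
```

The inclusions hold componentwise because F is monotonic, and the chain must stabilize after finitely many steps because each component is a subset of a finite set of (variable, label) pairs. The value at which it stabilizes is the minimum solution.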

Chaotic Computation of the Least Solution

Initialize
    RDentry(1) = {(x, ?), (y, ?), (z, ?)}
    RDentry(2) = RDentry(3) = RDentry(4) = RDentry(5) = RDentry(6) = ∅
    RDexit(i) = ∅ for every label i
    WL = {1, 2, 3, 4, 5, 6}

while WL != ∅ do
    select and remove an l from WL
    new = F_RDexit(l)(…)
    if (new != RDexit(l)) then
        RDexit(l) = new
        for all l' such that RDexit(l) is used in F_RDentry(l')(…) do
            RDentry(l') = RDentry(l') ∪ new
            WL := WL ∪ {l'}
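The worklist scheme above can be tried out on the factorial program. The following Python sketch hard-codes the control flow graph and the assigned variable of each label; the function name `solve_reaching_definitions` and the dictionaries `preds`, `succs` and `assigns` are assumptions made for this illustration.

```python
def solve_reaching_definitions():
    # Immediate predecessors of each label in the factorial program.
    preds = {1: [], 2: [1], 3: [2, 5], 4: [3], 5: [4], 6: [3]}
    succs = {l: [m for m in preds if l in preds[m]] for l in preds}
    # Variable assigned at each label (None for the test [y > 1]^3).
    assigns = {1: "y", 2: "z", 3: None, 4: "z", 5: "y", 6: "y"}

    entry = {l: set() for l in preds}
    entry[1] = {("x", "?"), ("y", "?"), ("z", "?")}
    exit_ = {l: set() for l in preds}
    worklist = {1, 2, 3, 4, 5, 6}

    while worklist:
        l = worklist.pop()                      # any selection order works
        x = assigns[l]
        if x is None:
            new = set(entry[l])                 # tests do not change the set
        else:
            new = {(v, d) for (v, d) in entry[l] if v != x} | {(x, l)}
        if new != exit_[l]:
            exit_[l] = new
            for m in succs[l]:                  # labels whose entry uses exit_[l]
                entry[m] |= new
                worklist.add(m)
    return entry, exit_

entry, exit_ = solve_reaching_definitions()
for l in sorted(entry):
    print(l, sorted(entry[l], key=str), sorted(exit_[l], key=str))
```

The result does not depend on the order in which labels are taken from the worklist, which is why the computation is called chaotic; only the number of steps changes.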

The Constraint Based Approach

Generate a system of set inclusions X ⊇ Y

Fits very well with functional and object oriented programming languages in which the control flow graph is not immediately derived from the syntax

Find the least solution

Constraints Generated for Reaching Definitions

Constraints for elementary statements

– [skip]^l

RDexit(l) ⊇ RDentry(l)

– [b]^l

RDexit(l) ⊇ RDentry(l)

– [x := a]^l

RDexit(l) ⊇ (RDentry(l) - {(x, l') | l' ∈ Lab ∪ {?}})

RDexit(l) ⊇ {(x, l)}

Constraints for control flow constructs

RDentry(l) ⊇ RDexit(l') if l' immediately precedes l in the control flow graph

A constraint for the entry

RDentry(1) ⊇ {(x, ?) | x is a variable in the program}

Constraints vs. Equations for Reaching Definitions

Every solution to the system of equations is a solution to the set of constraints

Constraints:

RDexit(l) ⊇ (RDentry(l) - {(x, l') | l' ∈ Lab ∪ {?}})

RDexit(l) ⊇ {(x, l)}

Equation:

RDexit(l) = (RDentry(l) - {(x, l') | l' ∈ Lab ∪ {?}}) ∪ {(x, l)}

But some solutions to the set of constraints are not solutions to the system of equations

The least solution is the same

The connection between constraints and equations is not always obvious
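A small check makes the difference concrete for the single block [y := x]^1. In this Python sketch the function names are hypothetical; `least` is the exit set required by the equation, and `larger` adds one extra pair that the constraints tolerate but the equation rejects.

```python
def kill_gen(rd_entry, x, l):
    """Right-hand side of the equation for [x := a]^l."""
    return {(v, d) for (v, d) in rd_entry if v != x} | {(x, l)}

def satisfies_equation(rd_entry, rd_exit, x, l):
    return rd_exit == kill_gen(rd_entry, x, l)

def satisfies_constraints(rd_entry, rd_exit, x, l):
    survivors = {(v, d) for (v, d) in rd_entry if v != x}
    return rd_exit >= survivors and rd_exit >= {(x, l)}

entry = {("x", "?"), ("y", "?"), ("z", "?")}
least = {("x", "?"), ("y", 1), ("z", "?")}
larger = least | {("y", 5)}          # a safe over-approximation

print(satisfies_equation(entry, least, "y", 1), satisfies_constraints(entry, least, "y", 1))
print(satisfies_equation(entry, larger, "y", 1), satisfies_constraints(entry, larger, "y", 1))
```

The first line prints `True True` and the second `False True`: the larger set solves the constraints without solving the equation, while the least solution of both systems is the same set.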

The Control Flow Analysis Problem

Given a program in a functional programming language with higher order functions (functions can serve as parameters and return values)

Find out for each function invocation which functions may be applied

Obvious in C without function pointers

Difficult in C++, Java and ML

An ML Example

let val f = fn x => x 1
    val g = fn y => y + 2
    val h = fn z => z + 3
in (f g) + (f h)
end

An ML Example

let val f = fn x => (* {g, h} *) x 1
    val g = fn y => y + 2
    val h = fn z => z + 3
in (f g) + (f h)
end

Control Flow Analysis (pure) ML

Find out for every formal argument x the set of expressions that may be bound to x in some execution

Analyze all function invocations

Generate a set of constraints

– Label every program sub-expression

– The Control Flow Analysis Algorithm needs to find a pair (C, p) where

» C(l) is a superset of the potential sub-expressions that can occur at l

» p(x) is a superset of the potential sub-expressions that x can be bound to

Generate constraints for (C, p)

Simplified Example

let f = [fn x => [[x]^1 1]^2]^3;
    g = [fn y => [[y]^4 + 2]^5]^6;
    h = [fn z => [[z]^7 + 3]^8]^9;
in [f h]^10

Simplified Constraints

let f = [fn x => [[x]^1 1]^2]^3;
    g = [fn y => [[y]^4 + 2]^5]^6;
    h = [fn z => [[z]^7 + 3]^8]^9;
in [f h]^10

C(1) ⊇ {[x]^1}
C(2) ⊇ {[[x]^1 1]^2}
C(10) ⊇ {[f h]^10}
C(1) ⊇ p(x)
C(4) ⊇ p(y)
C(7) ⊇ p(z)
p(x) ⊇ C(9)
C(10) ⊇ C(3)
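Inclusion constraints of this shape can be solved by propagating elements until nothing changes. The Python sketch below is a hedged illustration, not the course's algorithm: `solve_inclusions` is a hypothetical helper, the strings `"g"` and `"h"` stand in for the two function terms, and the constraint set is an assumed one for the earlier full example `(f g) + (f h)`, in which both functions are passed to f. Labels follow the simplified example: 6 and 9 label the functions bound to g and h, and 1 labels the occurrence of x.

```python
def solve_inclusions(constants, subsets):
    """Least solution of constraints  X ⊇ {c}  and  X ⊇ Y  by propagation.

    constants -- list of (X, c): the element c must belong to X
    subsets   -- list of (X, Y): every element of Y must belong to X
    """
    sol = {}
    for x, c in constants:
        sol.setdefault(x, set()).add(c)
    for x, y in subsets:
        sol.setdefault(x, set())
        sol.setdefault(y, set())
    changed = True
    while changed:                       # repeat until no set grows any more
        changed = False
        for x, y in subsets:
            if not sol[y] <= sol[x]:
                sol[x] |= sol[y]
                changed = True
    return sol

constants = [("C(6)", "g"), ("C(9)", "h")]
subsets = [
    ("p(x)", "C(6)"),    # call f g binds x to whatever label 6 may evaluate to
    ("p(x)", "C(9)"),    # call f h binds x to whatever label 9 may evaluate to
    ("C(1)", "p(x)"),    # the occurrence of x at label 1 reads p(x)
]
sol = solve_inclusions(constants, subsets)
print(sorted(sol["C(1)"]))
```

Under these assumptions the solver reports that both g and h may be applied at the occurrence of x, which agrees with the {g, h} annotation in the ML example.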