Analysis of Software, Eric Feron, 6.242. From "Semantic Foundations of Program Analysis" by P. Cousot, in "Program Flow Analysis: Theory and Applications", Muchnick & Jones, Eds., Prentice Hall, 1981.

Analysis of Software

Eric Feron

6.242

From "Semantic Foundations of Program Analysis" by P. Cousot,
in "Program Flow Analysis: Theory and Applications",
Muchnick & Jones, Eds., Prentice Hall, 1981

Main message

• Traditional dynamical systems analysis tools can apply to certain aspects of software analysis, including run-time errors.

• Most characteristics (e.g. overflow errors) cannot be detected by simply running the program on its variables: the computations are too numerous, or not even conceivable.

• Tractability can be achieved via use of abstractions.

• Tractability can be achieved via use of overbounding invariant sets.

Prototype program

[1]
while x ≥ 1000 do
[2]
   x := x + y;
[3]
od;
[4]

(x, y) with x and y integers in I = [-b-1; b].

b is the overflow limit.

Program characteristics:
• Program terminates without error iff (x0 < 1000) ∨ (y0 < 0).
• Execution never terminates iff (1000 ≤ x0 ≤ b) ∧ (y0 = 0).
• Execution leads to a run-time error (by overflow) iff (x0 ≥ 1000) ∧ (y0 > 0).

These are desirable characteristics to be found
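The three characteristics can be checked by brute force on a scaled-down instance. The sketch below is only an illustration: it reads the loop guard as x ≥ limit (the complement of the exit test x < limit), and it replaces 1000 and b by small hypothetical values (`limit=10`, `b=15`) so that every initial state can be enumerated. The function names are not from the slides.

```python
def run(x0, y0, b=15, limit=10):
    """Run `while x >= limit do x := x + y od` on integers in [-b-1, b].

    Returns "terminates", "loops" or "overflow".
    """
    x, y = x0, y0
    seen = set()
    while x >= limit:
        if x in seen:              # y never changes, so a repeated x means a cycle
            return "loops"
        seen.add(x)
        x = x + y
        if not (-b - 1 <= x <= b):
            return "overflow"
    return "terminates"


def predicted(x0, y0, limit=10):
    """Outcome predicted by the three program characteristics."""
    if x0 < limit or y0 < 0:
        return "terminates"
    if y0 == 0:
        return "loops"
    return "overflow"


def check_all(b=15, limit=10):
    """Compare execution and prediction for every initial state in I x I."""
    values = range(-b - 1, b + 1)
    return all(run(x, y, b, limit) == predicted(x, y, limit)
               for x in values for y in values)
```

Enumeration works here only because the instance is tiny; for a realistic overflow limit the same check is out of reach, which is the motivation for the abstractions that follow.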

Graph representations of programs

• Programs are single-entry, single-exit directed graphs.

• Edges labeled with instructions.

• Program graph: <V,E>

– V finite set of vertices

– E finite set of edges

– entry and exit vertices.

• Variables

– live in universe U.

– Ia(U): assignments v := f(v), with f a map from U to U.

– It(U): tests, i.e. maps from U to B = {true, false}.

• Program

– a triple <G,U,L>. G is program graph, U is universe, and L is edge labeling with instructions.

[Figure: program graph of the prototype program, with vertices 1, 2, 3, 4 and edges labeled "if x ≥ 1000", "if x < 1000" and "<x,y> := <x+y,y>".]

Programs as dynamical systems

• States
The set S of states is the set of pairs <c,m> with c in V ∪ {ξ}, the control state, and m in U, the memory state. ξ is the error control state.

• State transition function
The program <G,U,L> defines the state transition function τ as follows (ω denotes the exit vertex):

– τ(<ξ,m>) = <ξ,m> (can't recover from run-time error)
– τ(<ω,m>) = <ω,m> (once done, we're done)
– If c1 in V has out-degree 1, <c1,c2> in E, L(<c1,c2>) = f, f in Ia(U), then if m is in dom(f) then τ(<c1,m>) = <c2,f(m)> else τ(<c1,m>) = <ξ,m>.
– If c1 in V has out-degree 2, <c1,c2> in E, <c1,c3> in E, L(<c1,c2>) = p, L(<c1,c3>) = ¬p, with p in It(U), then if m is not in dom(p) then τ(<c1,m>) = <ξ,m>, else if p(m) then τ(<c1,m>) = <c2,m> else τ(<c1,m>) = <c3,m>.

• State transition relation: the graph of the state transition function (a boolean function over S×S).
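The definition of the state transition function can be sketched directly for the prototype program. This is a minimal illustration under stated assumptions: the edge structure (1→2 and 3→2 on x ≥ limit, 1→4 and 3→4 on x < limit, 2→3 on the assignment) is inferred from the program labels, the constants `B` and `LIMIT` are small hypothetical stand-ins for b and 1000, and the tests are total, so the "m not in dom(p)" case never fires.

```python
ERR = "xi"          # error control state
EXIT = 4            # exit vertex of the prototype program graph
B, LIMIT = 15, 10   # hypothetical small overflow limit and loop bound


def add_y(m):
    """Assignment <x,y> := <x+y,y>; returns None when m is outside its domain."""
    x, y = m
    return (x + y, y) if -B - 1 <= x + y <= B else None


def ge_limit(m):
    """Test x >= LIMIT (defined on every memory state)."""
    return m[0] >= LIMIT


# Edge labeling: a vertex with one outgoing edge carries an assignment; a vertex
# with two carries a test p on the first edge and "not p" on the second.
EDGES = {
    1: [(2, ge_limit), (4, None)],
    2: [(3, add_y)],
    3: [(2, ge_limit), (4, None)],
}


def tau(state):
    """One step of the state transition function."""
    c, m = state
    if c in (ERR, EXIT):                 # error and exit states are fixed points
        return state
    out = EDGES[c]
    if len(out) == 1:                    # out-degree 1: assignment
        c2, f = out[0]
        m2 = f(m)
        return (c2, m2) if m2 is not None else (ERR, m)
    (c2, p), (c3, _) = out               # out-degree 2: test p / not p
    return (c2, m) if p(m) else (c3, m)


def run(state, max_steps=1000):
    """Iterate tau until a fixed point or max_steps; return the trace."""
    trace = [state]
    for _ in range(max_steps):
        nxt = tau(trace[-1])
        if nxt == trace[-1]:
            break
        trace.append(nxt)
    return trace
```

For example, `run((1, (12, -3)))` walks through vertices 1, 2, 3 and stops at the exit vertex 4, while `run((1, (14, 1)))` ends in the error control state once x + y leaves the interval.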

Programs as dynamical systems (cont'd)

• Transitive closure of a binary relation:

Assume α, β in (S×S → B) are two binary relations on S. Their product α∘β is defined as λ<s1,s2>.[∃ s3 ∈ S : α(s1,s3) ∧ β(s3,s2)].

So we can talk about the n-extension α^n of α. The transitive closure of α is then λ<s1,s2>.[∃ n > 0 : α^n(s1,s2)].
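On a finite state set, the product and the transitive closure can be computed directly. The sketch below represents a relation as a Python set of pairs rather than as a boolean function; the function names are chosen for illustration only.

```python
def product(r1, r2):
    """Product of two relations: (s1, s2) such that r1(s1, s3) and r2(s3, s2) for some s3."""
    return {(s1, s2) for (s1, s3) in r1 for (t, s2) in r2 if s3 == t}


def transitive_closure(rel):
    """Union of rel^n for n > 0, computed by accumulating successive powers."""
    closure = set(rel)
    power = set(rel)
    while True:
        power = product(power, rel)      # next power rel^(n+1)
        if power <= closure:             # nothing new: later powers add nothing either
            return closure
        closure |= power
```

With `rel = {(1, 2), (2, 3), (3, 3)}`, the closure adds the single pair `(1, 3)`.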

Example of complete lattice

Set L of subsets of states in a state-space S:

Partial order is traditional inclusion ⊆.
H = {H1, H2} ⊆ L
H1 ∪ H2 is the least upper bound for H.
H1 ∩ H2 is the greatest lower bound for H.

Obviously these exist for any H.

L has an infimum: the empty set.
L has a supremum: S.

[Figure: sets H1 and H2.]

Abstracting state spaces

{Set of all subsets of signed integer numbers between -b-1 and b}

[Figure: lattice of signs, with elements ⊥, -, 0, +, T.]

if x = T then x is any value
if x = + then 0 < x ≤ b
if x = 0 then x = 0
if x = - then -b-1 ≤ x < 0
if x = ⊥ then x ∈ Ø (no value)

Rules: (+) + (+) = +; (+) + (-) = T; (-) - (+) = -; (-) * (-) = +; …

Effect: go from a huge state-space decomposition to a finite and simple state-space decomposition.
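The rules of signs can be written down as a small abstract arithmetic. The following is a minimal sketch, with element names (`"bot"`, `"-"`, `"0"`, `"+"`, `"T"`) and function names chosen for illustration; it works on unbounded integers and so ignores overflow, which the slides treat separately through the error state.

```python
def alpha(values):
    """Abstract a set of integers to one of "bot", "-", "0", "+", "T"."""
    signs = {"-" if v < 0 else "0" if v == 0 else "+" for v in values}
    if not signs:
        return "bot"
    return signs.pop() if len(signs) == 1 else "T"


def leq(a, b):
    """Lattice order: bot below everything, T above everything."""
    return a == b or a == "bot" or b == "T"


def neg(a):
    """Abstract unary minus."""
    return {"+": "-", "-": "+"}.get(a, a)


def add(a, b):
    """Abstract addition, e.g. (+) + (+) = + and (+) + (-) = T."""
    if "bot" in (a, b):
        return "bot"
    if a == "0":
        return b
    if b == "0":
        return a
    return a if a == b else "T"


def sub(a, b):
    """Abstract subtraction, e.g. (-) - (+) = -."""
    return add(a, neg(b))


def mul(a, b):
    """Abstract multiplication, e.g. (-) * (-) = +."""
    if "bot" in (a, b):
        return "bot"
    if "0" in (a, b):
        return "0"
    if "T" in (a, b):
        return "T"
    return "+" if a == b else "-"
```

Soundness means that the sign of any concrete result is always below the abstract result in the lattice order, for instance `leq(alpha({3 + (-5)}), add("+", "-"))` holds because `"-"` is below `"T"`.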

Abstracting state spaces

{Set of all subsets of Rn}
Operations are traditional unions, intersections, sums and differences.
What a mess…

{Set of all ellipsoids in Rn + Ø + Rn}
Operations are (conservative) unions of ellipsoids, intersections of ellipsoids, sums of ellipsoids. The job itself is most often nonconvex, and is usually relaxed based on convex optimization.

Lattices of ellipsoids

• Set Ell of ellipsoids centered around zero, for simplification.
• Partial order on ellipsoids: set inclusion (that's a classic), and volume.
• Ellipsoid theorems: H is a finite set of p ellipsoids (E1, …, Ep), characterized by $E_i = \{x \mid x^T P_i x \le 1\}$.
• A minimum volume ellipsoid h containing H exists and is computed as follows:
if p = 0 then h = Ø;
if p > 0 then $h = \{x \mid x^T P x \le 1\}$ where $P = \arg\min \log\det(P^{-1})$ s.t. $P \le P_i$, i = 1, …, p.
• A maximum volume ellipsoid contained in H also exists and is computable.

Ell is then a complete lattice.

Rules of operations with ellipsoids (centered around zero)

Ellipsoid given by $\{x \mid x^T P x \le 1\}$.

• Finding an ellipsoidal least upper bound Ell(K) on any set K of data in Rn: assume the set is described by a finite list of points ($x_i$, i = 1, …, p).

If p = 0 then Ell(K) = Ø.

If p > 0 then Ell(K) is given by $P = \arg\min \log\det P^{-1}$ subject to $x_i^T P x_i \le 1$, i = 1, …, p.

• Finding an approximate ellipsoidal least upper bound E3 on the sum of two ellipsoids E1 and E2 (characterized by P1 and P2) is a convex, semidefinite program of the form

$\min \log\det P_3^{-1}$ s.t. a matrix inequality on a block matrix built from $P_1$, $P_2$ and $P_3$,

where the matrix inequality is to be understood in the sense of P.D. matrices.
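One standard way to obtain a program of this kind, not necessarily the formulation on the original slide, is the S-procedure. The containment to enforce is that $x_1^T P_1 x_1 \le 1$ and $x_2^T P_2 x_2 \le 1$ imply $(x_1 + x_2)^T P_3 (x_1 + x_2) \le 1$. The multipliers $\tau_1$, $\tau_2$ below are introduced for this sketch only. Requiring $1 - (x_1 + x_2)^T P_3 (x_1 + x_2) - \tau_1 (1 - x_1^T P_1 x_1) - \tau_2 (1 - x_2^T P_2 x_2) \ge 0$ for all $x_1$, $x_2$ gives a sufficient condition:

```latex
\[
\begin{aligned}
\min_{P_3,\ \tau_1,\ \tau_2} \quad & \log\det P_3^{-1} \\
\text{s.t.} \quad &
\begin{bmatrix}
\tau_1 P_1 - P_3 & -P_3 \\
-P_3 & \tau_2 P_2 - P_3
\end{bmatrix} \succeq 0, \\
& P_3 \succ 0, \quad \tau_1 \ge 0, \quad \tau_2 \ge 0, \quad \tau_1 + \tau_2 \le 1 .
\end{aligned}
\]
```

The constraints are linear in $(P_3, \tau_1, \tau_2)$ because $P_1$ and $P_2$ are data, and $\log\det P_3^{-1}$ is convex in $P_3$, so the problem is a convex semidefinite program. The resulting ellipsoid contains the sum but is in general not the smallest such ellipsoid.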

Reasoning with abstractions

(φ is the abstract entry state; ξ is the error control state.)

<x1,y1> = φ

<x2,y2> = smash(<if (x1 ⊔ x3) ⊑ - then ⊥ else + fi, y1 ⊔ y3>)

<x3,y3> = smash(<x2 + y2, y2>)

<x4,y4> = smash(<x1 ⊔ x3, y1 ⊔ y3>)

<xξ,yξ> = if (x2 ⊑ 0) ∨ (y2 ⊑ 0) ∨ ((x2 = +) ∧ (y2 = -)) ∨ ((x2 = -) ∧ (y2 = +)) then <⊥,⊥> else <x2,y2> fi

Start iterating with all states at ⊥ and the entry state <x1,y1> = <+,->. A steady state is reached in a few iterations: <x1,y1> = <x2,y2> = <+,->, <x3,y3> = <x4,y4> = <T,->, <xξ,yξ> = <⊥,⊥>.

Thus if x > 0 and y < 0 then no overflow can occur.

[1]
while x ≥ 1000 do
[2]
   x := x + y;
[3]
od;
[4]

(x, y) with x and y integers in I = [-b-1; b].
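The iteration over the abstract equations can be sketched as a fixed-point computation on the lattice of signs. This is an illustration under assumptions, not the slides' own code: the guard of the second equation is read as (x1 ⊔ x3) ⊑ -, `smash` is taken to map any pair containing ⊥ to <⊥,⊥>, and all Python names are hypothetical.

```python
BOT, NEG, ZERO, POS, TOP = "bot", "-", "0", "+", "T"


def leq(a, b):
    """Lattice order on signs."""
    return a == b or a == BOT or b == TOP


def join(a, b):
    """Least upper bound of two signs."""
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP


def add(a, b):
    """Abstract addition by the rule of signs."""
    if BOT in (a, b):
        return BOT
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else TOP


def smash(pair):
    """A pair with an undefined component is undefined as a whole."""
    return (BOT, BOT) if BOT in pair else pair


def step(s, phi):
    """Apply every equation once to the current assignment s."""
    (x1, y1), (x2, y2), (x3, y3) = s[1], s[2], s[3]
    jx, jy = join(x1, x3), join(y1, y3)
    safe = (leq(x2, ZERO) or leq(y2, ZERO)
            or (x2, y2) in ((POS, NEG), (NEG, POS)))
    return {
        1: phi,
        2: smash((BOT if leq(jx, NEG) else POS, jy)),
        3: smash((add(x2, y2), y2)),
        4: smash((jx, jy)),
        "err": (BOT, BOT) if safe else (x2, y2),
    }


def solve(phi):
    """Iterate from all states at bottom until nothing changes."""
    s = {k: (BOT, BOT) for k in (1, 2, 3, 4, "err")}
    while True:
        nxt = step(s, phi)
        if nxt == s:
            return s
        s = nxt
```

Calling `solve((POS, NEG))` stabilizes after a few rounds with the error state at bottom, matching the conclusion that x > 0 and y < 0 rules out overflow; `solve((POS, POS))` leaves a non-bottom error state, so overflow cannot be excluded there.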

Ellipsoidal reachability analysis: one example, the "star norm"

Consider the program

x := 0   % An integer vector
n := 0
while n < 1000,
   x := Ax + Bu   % A is a matrix
   n := n+1
end;
x := x

u is an exogenous, bounded input that changes at each iteration.

Question: for which values of u does the state x not overflow?

The exact answer involves computing the || . ||1 norm of the system (A,B,I). This norm is not easy to derive analytically.

Choice of abstractions

New lattice for system abstractions: the set of ellipsoids centered around zero.

Abstract interpretation

x := 0
n := 0
while n < 1000,
   x := (A Ell(x) + B Ell(u))
   n := n+1
end;
x := x

There remains only to check that the ellipsoid x is within bounds.
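In one dimension a zero-centered ellipsoid is just an interval [-r, r], and the sum of two such intervals is exact, so the abstract program can be run with plain arithmetic. The sketch below covers this scalar special case only; the coefficients `a`, `b` and the input bound `u_max` are hypothetical, and real-valued arithmetic is used instead of bounded integers.

```python
import random


def abstract_radii(a, b, u_max, steps=1000):
    """Radii r_k of the intervals [-r_k, r_k] bounding x after k iterations."""
    radii = [0.0]                                   # x := 0
    for _ in range(steps):
        # Ell(x) := a * Ell(x) + b * Ell(u), in the scalar case
        radii.append(abs(a) * radii[-1] + abs(b) * u_max)
    return radii


def concrete_run(a, b, u_max, steps=1000, seed=0):
    """One concrete execution with a random bounded input at each iteration."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(steps):
        u = rng.uniform(-u_max, u_max)
        xs.append(a * xs[-1] + b * u)
    return xs


def never_overflows(a, b, u_max, bound, steps=1000):
    """Final check: every abstract interval stays within [-bound, bound]."""
    return max(abstract_radii(a, b, u_max, steps)) <= bound
```

With `a = 0.5`, `b = 1.0` and `u_max = 1.0` the radii increase towards 2, so any overflow bound of at least 2 is safe, and every concrete run stays inside the computed intervals.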

Abstractions for other applications: abstracted constrained optimization (Williams)

Consider the nonlinear optimization problem:
Minimize $f(x)$
subject to $g_i(x) \le 0$, $x \in R^n$.

Kuhn-Tucker conditions (assume differentiability of the function and constraints, and qualification of all these constraints): $\exists\, \mu_i \ge 0$ such that at the optimum $x^*$, $\frac{d}{dx}\left(f(x^*) + \sum_i \mu_i g_i(x^*)\right) = 0$.

Approximate analysis of optimization problems: abstraction of $y \in R$: $y \in \{-, +, 0, T\}$.

Abstractions for other applications: abstracted constrained optimization (Williams)

Abstracted Kuhn-Tucker conditions

$\exists\, \mu_i \ge 0$ such that at the optimum $x^*$, $\frac{d}{dx}\left(f(x^*) + \sum_i \mu_i g_i(x^*)\right) = 0$.

Approximate analysis of optimization problems: abstraction of $y \in R$: $y \in \{-, +, 0, T\}$.
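As a hedged illustration of what abstracting the stationarity condition to signs could look like, consider the scalar case with a single constraint g. This is a sketch written for this transcript, not a statement of Williams's method; $[\,\cdot\,]$ denotes the sign abstraction, and $\oplus$, $\otimes$ the abstract sum and product given by the rules of signs.

```latex
\[
f'(x^*) + \mu\, g'(x^*) = 0, \qquad \mu \ge 0
\]
\[
\Longrightarrow \quad 0 \in [f'(x^*)] \oplus \bigl([\mu] \otimes [g'(x^*)]\bigr),
\qquad [\mu] \in \{0, +\}
\]
\[
[\mu] = 0 \ \Rightarrow\ [f'(x^*)] = 0,
\qquad
[\mu] = + \ \Rightarrow\ [f'(x^*)] = -\,[g'(x^*)]
\]
```

So, at the sign level, either the objective is stationary at the optimum, or the derivative of the objective and the derivative of the constraint have opposite signs there.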