Precise Interprocedural Analysis using Random Interpretation Sumit Gulwani George Necula...

Precise Interprocedural Analysis using Random Interpretation

Sumit Gulwani George Necula

UC-Berkeley

2

Random Interpretation

= Random Testing + Abstract Interpretation

• Almost as simple as random testing but better soundness guarantees.

• Almost as sound as abstract interpretation but more precise, efficient, and simple.

3

Example

if (*)  a := 0; b := i;
else    a := i-2; b := 2;

if (*)  c := b - a; d := i - 2b;
else    c := 2a + b; d := b - 2i;

assert(c+d = 0); assert(c = a+i)

•Random testing needs to execute all 4 paths to verify assertions.

• Abstract interpretation analyzes statements once but uses complicated operations.

•Random interpretation simply executes program once (and captures effect of all paths).

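The random-testing bullet above can be made concrete with a small sketch that enumerates all four paths of the example program for one input. The function names `run_path` and `check_all_paths` are illustrative, and the `*` conditions are modelled as booleans; which branch is the "True" side is an assumption that does not affect the result.

```python
from itertools import product

def run_path(i, first_branch, second_branch):
    """Execute one concrete path of the example program for input i."""
    if first_branch:
        a, b = 0, i
    else:
        a, b = i - 2, 2
    if second_branch:
        c, d = b - a, i - 2 * b
    else:
        c, d = 2 * a + b, b - 2 * i
    return a, b, c, d

def check_all_paths(i):
    """Return (first assertion holds, second assertion holds) over all 4 paths."""
    first_ok = second_ok = True
    for t1, t2 in product([True, False], repeat=2):
        a, b, c, d = run_path(i, t1, t2)
        first_ok = first_ok and (c + d == 0)
        second_ok = second_ok and (c == a + i)
    return first_ok, second_ok

result_i3 = check_all_paths(3)
result_i2 = check_all_paths(2)
print(result_i3, result_i2)
```

For i=3 the first assertion holds on every path and the second fails on one path; for i=2 both hold. A single input only tests that input, which is why random testing alone gives weak guarantees here.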

4

Outline

• Framework for intraprocedural random interpretation
  – Advantages
    • Investigate all analyses using one framework
    • Design and proof of new analyses will be simpler

• A generic algorithm for interprocedural analysis

5

Outline

• Framework for intraprocedural random interpretation
  – Affine join function
  – Eval function
  – Example

• A generic algorithm for interprocedural analysis

6

Random Interpretation framework

Goal: Detect equivalences of expressions.

Generic Algorithm:

• Choose random values for input variables.

• Execute assignments.

– Using Eval function to evaluate expressions.

• Execute both branches of conditionals and combine the program states at join points.

– Using Affine Join function.

• Compare values of expressions to decide equality.

7

Affine Join function

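Used for combining program states at join points.

$\phi_w : \text{State} \times \text{State} \to \text{State}$

Let $\rho = \phi_w(\rho_1, \rho_2)$. Then,

$\rho(y) \overset{\text{def}}{=} w \times \rho_1(y) + (1-w) \times \rho_2(y)$

Example: the two branches of a conditional produce

  a := 2; b := 3;    giving  $\rho_1$: [a=2, b=3]
  a := 4; b := 1;    giving  $\rho_2$: [a=4, b=1]

$\rho = \phi_7(\rho_1, \rho_2)$: [a = 7·2 + (1-7)·4, b = 7·3 + (1-7)·1], i.e. [a=-10, b=15]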
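The definition above translates almost directly into code. This is a minimal sketch, assuming a program state is a dictionary from variable names to integers; the function name `affine_join` is illustrative.

```python
def affine_join(w, state1, state2):
    """Combine two states variable by variable: w*rho1(y) + (1-w)*rho2(y)."""
    return {y: w * state1[y] + (1 - w) * state2[y] for y in state1}

rho1 = {"a": 2, "b": 3}   # state after a := 2; b := 3;
rho2 = {"a": 4, "b": 1}   # state after a := 4; b := 1;
rho = affine_join(7, rho1, rho2)
print(rho)  # {'a': -10, 'b': 15}
```

In an actual run the weight w would be drawn at random; it is fixed to 7 here only to reproduce the numbers on the slide.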

8


Properties of Affine Join

• Affine join preserves common linear relationships e.g. a+b=5.

• It does not introduce false relationships w.h.p.

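Example:

  a := 2; b := 3;    giving  $\rho_1$: [a=2, b=3]
  a := 4; b := 1;    giving  $\rho_2$: [a=4, b=1]

$\rho = \phi_7(\rho_1, \rho_2)$: [a = 7·2 + (1-7)·4, b = 7·3 + (1-7)·1], i.e. [a=-10, b=15], which still satisfies a+b=5.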
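Both properties can be read off from the definition of the affine join. The derivation below is a sketch for a single linear relationship $\sum_y c_y\,y = c$; the coefficient names $c_y$, $c$ and the per-state values $c_1$, $c_2$ are introduced here for illustration only.

```latex
\sum_y c_y\,\rho(y)
  \;=\; w \sum_y c_y\,\rho_1(y) \;+\; (1-w) \sum_y c_y\,\rho_2(y)
  \;=\; w\,c_1 + (1-w)\,c_2
```

If the relationship holds in both states ($c_1 = c_2 = c$), the right-hand side is $c$ for every $w$, so the relationship is preserved. Otherwise $w\,c_1 + (1-w)\,c_2 = c$ has at most one solution in $w$, so a weight chosen uniformly from a set of size $q$ makes the false relationship appear with probability at most $1/q$. With the states above, $a+b$ gives $c_1 = c_2 = 5$, while $a-b$ gives $c_1 = -1$, $c_2 = 3$ and the joined value $7\cdot(-1) + (1-7)\cdot 3 = -25$, matching $-10 - 15$.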

9

Eval function

Eval: Expression × State → Value

• Used for executing expressions
• Defined in terms of Poly: Expression → Polynomial
• Poly is abstraction specific

Eval(e, ρ) = Evaluation of Poly(e) using ρ and random choices for non-program variables

Poly must satisfy:

• Correctness: Poly(e1) = Poly(e2) iff e1 = e2

• Linearity: Poly(e) is linear in program variables.

10

Example of Poly function

• Linear Arithmetic (POPL 2003)
  Expression e ::= y | e1 ± e2 | c·e
  Poly(e) = e

• Uninterpreted Functions (POPL 2004)
  Expression e ::= y | F(e)
  Poly(y) = y
  Poly(F(e)) = a × Poly(e) + b
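A rough sketch of Eval for the two Poly definitions above follows. Expressions are encoded as nested tuples, which is an encoding chosen here, not one from the slides. The sketch also assumes that a and b are non-program variables given one random pair per function symbol; that reading, the name `make_eval`, and the parameters `seed` and `q` are all assumptions.

```python
import random

def make_eval(seed=0, q=2**31 - 1):
    """Build an Eval function; random choices for a, b are fixed per function symbol."""
    rng = random.Random(seed)
    coeffs = {}  # function symbol -> (a, b), drawn lazily

    def eval_expr(e, state):
        kind = e[0]
        if kind == "var":                      # Poly(y) = y
            return state[e[1]]
        if kind == "add":                      # e1 + e2
            return eval_expr(e[1], state) + eval_expr(e[2], state)
        if kind == "sub":                      # e1 - e2
            return eval_expr(e[1], state) - eval_expr(e[2], state)
        if kind == "scale":                    # c * e
            return e[1] * eval_expr(e[2], state)
        if kind == "app":                      # Poly(F(e)) = a * Poly(e) + b
            if e[1] not in coeffs:
                coeffs[e[1]] = (rng.randrange(1, q), rng.randrange(q))
            a, b = coeffs[e[1]]
            return a * eval_expr(e[2], state) + b
        raise ValueError(f"unknown expression kind: {kind}")

    return eval_expr

ev = make_eval(seed=1)
state = {"x": 4, "y": 4, "z": 9, "a": -4, "b": 7}
linear_value = ev(("sub", ("var", "b"), ("var", "a")), state)   # b - a
same = ev(("app", "F", ("var", "x")), state) == ev(("app", "F", ("var", "y")), state)
differ = ev(("app", "F", ("var", "x")), state) != ev(("app", "F", ("var", "z")), state)
```

Because a is drawn nonzero, F(x) and F(z) evaluate differently whenever x and z hold different values, while equal arguments always give equal results.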

Example: Random Interpretation for Linear Arithmetic

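Input: i=3 (random choice); join weights w1 = 5, w2 = 2

First conditional (*):
  a := 0; b := i;        gives  i=3, a=0, b=3
  a := i-2; b := 2;      gives  i=3, a=1, b=2
  join with w1 = 5:             i=3, a=-4, b=7

Second conditional (*):
  c := b - a; d := i - 2b;     gives  i=3, a=-4, b=7, c=11, d=-11
  c := 2a + b; d := b - 2i;    gives  i=3, a=-4, b=7, c=-1, d=1
  join with w2 = 2:                   i=3, a=-4, b=7, c=23, d=-23

assert (c+d = 0);    holds: 23 + (-23) = 0
assert (c = a+i)     fails: 23 ≠ -4 + 3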
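The run on this slide can be replayed in a few lines. This is a sketch under the assumption that states are dictionaries and that the join weight multiplies the first-listed branch, which is the reading that reproduces the slide's numbers. The names `affine_join` and `random_interpret` are illustrative, and the weights are passed in rather than drawn at random so the output matches the slide.

```python
def affine_join(w, s1, s2):
    """Variable-wise affine combination w*s1 + (1-w)*s2 of two states."""
    return {y: w * s1[y] + (1 - w) * s2[y] for y in s1}

def random_interpret(i, w1, w2):
    """Execute both branches of each conditional and join the resulting states."""
    start = {"i": i}
    # First conditional: a := 0; b := i   versus   a := i-2; b := 2
    s = affine_join(w1, dict(start, a=0, b=i), dict(start, a=i - 2, b=2))
    # Second conditional: c := b-a; d := i-2b   versus   c := 2a+b; d := b-2i
    left = dict(s, c=s["b"] - s["a"], d=s["i"] - 2 * s["b"])
    right = dict(s, c=2 * s["a"] + s["b"], d=s["b"] - 2 * s["i"])
    return affine_join(w2, left, right)

final = random_interpret(i=3, w1=5, w2=2)
first_assertion = final["c"] + final["d"] == 0
second_assertion = final["c"] == final["a"] + final["i"]
print(final, first_assertion, second_assertion)
```

One execution decides both assertions: the first is reported true and the second false, without enumerating the four paths.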

12

Outline

• Framework for intraprocedural random interpretation
  – Affine join function
  – Eval function
  – Example

• A generic algorithm for interprocedural analysis
  – Random summary (Idea #1)
  – Issue of freshness (Idea #2)
  – Error probability and complexity
  – Experiments

Example

(Same run as in the linear arithmetic example: i=3, w1 = 5, w2 = 2, final state i=3, a=-4, b=7, c=23, d=-23.)

• The second assertion is true in the context i=2.

•We need two new ideas to make the analysis interprocedural.

Idea #1: Keep input variables symbolic

• Do not choose random values for input variables (to later instantiate by any context).

• Resulting program state at the end is a random summary.

Input i is kept symbolic; join weights w1 = 5, w2 = 2

First conditional (*):
  a := 0; b := i;        gives  a=0, b=i
  a := i-2; b := 2;      gives  a=i-2, b=2
  join with w1 = 5:             a=8-4i, b=5i-8

Second conditional (*):
  c := b - a; d := i - 2b;     gives  a=8-4i, b=5i-8, c=9i-16, d=16-9i
  c := 2a + b; d := b - 2i;    gives  a=8-4i, b=5i-8, c=8-3i, d=3i-8
  join with w2 = 2:                   a=8-4i, b=5i-8, c=21i-40, d=40-21i

Instantiating the summary in the context i=2: a=0, b=2, c=2, d=-2
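Keeping the input symbolic means every value becomes a linear polynomial in i. The sketch below represents such a polynomial as a pair `(coef, const)` meaning coef·i + const; this representation and all helper names are assumptions made for illustration.

```python
# A linear polynomial coef*i + const is stored as the tuple (coef, const).
I = (1, 0)                      # the symbolic input i

def const(k):
    return (0, k)

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def scale(k, p):
    return (k * p[0], k * p[1])

def sub(p, q):
    return add(p, scale(-1, q))

def join(w, p, q):
    """Affine join of two symbolic values: w*p + (1-w)*q."""
    return add(scale(w, p), scale(1 - w, q))

def at(p, i):
    """Instantiate a symbolic value in the context where the input equals i."""
    return p[0] * i + p[1]

def summarize(w1, w2):
    """Random summary of the example program, with weights supplied by the caller."""
    a = join(w1, const(0), sub(I, const(2)))
    b = join(w1, I, const(2))
    c = join(w2, sub(b, a), add(scale(2, a), b))
    d = join(w2, sub(I, scale(2, b)), sub(b, scale(2, I)))
    return {"a": a, "b": b, "c": c, "d": d}

summary = summarize(w1=5, w2=2)
in_context_2 = {v: at(p, 2) for v, p in summary.items()}
print(summary, in_context_2)
```

The summary is computed once and can then be instantiated for any calling context; in the context i=2 it yields c = a + i, which is why the second assertion holds there.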

15

Idea #2: Generate fresh summaries

Procedure P
Input: i

  if (*)  x := i+1;      gives  x = i+1
  else    x := 3;        gives  x = 3
  join with w = 5:              x = 5i-7
  return x;

Procedure Q

  u := P(2); v := P(1); w := P(1);
  Assert (u = 3);
  Assert (v = w);

Plugging the summary x = 5i-7 into each call:

  u = 5·2 - 7 = 3
  v = 5·1 - 7 = -2
  w = 5·1 - 7 = -2

• Plugging the same summary twice is unsound: Assert (v = w) appears to hold, although P(1) may return either 2 or 3.

• Fresh summaries can be generated by random affine combination of few independent summaries!
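The unsoundness is easy to reproduce. The sketch below hard-codes the single random summary of P obtained with join weight 5, namely 5i - 7, and plugs it into all three calls in Q; the function name `summary_P` is illustrative.

```python
def summary_P(i, w=5):
    """One random summary of P: affine join of the branches x := i+1 and x := 3."""
    return w * (i + 1) + (1 - w) * 3      # equals 5*i - 7 when w = 5

u = summary_P(2)
v = summary_P(1)
w_val = summary_P(1)

# What P(1) can really return, depending on the branch taken.
actual_results_P1 = {1 + 1, 3}

spurious_equality = (v == w_val)          # the analysis would accept Assert (v = w)
print(u, v, w_val, spurious_equality, actual_results_P1)
```

Both calls P(1) receive the same random value, so v = w looks like a fact even though two concrete calls can return different results.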

16

Generating 2 random summaries for P

Procedure P
Input: i

  if (*)  x := i+1;      gives  x = [i+1, i+1]
  else    x := 3;        gives  x = [3, 3]
  join with w = [5, -2]:        x = [5i-7, 7-2i]
  return x;

Fresh summaries of P, obtained as affine combinations of the two summaries:

  x = $\phi_7(5i-7,\, 7-2i)$ = 47i-91
  x = $\phi_6(5i-7,\, 7-2i)$ = 40i-77
  x = $\phi_3(5i-7,\, 7-2i)$ = 19i-35
  x = $\phi_0(5i-7,\, 7-2i)$ = 7-2i
  x = $\phi_5(5i-7,\, 7-2i)$ = 33i-63
  x = $\phi_1(5i-7,\, 7-2i)$ = 5i-7

Procedure Q calls P 3 times. Hence, generating 2 random summaries for Q requires 2×3 fresh summaries of P.

17

Generating 2 random summaries for Q

Procedure Q

  u := P(2); v := P(1); w := P(1);
  Assert (u = 3);
  Assert (v = w);

Each call to P uses its own pair of fresh summaries:

  u:  $\phi_7(5i-7,\, 7-2i)$ = 47i-91,   $\phi_6(5i-7,\, 7-2i)$ = 40i-77
  v:  $\phi_3(5i-7,\, 7-2i)$ = 19i-35,   $\phi_0(5i-7,\, 7-2i)$ = 7-2i
  w:  $\phi_5(5i-7,\, 7-2i)$ = 33i-63,   $\phi_1(5i-7,\, 7-2i)$ = 5i-7

  u = [47·2-91, 40·2-77] = [3, 3]
  v = [19·1-35, 7-2·1] = [-16, 5]
  w = [33·1-63, 5·1-7] = [-30, -2]
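The fresh-summary construction can be sketched as follows. The two base summaries of P are stored as `(coef, const)` pairs for coef·i + const, and the weights 7, 6, 3, 0, 5, 1 are the ones that reproduce the six listed summaries (19i-35 corresponds to weight 3). The helper names are illustrative.

```python
BASE = [(5, -7), (-2, 7)]       # the two random summaries of P: 5i-7 and 7-2i

def fresh(w):
    """Fresh summary of P: w*BASE[0] + (1-w)*BASE[1], as a (coef, const) pair."""
    (c1, k1), (c2, k2) = BASE
    return (w * c1 + (1 - w) * c2, w * k1 + (1 - w) * k2)

def at(p, i):
    """Instantiate a summary for the actual argument i."""
    return p[0] * i + p[1]

# One pair of weights per call site in Q (one weight for each of Q's 2 summaries).
weights = {"u": (7, 6), "v": (3, 0), "w": (5, 1)}
args = {"u": 2, "v": 1, "w": 1}

values = {
    name: [at(fresh(wt), args[name]) for wt in weights[name]]
    for name in weights
}
print(values)
```

Under both summaries of Q, u equals 3, while v and w now differ, so Assert (u = 3) is accepted and Assert (v = w) is rejected.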

18

Loops and Fixed point computation

• In presence of loops (in procedures and call-graphs), fixed point computation is required.

• The number of iterations required to reach fixed point is $k_v(2k_I+1) + 1$

$k_v$: # of visible variables

$k_I$: # of input variables

19

Error Probability and Complexity

Time Complexity = $n\,k_V\,k_I^2\,t$

Error probability = $1/q^{t-m}$

n: size of program
$k_V$, $k_I$: # of visible and input variables
t: # of random summaries
q: size of set from which random values are chosen
m: $k_I k_V$ (generic bound)
   $k_I + k_V$ (for linear arithmetic)
   4 (for unary uninterpreted functions)
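To get a feel for these bounds, the sketch below evaluates them for sample parameters. The function names, the abstraction labels, and the sample values of q, t, k_I and k_V are hypothetical; the formulas are the ones stated above, read as $1/q^{t-m}$ and $n\,k_V\,k_I^2\,t$.

```python
from fractions import Fraction

def m_bound(k_i, k_v, abstraction="generic"):
    """Value of m for the given abstraction, as listed on the slide."""
    if abstraction == "generic":
        return k_i * k_v
    if abstraction == "linear":
        return k_i + k_v
    if abstraction == "unary_uf":
        return 4
    raise ValueError(f"unknown abstraction: {abstraction}")

def error_probability(q, t, m):
    """Bound 1/q^(t-m); it only says something when t exceeds m."""
    if t <= m:
        raise ValueError("need t > m for a bound below 1")
    return Fraction(1, q ** (t - m))

def time_cost(n, k_v, k_i, t):
    """The product n * k_V * k_I^2 * t from the complexity formula."""
    return n * k_v * k_i ** 2 * t

# Hypothetical parameters: 1 input variable, 4 visible variables, 32-bit random values.
m_lin = m_bound(k_i=1, k_v=4, abstraction="linear")
p_err = error_probability(q=2**32, t=m_lin + 1, m=m_lin)
cost = time_cost(n=1000, k_v=4, k_i=1, t=m_lin + 1)
print(m_lin, p_err, cost)
```

Each summary beyond m divides the error bound by q, while the cost grows only linearly in t.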

20

Related Work

• Intraprocedural random interpretation
  – Linear arithmetic (POPL 03)
  – Uninterpreted functions (POPL 04)

• Interprocedural dataflow analysis (POPL 95, TCS 96)
  – Sagiv, Reps, Horwitz
  – Cons: simpler properties, e.g. liveness, linear constants
  – Pro: better computational complexity

• Interprocedural linear arithmetic (POPL 04)
  – Müller-Olm, Seidl
  – Cons: $O(k^2)$ times slower
  – Pro: works for non-linear relationships too

22

Experiments

                  Random Inter (this paper)    Random Intra (POPL 2003)    Det Inter (TCS 96)
Prog     Lines    Inp     Var      Time        Δ(Var)    Speedup           Δ(Inp)    Speedup
go       29K      63      1700     47          170       107               17        1.9
ijpeg    28K      31      825      4           34        24                3         2.3
li       23K      53      392      34          160       756               20        1.3
gzip     8K       49      525      2           200       39                6         2.0

• Inp: # of input variables that were constants
• Var: # of local variables that were constants
• Δ(Var): # of fewer local variable constants discovered

Random Inter discovers 10-70% more facts; Random Intra is faster by 10-500 times; Det Inter is faster by 2 times.

23

Conclusion

• Randomization buys efficiency and simplicity at the cost of probabilistic soundness.

• Combining randomized techniques with symbolic techniques is powerful.
