Automating Construction of Provably Correct Software
Viktor Kuncak
EPFL School of Computer and Communication Sciences
Laboratory for Automated Reasoning and Analysis
http://lara.epfl.ch
Your Wish is my Command
[Diagram: wish → (requirement formalization) → specification (constraint): C → (this talk) → implementation (program): p → (conventional compilation) → command: 11011001 01011101 11011001 01011101 11011001 01011101 11011001 01011101]
This Talk: How to automatically transform specifications into implementations?
Example Wish: Sorting
wish: Given a list of numbers, make this list sorted
input: 8900, 6000, 24140, 2900 (here 8900 > 6000)
output: 2900, 6000, 8900, 24140 (2900 < 6000, 6000 < 8900, 8900 < 24140)
Sorting Specification as a Program
wish: Given a list of numbers, make this list sorted
def sort_spec(input : List, output : List) : Boolean = content(output) == content(input) && isSorted(output)
Specification (for us) is a program that checks, for a given input, whether the given output is acceptable
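A minimal runnable sketch of such a checker, assuming plain Scala `List[Int]` in place of the slides' `List` type; the name `sortSpec` and the use of `toSet` for `content` are choices made for this illustration.

```scala
// Assumption: lists are standard Scala List[Int]; content is a Set, as on the slides.
def content(lst: List[Int]): Set[Int] = lst.toSet

// Strictly increasing order, matching the x < y test used on the slides.
def isSorted(lst: List[Int]): Boolean = lst match {
  case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
  case _                      => true
}

// The specification is itself a program: it accepts or rejects an (input, output) pair.
def sortSpec(input: List[Int], output: List[Int]): Boolean =
  content(output) == content(input) && isSorted(output)
```

The checker says nothing about how the output is computed; it only decides whether a proposed output is acceptable.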
Specification vs Implementation
specification: constraint on the output, returns true / false (more behaviors)
def C(i : List, o : List) : Boolean = content(o)==content(i) /\ isSorted(o)
implementation: function that computes the output (fewer behaviors)
def p(i : List) : List = sort i using a sorting algorithm and return the result
p ⊆ C
Synthesizing Sort in Leon System
Ivan Kuraj, Philippe Suter, Etienne Kneuss
OOPSLA 2013: Synthesis Modulo Recursive Functions
http://leon.epfl.ch
Example Results
Techniques used:
– Leon’s verification capabilities
– synthesis for theory of trees
– recursion schemas
– case splitting
– symbolic exploration of the space of programs
– synthesis based on type inhabitation
– fast falsification using previous counterexamples
– learning conditional expressions
– cost-based search over possible synthesis steps
Approaches and Their Guarantees
both specification C and program p are given:
a) (run-time) Check assertion while program p runs: C(i,p(i))
b) (compile-time) Verify whether program always meets the spec: ∀i. C(i,p(i))
only specification C is given:
c) (run-time) Constraint programming: once i is known, find o to satisfy a given constraint: find o such that C(i,o)
d) (compile-time) Synthesis: solve C symbolically to obtain program p that is correct by construction, for all inputs: find p such that ∀i. C(i,p(i)), i.e. p ⊆ C
Runtime Assertion Checking
a) Check assertion while program p runs: C(i,p(i))
def content(lst : List) = lst match {
  case Nil() => Set.empty
  case Cons(x, xs) => Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() => true
  case Cons(_, Nil()) => true
  case Cons(x, Cons(y, ys)) => x < y && isSorted(Cons(y, ys))
}
def p(i : List) : List = {
  sort i using a sorting algorithm and return the result
} ensuring (o => content(i) == content(o) && isSorted(o))
Already works in Scala!
Key design decision: constraints are programs
Ongoing: high-level optimization of run-time checks
Can we give stronger guarantees? Prove the postcondition always true
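A runnable sketch of the run-time check in standard Scala, where `ensuring` comes from `Predef` and throws an `AssertionError` if the postcondition fails. The insertion sort below is only one possible body for `p`; dropping duplicates is an assumption made here so that the strict `x < y` ordering can hold.

```scala
def content(lst: List[Int]): Set[Int] = lst.toSet

def isSorted(lst: List[Int]): Boolean = lst match {
  case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
  case _                      => true
}

// Insert x into an already sorted, duplicate-free list.
def insert(x: Int, sorted: List[Int]): List[Int] = sorted match {
  case Nil              => List(x)
  case y :: _ if x < y  => x :: sorted
  case y :: _ if x == y => sorted // assumption: duplicates are dropped
  case y :: ys          => y :: insert(x, ys)
}

// The postcondition is checked every time p returns.
def p(i: List[Int]): List[Int] = {
  i.foldLeft(List.empty[Int])((acc, x) => insert(x, acc))
} ensuring (o => content(i) == content(o) && isSorted(o))
```

The check costs one pass over the result per call, and it only covers the inputs that actually occur at run time.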
Static Verification in Leon
b) Verify that program always meets spec: ∀i. C(i,p(i))
def content(lst : List) = lst match {
  case Nil() => Set.empty
  case Cons(x, xs) => Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() => true
  case Cons(_, Nil()) => true
  case Cons(x, Cons(y, ys)) => x < y && isSorted(Cons(y, ys))
}
def p(i : List) : List = {
  sort i using a sorting algorithm and return the result
} ensuring (o => content(i) == content(o) && isSorted(o))
Type in a Scala program and spec, see it verified
Possible outcomes:
– proof of ∀i. C(i,p(i))
– input i such that not C(i,p(i))
– timeout
Insertion Sort Verified as You Type It
Web interface: http://lara.epfl.ch/leon
Reported Counterexample in Case of a Bug
Using Assertions to Compute
• what to do when an assertion fails?
  – presumably some values are wrong
  – what to change (e.g. in repair)
• alternative: leave some variables unknown (logical variables); find their values to satisfy the assertions: constraint programming
• like CLP, but
  – richer constraints
  – new compilation techniques (synthesis)
  – embedded in Scala, no Prolog with "cut"
Programming with Specifications
c) Constraint programming: find a value that satisfies a given constraint: find o such that C(i,o)
Method: use verification technology, try to prove that no such o exists, report counter-examples!
Philippe Suter, Ali Sinan Köksal, Etienne Kneuss
Sorting a List Using Specifications
def content(lst : List) = lst match {
  case Nil() => Set.empty
  case Cons(x, xs) => Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() => true
  case Cons(_, Nil()) => true
  case Cons(x, Cons(y, ys)) => x < y && isSorted(Cons(y, ys))
}
((l : List) => isSorted(l) && content(l) == Set(0, 1, -3)).solve
> Cons(-3, Cons(0, Cons(1, Nil())))
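To make "find o such that C(i,o)" concrete, here is a deliberately naive stand-in for `.solve`: it enumerates candidate lists over a small finite domain and returns the first one the constraint accepts. This is not how the system on the slides works (it uses verification technology); the helper names, the domain, and the length bound are all assumptions of this sketch.

```scala
def content(lst: List[Int]): Set[Int] = lst.toSet

def isSorted(lst: List[Int]): Boolean = lst match {
  case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
  case _                      => true
}

// All lists of exactly n elements drawn from the domain.
def listsOfLength(domain: List[Int], n: Int): List[List[Int]] =
  if (n == 0) List(Nil)
  else for { x <- domain; rest <- listsOfLength(domain, n - 1) } yield x :: rest

// Hypothetical bounded solver: shortest candidates first, first match wins.
def solve(domain: List[Int], maxLen: Int)(c: List[Int] => Boolean): Option[List[Int]] =
  (0 to maxLen).iterator.flatMap(n => listsOfLength(domain, n)).find(c)

val result: Option[List[Int]] =
  solve(List(-3, -2, -1, 0, 1, 2, 3), 3)(l => isSorted(l) && content(l) == Set(0, 1, -3))
```

Enumeration grows exponentially with the length bound, which is the reason for using a symbolic solver instead.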
Comparison: Date Conversion in C
Knowing number of days since 1980, find current year and day
BOOL ConvertDays(UINT32 days) {
  year = 1980;
  while (days > 365) {
    if (IsLeapYear(year)) {
      if (days > 366) {
        days -= 366;
        year += 1;
      }
    } else {
      days -= 365;
      year += 1;
    }
  ...
}
Enter December 31, 2008
All music players (of a major brand) froze in the boot sequence.
Date Conversion using Specifications
Knowing number of days since 1980, find current year and day
val origin = 1980 // beginning of the universe
def leapsTill(y : Int) = (y-1)/4 - (y-1)/100 + (y-1)/400
val (year, day) = choose((year : Int, day : Int) => {
  days == (year-origin)*365 + leapsTill(year) - leapsTill(origin) + day &&
  0 < day && day <= 366
}) // Choose year and day such that the property holds.
• We did not write how to compute year and day
• Instead, we gave a constraint they should satisfy
• We defined them implicitly, through this constraint
• More freedom (can still do it the old way, if needed)
• Correctness, termination simpler than with loop
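The constraint can be tried out with a bounded search standing in for `choose`. This is a sketch only: `convertDays` and its search bounds are hypothetical, and a real synthesizer would produce closed-form code rather than a search.

```scala
val origin = 1980

def leapsTill(y: Int): Int = (y - 1) / 4 - (y - 1) / 100 + (y - 1) / 400

// The constraint from the slide, as an executable predicate.
def spec(days: Int)(year: Int, day: Int): Boolean =
  days == (year - origin) * 365 + leapsTill(year) - leapsTill(origin) + day &&
    0 < day && day <= 366

// Hypothetical stand-in for `choose`: the first (year, day) that satisfies the spec.
def convertDays(days: Int): Option[(Int, Int)] = {
  val candidates = for {
    year <- (origin to origin + days / 365 + 1).iterator
    day  <- (1 to 366).iterator
    if spec(days)(year, day)
  } yield (year, day)
  candidates.nextOption()
}
```

The search is finite for every input, including `days = 366`, the value on which the C loop above never terminates. Note that the constraint as written allows `day = 366` in every year, so some inputs admit more than one solution; this sketch returns the one with the earliest year.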
invariants - specification
Implementation: next 30 pages
Formalizing Tree Invariants (in Scala)
sealed abstract class Tree
case class Empty() extends Tree
case class Node(color: Color, left: Tree, value: Int, right: Tree) extends Tree
def blackBalanced(t : Tree) : Boolean = t match {
  case Node(_,l,_,r) => blackBalanced(l) && blackBalanced(r) && blackHeight(l) == blackHeight(r)
  case Empty() => true
}
def blackHeight(t : Tree) : Int = t match {
  case Empty() => 1
  case Node(Black(), l, _, _) => blackHeight(l) + 1
  case Node(Red(), l, _, _) => blackHeight(l)
}
def rb(t: Tree) : Boolean = t match {
  case Empty() => true
  case Node(Black(), l, _, r) => rb(l) && rb(r)
  case Node(Red(), l, _, r) => isBlack(l) && isBlack(r) && rb(l) && rb(r)
}
...
def isSorted(t:Tree) = ...
Define Abstraction as ‘tree fold’
def content(t: Tree) : Set[Int] = t match {
  case Empty() => Set.empty
  case Node(_, l, v, r) => content(l) ++ Set(v) ++ content(r)
}
[Figure: tree with root 7, children 4 and 9, and 4's children 2 and 5; its content is { 2, 4, 5, 7, 9 }]
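The abstraction can be run on the tree from the figure. The node colors below are arbitrary (the figure's colors are not recoverable from this transcript), and `Color`, `Red`, `Black` are declared here only to make the sketch self-contained.

```scala
sealed abstract class Color
case class Red() extends Color
case class Black() extends Color

sealed abstract class Tree
case class Empty() extends Tree
case class Node(color: Color, left: Tree, value: Int, right: Tree) extends Tree

// Fold the tree into the set of values it stores.
def content(t: Tree): Set[Int] = t match {
  case Empty()          => Set.empty
  case Node(_, l, v, r) => content(l) ++ Set(v) ++ content(r)
}

// Assumption: colors chosen arbitrarily for illustration.
def leaf(v: Int): Tree = Node(Black(), Empty(), v, Empty())
val example: Tree = Node(Black(), Node(Red(), leaf(2), 4, leaf(5)), 7, leaf(9))
```

Two trees with different shapes or colorings have the same `content` when they store the same values, which is what lets the specification ignore balancing details.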
We can now define insertion
def insert(x : Int, t : Tree) = choose((t1 : Tree) => isRBT(t1) && content(t1) == content(t) ++ Set(x))
Objection: it took a lot of effort to write isRBT
Answer:
• no more effort than implementation - wrote some functions
• these invariants are what drives data structure design
• this is how things are explained in a textbook
• it promotes reuse!
Evolving the Program
Suppose we have a red-black tree implementation
We only implemented ‘insert’ and ‘lookup’
Now we also need to implement ‘remove’
void RBDelete(rb_red_blk_tree* tree, rb_red_blk_node* z){
  rb_red_blk_node* y;
  rb_red_blk_node* x;
  rb_red_blk_node* nil=tree->nil;
  rb_red_blk_node* root=tree->root;
  y= ((z->left == nil) || (z->right == nil)) ? z : TreeSuccessor(tree,z);
  x= (y->left == nil) ? y->right : y->left;
  if (root == (x->parent = y->parent)) { /* assignment of y->p to x->p is intentional */
    root->left=x;
  } else {
    if (y == y->parent->left) {
      y->parent->left=x;
    } else {
      y->parent->right=x;
    }
  }
  if (y != z) { /* y should not be nil in this case */
#ifdef DEBUG_ASSERT
    Assert( (y!=tree->nil),"y is nil in RBDelete\n");
#endif
    /* y is the node to splice out and x is its child */
    if (!(y->red)) RBDeleteFixUp(tree,x);
    tree->DestroyKey(z->key);
    tree->DestroyInfo(z->info);
    y->left=z->left;
    y->right=z->right;
    y->parent=z->parent;
    y->red=z->red;
    z->left->parent=z->right->parent=y;
    if (z == z->parent->left) {
      z->parent->left=y;
    } else {
      z->parent->right=y;
    }
    free(z);
  } else {
    tree->DestroyKey(y->key);
    tree->DestroyInfo(y->info);
    if (!(y->red)) RBDeleteFixUp(tree,x);
    free(y);
  }
#ifdef DEBUG_ASSERT
  Assert(!tree->nil->red,"nil not black in RBDelete");
#endif
}
void RBDeleteFixUp(rb_red_blk_tree* tree, rb_red_blk_node* x) {
  rb_red_blk_node* root=tree->root->left;
  rb_red_blk_node* w;
  while( (!x->red) && (root != x)) {
    if (x == x->parent->left) {
      w=x->parent->right;
      if (w->red) {
        w->red=0;
        x->parent->red=1;
        LeftRotate(tree,x->parent);
        w=x->parent->right;
      }
      if ( (!w->right->red) && (!w->left->red) ) {
        w->red=1;
        x=x->parent;
      } else {
        if (!w->right->red) {
          w->left->red=0;
          w->red=1;
          RightRotate(tree,w);
          w=x->parent->right;
        }
        w->red=x->parent->red;
        x->parent->red=0;
        w->right->red=0;
        LeftRotate(tree,x->parent);
        x=root; /* this is to exit while loop */
      }
    } else { /* the code below is has left and right switched from above */
      w=x->parent->left;
      if (w->red) {
        w->red=0;
        x->parent->red=1;
        RightRotate(tree,x->parent);
        w=x->parent->left;
      }
      if ( (!w->right->red) && (!w->left->red) ) {
        w->red=1;
        x=x->parent;
      } else {
        if (!w->left->red) {
          w->right->red=0;
          w->red=1;
          LeftRotate(tree,w);
          w=x->parent->left;
        }
        w->red=x->parent->red;
        x->parent->red=0;
        w->left->red=0;
        RightRotate(tree,x->parent);
        x=root; /* this is to exit while loop */
      }
    }
  }
  x->red=0;
#ifdef DEBUG_ASSERT
  Assert(!tree->nil->red,"nil not black in RBDeleteFixUp");
#endif
}
140 lines of tricky C, even reusing existing functions
remove using specifications: 2 lines
def remove(x : Int, t : Tree) = choose((t1 : Tree) => isRBT(t1) && content(t1) == content(t) -- Set(x))
The biggest expected payoff: properties are more reusable
Further Features Supported
• computing a minimal / maximal solution of a constraint using binary search
• on-the-fly construction of constraints
  – first-class constraints, like first-class functions
  – but they can also be syntactically manipulated
• enumeration of all values that satisfy a constraint
  – application in automated test input generation
  – can be used like the Korat tool for test generation
Implicit Programming (ERC project)
specification (constraint), implicit: x² + y² = 1
implementation (function), explicit: y = sqrt(1 - x²)
synthesis: from the implicit specification to an explicit implementation
Example: compute the missing part of a satisfying assignment (SAT)
– i is an assignment for some variables of a propositional formula
– o is its completion that makes the formula true
Synthesis for Linear Arithmetic
Specification (close to a wish; the output variables could be inferred from types):
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
  choose((h: Int, m: Int, s: Int) => (
    h * 3600 + m * 60 + s == totalSeconds &&
    h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60
  ))
Synthesized implementation:
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
  val t1 = totalSeconds div 3600
  val t2 = totalSeconds - 3600 * t1
  val t3 = t2 div 60
  val t4 = totalSeconds - 3600 * t1 - 60 * t3
  (t1, t3, t4)
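The synthesized code can be checked against its specification in plain Scala. In this sketch `div` is written as `/`, which agrees with it for the non-negative inputs assumed by the `require`; the `spec` helper is introduced here for the check and is not part of the slides.

```scala
// The synthesis predicate as an executable check.
def spec(totalSeconds: Int)(h: Int, m: Int, s: Int): Boolean =
  h * 3600 + m * 60 + s == totalSeconds &&
    h >= 0 && m >= 0 && m < 60 && s >= 0 && s < 60

// Straight-line code in the shape shown on the slide.
def secondsToTime(totalSeconds: Int): (Int, Int, Int) = {
  require(totalSeconds >= 0) // assumption: only non-negative inputs
  val t1 = totalSeconds / 3600
  val t2 = totalSeconds - 3600 * t1
  val t3 = t2 / 60
  val t4 = totalSeconds - 3600 * t1 - 60 * t3
  (t1, t3, t4)
}
```

For example, `secondsToTime(3661)` yields `(1, 1, 1)`, and the result satisfies `spec` for every input in a tested range.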
Compile-time warnings
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
  choose((h: Int, m: Int, s: Int) => (
    h * 3600 + m * 60 + s == totalSeconds &&
    h ≥ 0 && m ≥ 0 && m ≤ 60 && s ≥ 0 && s ≤ 60
  ))
Warning: Synthesis predicate has multiple solutions for variable assignment: totalSeconds = 60
Solution 1: h = 0, m = 0, s = 60
Solution 2: h = 0, m = 1, s = 0
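The ambiguity behind the warning can be reproduced by brute force. This sketch assumes the faulty predicate accepts `s ≤ 60` (which the reported solution `s = 60` requires) and `m ≤ 60`; `relaxedSpec` and `solutions` are names made up for the illustration, and the real tool finds such inputs symbolically at compile time.

```scala
// Assumption: the off-by-one bounds are m <= 60 and s <= 60.
def relaxedSpec(totalSeconds: Int)(h: Int, m: Int, s: Int): Boolean =
  h * 3600 + m * 60 + s == totalSeconds &&
    h >= 0 && m >= 0 && m <= 60 && s >= 0 && s <= 60

// Enumerate every (h, m, s) within the bounds that satisfies the predicate.
def solutions(totalSeconds: Int): List[(Int, Int, Int)] =
  for {
    h <- (0 to totalSeconds / 3600).toList
    m <- (0 to 60).toList
    s <- (0 to 60).toList
    if relaxedSpec(totalSeconds)(h, m, s)
  } yield (h, m, s)
```

`solutions(60)` contains exactly the two triples named in the warning, while an input such as 59 still has a single solution.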
Synthesis for sets (BAPA)
Specification:
def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =
  choose((a: Set[T], b: Set[T]) => (
    a.size - b.size ≤ 1 && b.size - a.size ≤ 1 &&
    a union b == s && a intersect b == empty
  ))
The size constraints say "balanced", the union and intersection constraints say "partition": we can conjoin specs.
Synthesized implementation:
def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =
  val k = ((s.size + 1)/2).floor
  val t1 = k
  val t2 = s.size - k
  val s1 = take(t1, s)
  val s2 = take(t2, s minus s1)
  (s1, s2)
[Figure: set s split into two parts a and b]
Philippe Suter, Ruzica Piskac, Mikael Mayer
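A runnable counterpart using the standard library, as a sketch: the slide's `take(n, s)` is assumed to correspond to `s.take(n)` and `minus` to `--`. Which elements land in which half depends on the set's iteration order.

```scala
def splitBalanced[T](s: Set[T]): (Set[T], Set[T]) = {
  val k  = (s.size + 1) / 2          // integer division already rounds down
  val s1 = s.take(k)                 // any k elements of s
  val s2 = (s -- s1).take(s.size - k)
  (s1, s2)
}

// The specification, as a check on a proposed split.
def isBalancedPartition[T](s: Set[T], a: Set[T], b: Set[T]): Boolean =
  a.size - b.size <= 1 && b.size - a.size <= 1 &&
    (a union b) == s && (a intersect b).isEmpty
```

The check `isBalancedPartition` is the conjunction of the two specs, so it can be reused to validate any other splitting strategy.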
Synthesis for Theories
3 i + 2 o = 13 is solved by o = (13 – 3 i)/2, under the precondition assert(i % 2 == 1)
• Wanted: "Gaussian elimination" for programs
  – for linear integer equations: extended Euclid’s algorithm
  – need to handle disjunctions, negations, more data types
• For every formula in e.g. Presburger arithmetic
  – synthesis algorithm terminates
  – produces the most general precondition (assertion characterizing when the result exists)
  – generated code always terminates and gives correct result
• If there are multiple or no solutions for some input parameters, the algorithm identifies those inputs
• Works not only for arithmetic but also for e.g. sets with sizes and for trees
• Goal: lift everything done for SMT solvers to synthesizers
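The small example, written out as a sketch: the precondition says exactly when an integer `o` exists, and the program computes it. `Math.floorMod` is used instead of `%` so that negative odd inputs are also accepted; the function names are chosen for this illustration.

```scala
// Most general precondition for 3*i + 2*o == 13: i must be odd.
def precondition(i: Int): Boolean = Math.floorMod(i, 2) == 1

// Witness term for o, valid whenever the precondition holds.
def solveForO(i: Int): Option[Int] =
  if (precondition(i)) Some((13 - 3 * i) / 2) else None
```

For `i = 1` the result is `Some(5)` (3 + 10 = 13); for even `i` no integer solution exists, so the result is `None`.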
Decision & Synthesis Procedures
For a well-defined class of formulas:

Decision procedure (model-generating)
• Input: a formula
• Output: a model of the formula
Example: 5a + 7x = 31 gives a ↦ 2, x ↦ 3
a theorem prover that always succeeds

Synthesis procedure
• Input: a formula, with input and output variables
• Output: a program to compute output values from input values
Example: inputs: { a }, outputs: { x }, 5a + 7x = 31 gives x ↦ (31 – 5a) / 7
a synthesizer that always succeeds
Framework: Transforming Relations
⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧
where a̅ are the input variables, C is the synthesis predicate, and x̅ are the output variables
Refinement: ∀ a̅, x̅. C2 ⇒ C1
Domain preservation: ∀ a̅. (∃ x̅ : C1) ⇒ (∃ x̅ : C2)
Programs as Relations
⟦ a̅ ⟨ C ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩
where a̅ are the input variables, x̅ the output variables, P the precondition, and T̅ the program terms
⟨ P | T̅ ⟩ represents the relation: P ∧ (x̅ = T̅)
Refinement: ∀ a̅. P ⇒ C[x̅ ↦ T̅]
Domain preservation: ∀ a̅. (∃ x̅ : C) ⇒ P
Compare to Quantifier Elimination
• A problem of the form: ⟦ a̅ ⟨ C ⟩ x̅ ⟧
• Corresponds to constructively solving the quantifier elimination problem: ∃ x̅ : C( a̅, x̅ )
• In the solution ⟨ P | T̅ ⟩, P corresponds to the result of Q.E. and T̅ are witness terms.
Transforming Relations
One-Point:
  premises: ⟦ a̅ ⟨ C[x0 ↦ t] ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩ and x0 ∉ vars(t)
  conclusion: ⟦ a̅ ⟨ x0 = t ∧ C ⟩ x0; x̅ ⟧ ⊦ ⟨ P | let x̅ := T̅ in t; x̅ ⟩
Equivalence:
  premises: ⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩ and C1 ⇔ C2
  conclusion: ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩
Case-Split:
  premises: ⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟨ P1 | T̅1 ⟩ and ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P2 | T̅2 ⟩
  conclusion: ⟦ a̅ ⟨ C1 ∨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P1 ∨ P2 | if(P1) T̅1 else T̅2 ⟩
Synthesis for Linear Integer Arithmetic
Problem: ⟦ a ⟨ 5x + 7y = a ∧ 0 ≤ x ∧ x ≤ y ⟩ x,y ⟧
5x + 7y = a has a one-dimensional solution space: x = -7t + 3a, y = 5t – 2a is a solution for any t.
Substituting into 0 ≤ x ∧ x ≤ y gives the subproblem: ⟦ a ⟨ 7t ≤ 3a ∧ 5a ≤ 12t ⟩ t ⟧
t is bounded on both sides, and admits a solution whenever ⌈5a/12⌉ ≤ ⌊3a/7⌋
Subproblem solution: ⟦ a ⟨ 7t ≤ 3a ∧ 5a ≤ 12t ⟩ t ⟧ ⊦ ⟨ ⌈5a/12⌉ ≤ ⌊3a/7⌋ | ⌈5a/12⌉ ⟩
Result: ⟦ a ⟨ 5x + 7y = a ∧ 0 ≤ x ∧ x ≤ y ⟩ x,y ⟧ ⊦ ⟨ ⌈5a/12⌉ ≤ ⌊3a/7⌋ | let t = ⌈5a/12⌉ in (-7t+3a, 5t-2a) ⟩
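The derived program can be transcribed and tested directly. This is a sketch: `ceilDiv`, `solve`, and `spec` are names chosen here, and the brute-force comparison in the usage note is only a sanity check over a small range of `a`.

```scala
// Ceiling of n/d for d > 0, built from the standard floor division.
def ceilDiv(n: Int, d: Int): Int = -Math.floorDiv(-n, d)

// Precondition ceil(5a/12) <= floor(3a/7); witness t = ceil(5a/12).
def solve(a: Int): Option[(Int, Int)] = {
  val lo = ceilDiv(5 * a, 12)
  val hi = Math.floorDiv(3 * a, 7)
  if (lo <= hi) {
    val t = lo
    Some((-7 * t + 3 * a, 5 * t - 2 * a))
  } else None
}

// The original synthesis predicate, used to validate results.
def spec(a: Int)(x: Int, y: Int): Boolean =
  5 * x + 7 * y == a && 0 <= x && x <= y
```

For `a = 12` both bounds equal 5, giving `t = 5` and the solution `(1, 1)`; for `a = 1` the bounds are 1 and 0, so the precondition fails and no solution exists.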
And/Or Search for Rule Applications
[Figure: and/or search tree alternating between synthesis problems (nodes 1 to 7) and rule applications (nodes A to J)]
Driven by cost
- For rule applications: size of term contributed to program.
- For (sub)problems: estimate based on variables and boolean structure.
Synthesis in http://lara.epfl.ch/leon
Techniques used:
– Leon’s verification capabilities
– synthesis for theory of trees
– recursion schemas
– case splitting
– symbolic exploration of the space of programs
– synthesis based on type inhabitation
– fast falsification using previous counterexamples
– learning conditional expressions
– cost-based search over possible synthesis steps
Generating Expression Terms
• What do we do with problems that:
  – do not fall in a well-defined, synthesizable subset,
  – do not get simplified by decomposition?
• Use counter-example guided inductive synthesis to search over small expressions
• Two algorithms
  – use SMT solvers to enumerate terms and evaluate them to find new blocking clauses
  – type-based enumeration (Gvero, Piskac, Kuraj) combined with discovery of preconditions
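The counter-example guided loop can be sketched in a few lines. Everything here is a simplification made for illustration: the expression grammar, the depth bound, and the exhaustive check over a finite range stand in for the SMT-based and type-based enumeration named above.

```scala
sealed trait Expr
case object X extends Expr
case class Const(n: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Mul(l: Expr, r: Expr) extends Expr

def eval(e: Expr, x: Int): Int = e match {
  case X         => x
  case Const(n)  => n
  case Add(l, r) => eval(l, x) + eval(r, x)
  case Mul(l, r) => eval(l, x) * eval(r, x)
}

// All expressions up to the given nesting depth, smaller ones first.
def exprs(depth: Int): List[Expr] =
  if (depth == 0) List[Expr](X, Const(0), Const(1), Const(2))
  else {
    val smaller = exprs(depth - 1)
    smaller ++ (for {
      l <- smaller
      r <- smaller
      e <- List[Expr](Add(l, r), Mul(l, r))
    } yield e)
  }

// Hypothetical CEGIS loop: spec(x, o) says whether output o is acceptable for input x.
def cegis(spec: (Int, Int) => Boolean, domain: Range, maxDepth: Int): Option[Expr] = {
  val candidates = exprs(maxDepth)
  @annotation.tailrec
  def loop(counterexamples: List[Int]): Option[Expr] =
    // Inductive step: a candidate consistent with all counterexamples so far.
    candidates.find(e => counterexamples.forall(x => spec(x, eval(e, x)))) match {
      case None => None
      case Some(e) =>
        // Verification step: search for an input on which the candidate fails.
        domain.find(x => !spec(x, eval(e, x))) match {
          case None      => Some(e)
          case Some(cex) => loop(cex :: counterexamples)
        }
    }
  loop(Nil)
}
```

Each failed verification adds one input that rules out the current candidate, so cheap tests on a few counterexamples discard most candidates before the expensive check runs.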
Synthesis and Constraint Solving
If we did not find an expression that solves it in all cases, we emit a runtime call to the solver
Result: solver invoked only in some cases
– for some components of result
– for some conditions on inputs
[Figure: and/or search tree with synthesis problems (nodes 1 to 7) and rule applications (nodes A to J); after timeout, close the remaining branches by inserting a runtime solver call]
Example Data Structure with Cache
case class CTree(cache : Int, data : Tree)
def inv(ct : CTree) : Boolean =
  isRBT(ct.data) && (ct.data == Empty() || (content(ct.data) contains ct.cache))
def member(v : Int, ct : CTree) : Boolean = {
  require(inv(ct))
  choose((x : Boolean) => x == (content(ct.data) contains v))
}
After ADT and equality split, one point rule, simplifications:
def member(v : Int, ct : CTree) : Boolean = {
  require(inv(ct))
  ct.data match {
    case n : Node =>
      if (ct.cache == v) true
      else choose((x : Boolean) => x == (content(ct.data) contains v))
    case Empty() => false
  }
}
Synthesis did not solve the problem fully, but optimized the spec for 2 common cases
From In-Memory to External Sorting
Synthesis of Out-of-Core Algorithms (SIGMOD 2013)
Ioannis Klonatos, Christoph Koch, Andres Nötzli, Andrej Spielmann
• transformation rules for monad algebra of nested sequences
• exploration of equivalent algorithms through performance estimation w/ non-linear constraint solving
in-memory sort → external 2k-way merge sort with blocking → C implementation
treeFold[2k]([], unfoldR( funcPow[k](mrg))
Real-World Reasoning
Gap between floating points and reality
– input measurement error
– floating-point round-off error
– numerical method error
x<y need not mean x*<y*
Automated verification tools to
• compute upper error bound
• generate code to match math
Applied to code fragments for
• embedded systems (car, train)
• physics simulations
OOPSLA'11, RV'12, EMSOFT'13, POPL'14
Eva Darulova
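A tiny illustration of the gap, as a sketch: a comparison computed in double precision disagrees with the same comparison on exact decimal values. The variable names are chosen here, and `BigDecimal` plays the role of the ideal values.

```scala
// Double-precision computation: round-off makes the sum slightly too large.
val x: Double = 0.1 + 0.2
val y: Double = 0.3
val floatSaysGreater: Boolean = x > y

// Exact decimal arithmetic on the same constants.
val exactX = BigDecimal("0.1") + BigDecimal("0.2")
val exactY = BigDecimal("0.3")
val exactSaysEqual: Boolean = exactX == exactY
```

Here the floating-point program takes the `x > y` branch although the ideal values are equal, which is why an error bound is needed before trusting such a comparison.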
[Diagram: wish → (requirement formalization) → specification (constraint): C → implementation (program): p → (conventional compilation) → command: 11011001 01011101 11011001 01011101 11011001 01011101 11011001 01011101]
Can we help with designing specifications themselves, to make programming accessible to non-experts?
Programming by Demonstration
http://www.youtube.com/watch?v=bErU--8GRsQ
Try "Pong Designer" in Android Play Store
Mikael Mayer and Lomig Mégard
Describe functionality by demonstrating and modifying behaviors while the program runs
– demonstrate desired actions by moving back in time and referring to past events
– system generalizes demonstrations into rules
SPLASH Onward'13