Using First-Order Theorem Provers in Data Structure Verification
![Page 1: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/1.jpg)
Using First-Order Theorem Provers in Data Structure Verification
Charles Bouillaguet
Ecole Normale Supérieure, Cachan, France
Viktor Kuncak, Martin Rinard
MIT CSAIL
![Page 2: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/2.jpg)
Implementing Data Structures is Hard
Often small, but complex code
• Lots of pointers
• Unbounded, dynamic allocation
• Complex shape invariants (e.g. DAG)
• Properties involving arithmetic (ordering…)
Need strong invariants to guarantee correctness
• e.g. lookup in an ordered tree needs sortedness
![Page 3: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/3.jpg)
How to obtain reliable data structure implementations?
Approach
• Prove that the program is correct
• For all program executions (sound)
Verified properties:
• Data structure operations do not crash
• Data structure invariants are preserved
• Data structure content is correctly updated
![Page 4: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/4.jpg)
Infrastructure
Jahob system for verifying data structure implementations
Kuncak, Wies, Zee, Rinard, Nguyen, Bouillaguet, Schmitt, Marnette, Bugrara
Analyzed programs: subset of Java
Specification: subset of Isabelle's language
![Page 5: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/5.jpg)
Summary of Verified Data Structures
Implementations of relations
• Add a binding
• Remove all bindings for a given key
• Test key membership
• Retrieve data bound to a key
• Test emptiness
Verified implementations
• Linked list
• Ordered tree
• Hash table
![Page 6: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/6.jpg)
An Example : Ordered Trees
Implementation of a finite map
Operations: insert, lookup, remove
Representation invariants:
• tree shaped (acyclicity, unique parent)
• ordering constraints
[Figure: tree node with fields left, right, key, value]
![Page 7: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/7.jpg)
public static FuncTree update(int k, Object v, FuncTree t) {
    FuncTree new_left, new_right;
    Object new_data;
    int new_key;
    if (t == null) {
        new_data = v; new_key = k;
        new_left = null; new_right = null;
    } else {
        if (k < t.key) {
            new_left = update(k, v, t.left);
            new_right = t.right;
            new_key = t.key;
            new_data = t.data;
        } else if (t.key < k) {
            …
        } else {
            new_data = v; new_key = k;
            new_left = t.left; new_right = t.right;
        }
    }
    FuncTree r = new FuncTree();
    r.left = new_left; r.right = new_right;
    r.data = new_data; r.key = new_key;
    return r;
}
Sample code
![Page 8: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/8.jpg)
public static FuncTree update(int k, Object v, FuncTree t)
/*: requires "v ~= null"
    ensures "result..content = t..content - {(x,y). x=k} + {(k,v)}"
*/
{
    FuncTree new_left, new_right;
    Object new_data;
    int new_key;
    if (t == null) {
        new_data = v; new_key = k;
        new_left = null; new_right = null;
    } else {
        if (k < t.key) {
            new_left = update(k, v, t.left);
            new_right = t.right;
            new_key = t.key;
            new_data = t.data;
        } else if (t.key < k) {
            …
        } else {
            new_data = v; new_key = k;
            new_left = t.left; new_right = t.right;
        }
    }
    FuncTree r = new FuncTree();
    r.left = new_left; r.right = new_right;
    r.data = new_data; r.key = new_key;
    //: "r..content" := "t..content - {(x,y). x=k} + {(k,v)}";
    return r;
}
3 lines of spec, 30 lines of code
no null dereferences
Sample code
postcondition holds and invariants preserved
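The postcondition on this slide can also be read operationally. The following is a minimal sketch, not Jahob code: it assumes a hypothetical class `FuncTreeSketch` with immutable fields, fills the branch the slide elides with the symmetric case (an assumption), and replaces the ghost field `content` with a function that computes the set of (key, value) pairs at run time. It only tests the specification on concrete inputs, whereas Jahob proves it for all inputs.

```java
import java.util.AbstractMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical executable counterpart of the slide's FuncTree (not Jahob code). */
final class FuncTreeSketch {
    final int key;
    final Object data;
    final FuncTreeSketch left, right;

    FuncTreeSketch(int key, Object data, FuncTreeSketch left, FuncTreeSketch right) {
        this.key = key; this.data = data; this.left = left; this.right = right;
    }

    /** Functional update: builds a new tree and leaves the input tree untouched. */
    static FuncTreeSketch update(int k, Object v, FuncTreeSketch t) {
        if (t == null) return new FuncTreeSketch(k, v, null, null);
        if (k < t.key) return new FuncTreeSketch(t.key, t.data, update(k, v, t.left), t.right);
        if (t.key < k) // assumed symmetric case; the slide elides this branch
            return new FuncTreeSketch(t.key, t.data, t.left, update(k, v, t.right));
        return new FuncTreeSketch(k, v, t.left, t.right);
    }

    /** Run-time stand-in for the ghost field content: the set of (key, value) pairs. */
    static Set<Map.Entry<Integer, Object>> content(FuncTreeSketch t) {
        Set<Map.Entry<Integer, Object>> s = new HashSet<>();
        if (t != null) {
            s.add(new AbstractMap.SimpleImmutableEntry<Integer, Object>(t.key, t.data));
            s.addAll(content(t.left));
            s.addAll(content(t.right));
        }
        return s;
    }

    /** Checks result..content = t..content - {(x,y). x=k} + {(k,v)} on one input. */
    static boolean postconditionHolds(int k, Object v, FuncTreeSketch t) {
        Set<Map.Entry<Integer, Object>> expected = new HashSet<>();
        for (Map.Entry<Integer, Object> p : content(t)) {
            if (p.getKey() != k) expected.add(p); // drop every binding for key k
        }
        expected.add(new AbstractMap.SimpleImmutableEntry<Integer, Object>(k, v));
        return content(update(k, v, t)).equals(expected);
    }
}
```

For example, `FuncTreeSketch.postconditionHolds(3, "c", t)` checks one call to `update` against the `ensures` clause.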
![Page 9: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/9.jpg)
Ordered tree: interface
public ghost specvar content :: "(int * obj) set" = "{}";
public static FuncTree empty_set()
    ensures "result..content = {}"
public static FuncTree add(int k, Object v, FuncTree t)
    requires "v ~= null & (ALL y. (k,y) ~: t..content)"
    ensures "result..content = t..content Un {(k,v)}"
public static FuncTree update(int k, Object v, FuncTree t)
    requires "v ~= null"
    ensures "result..content = t..content - {(x,y). x=k} + {(k,v)}"
public static Object lookup(int k, FuncTree t)
    ensures "((k, result) : t..content) | (result = null & (ALL v. (k,v) ~: t..content))"
public static FuncTree remove(int k, FuncTree t)
    ensures "result..content = t..content - {(x,y). x=k}"
![Page 10: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/10.jpg)
Representation Invariants
public final class FuncTree {
    private int key;
    private Object data;
    private FuncTree left, right;
    /*:
    public ghost specvar content :: "(int * obj) set";
    invariant ("content definition") "this ~= null --> content = {(key, data)} Un left..content Un right..content"
    invariant ("null implies empty") "this = null --> content = {}"
    invariant ("left children are smaller") "ALL k v. (k,v) : left..content --> k < key"
    invariant ("right children are bigger") "ALL k v. (k,v) : right..content --> k > key"
    */
equality between sets
implicit universal quantification over this
explicit quantification
arithmetic
tuples
abstract set-valued field
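The two ordering invariants quantify over the whole content of each subtree, not just over the immediate children. A minimal run-time sketch of that reading follows; the class `OrderedNode` and its methods are hypothetical names introduced for illustration, and the check passes key bounds down the tree, which is equivalent to requiring every key in the left (right) subtree to be smaller (bigger) than the node's key.

```java
/** Hypothetical node type mirroring the slide's fields (not Jahob code). */
final class OrderedNode {
    final int key;
    final Object data;
    final OrderedNode left, right;

    OrderedNode(int key, Object data, OrderedNode left, OrderedNode right) {
        this.key = key; this.data = data; this.left = left; this.right = right;
    }

    /** True iff every key in subtree t lies strictly between lo and hi (null means unbounded). */
    static boolean ordered(OrderedNode t, Integer lo, Integer hi) {
        if (t == null) return true;                    // the empty tree has empty content
        if (lo != null && t.key <= lo) return false;   // violates "right children are bigger"
        if (hi != null && t.key >= hi) return false;   // violates "left children are smaller"
        return ordered(t.left, lo, t.key) && ordered(t.right, t.key, hi);
    }

    /** Run-time check of both ordering invariants for every node of the tree. */
    static boolean invariantsHold(OrderedNode t) {
        return ordered(t, null, null);
    }
}
```

A tree whose root is 5 and whose left child 3 has a right child 7 passes a children-only check but fails `invariantsHold`, which is why the invariants are stated over `content`.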
![Page 11: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/11.jpg)
How could these properties be verified?
![Page 12: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/12.jpg)
Standard Approach
Transform program into a logic formula
• Using weakest precondition
• The program is correct iff the formula is valid
Prove the formula
• Very difficult formulas: interactively (Coq, Isabelle)
• Decidable classes: automated (MONA, CVCL, Omega)
• This talk: difficult formulas in an automated way :)
eauto ; intros . intuition ; subst . apply Extensionality_Ensembles. unfold Same_set. unfold Included. unfold In. unfold In in H1.intuition. destruct H0. destruct (eq_nat_dec x1 ArraySet_size).subst. rewrite arraywrite_match in H0 ; auto. intuition. subst. apply Union_intror. auto with sets. assert (x1 < ArraySet_size). omega. clear n. apply Union_introl. rewrite arraywrite_not_same_i in H0.unfold In. exists x1. intuition.omega.
inversion H0 ; subst ; clear H0. unfold In in H3. destruct H3. exists x1. intuition. rewrite arraywrite_not_same_i. intuition ; omega. omega. exists ArraySet_size. intuition. inversion H3. subst. rewrite arraywrite_match ; trivial.
Interactive proof (Coq script above):
• low efficiency
• 1 line per grad student-minute
• parallelization looks non-trivial
![Page 13: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/13.jpg)
Formulas in Jahob
Very expressive specification language
• Higher-order features
How to prove formulas automatically?
Convert them to something simpler
• Decidable classes
• First-Order Logic
![Page 14: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/14.jpg)
Automated reasoning in Jahob
![Page 15: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/15.jpg)
Why FOL?
Existing theorem provers: SPASS, E, Vampire, Theo, Prover9, …
• continuously improving (yearly competition)
Effective on formulas with short proofs
Handle formulas with quantifiers well
![Page 16: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/16.jpg)
HOL → FOL
Ideas:
• avoid axiomatizing rich theories
• Translate what can naturally be expressed in FOL
• soundly approximate the rest
Sound, incomplete approach
Full details in long version of the paper
(x,y) ∈ z.content ⇢ Content(x,y,z)
w = f[x := v](y) ⇢ (x=y ⋀ w=v) ⋁ (x ≠ y ⋀ w=f(y))
λx.E = λx.F ⇢ ∀x. E=F
…
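The field-update rule replaces a read of an updated field by a first-order case split, (x=y ⋀ w=v) ⋁ (x ≠ y ⋀ w=f(y)). As a rough illustration, assuming fields are modeled as Java functions, the sketch below builds the updated function and evaluates the case split on concrete values; `FieldUpdateSketch`, `fieldWrite`, and `caseSplit` are hypothetical names, not part of Jahob.

```java
import java.util.function.Function;

/** Hypothetical model of a field as a function from objects to values. */
final class FieldUpdateSketch {
    /** The function that agrees with f everywhere except at x, where it returns v. */
    static <A, B> Function<A, B> fieldWrite(Function<A, B> f, A x, B v) {
        return y -> y.equals(x) ? v : f.apply(y);
    }

    /** The first-order case split, evaluated on concrete values. */
    static <A, B> boolean caseSplit(Function<A, B> f, A x, B v, A y, B w) {
        return (x.equals(y) && w.equals(v))
            || (!x.equals(y) && w.equals(f.apply(y)));
    }
}
```

For any `y`, taking `w = fieldWrite(f, x, v).apply(y)` makes `caseSplit(f, x, v, y, w)` true, which is the equivalence the translation relies on: the higher-order update disappears and only applications of the old function `f` remain.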
![Page 17: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/17.jpg)
Arithmetic
Numbers are uninterpreted constants in FOL
• Provers do not know that 1+1=2!
Still need to reason about arithmetic…
Our solution: provide a partial, incomplete axiomatization
• Still cannot deduce 1+1=2!
• comparison between constants in formula
Satisfactory results in practice
• ordering of elements in tree
• array bound checks
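One way to picture such a partial axiomatization is sketched below. This is an illustration under stated assumptions, not the axiom set used in Jahob: the predicate name `lt`, the formula syntax, and the class `ArithAxiomsSketch` are all hypothetical. It emits generic ordering axioms plus one comparison fact for each pair of neighbouring integer constants occurring in a formula; transitivity then lets a prover derive the remaining comparisons, while addition stays uninterpreted.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical generator of ordering axioms for the constants of one formula. */
final class ArithAxiomsSketch {
    static List<String> orderingAxioms(Collection<Integer> constants) {
        List<String> axioms = new ArrayList<>();
        // Generic properties of a strict total order (assumed syntax).
        axioms.add("ALL x. ~lt(x, x)");
        axioms.add("ALL x y z. lt(x, y) & lt(y, z) --> lt(x, z)");
        axioms.add("ALL x y. lt(x, y) | x = y | lt(y, x)");
        // One fact per pair of neighbouring constants, in increasing order.
        Integer prev = null;
        for (Integer c : new TreeSet<>(constants)) {
            if (prev != null) axioms.add("lt(" + prev + ", " + c + ")");
            prev = c;
        }
        return axioms;
    }
}
```

For the constants 10, 0, and 3 this yields the three generic axioms followed by `lt(0, 3)` and `lt(3, 10)`.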
![Page 18: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/18.jpg)
Observation
Most formulas are easy to prove
• i.e. in no measurable time
• have very short proofs (in number of resolution steps)
Problem often concentrated in a small number that take very long to prove
We applied two existing techniques to make them easier
1. Eliminating type/sort information
2. Filtering unnecessary assumptions
![Page 19: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/19.jpg)
Sort Information
Specification language has sorts
• Integers
• Objects
• Boolean
Translate to unsorted FOL
∀(x : Obj). P(x)
⇣
∀x. Obj(x) ⇒ P(x)
![Page 20: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/20.jpg)
Sort Information
Encoding sort information
• bigger formulas
• longer proofs
Formulas become harder to prove
Temptation to omit sort information…
![Page 21: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/21.jpg)
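Effect on hard formulas
Formulas that take more than 1s to prove, from the Tree implementation (SPASS); "with" and "w/o" mean with and without sort information

| Benchmark | Time (s), with | Time (s), w/o | Proof length, with | Proof length, w/o | Generated clauses, with | Generated clauses, w/o |
|---|---|---|---|---|---|---|
| Tree.remove | 5.3 | 1.1 | 799 | 155 | 18 376 | 9 425 |
| | 3.6 | 0.3 | 1 781 | 309 | 19 601 | 1 917 |
| | 9.8 | 4.9 | 1 781 | 174 | 33 868 | 27 108 |
| | 8.1 | 0.5 | 1 611 | 301 | 31 892 | 3 922 |
| | 8.1 | 4.7 | 1 773 | 371 | 37 244 | 28 170 |
| | 7.9 | 0.3 | 1 391 | 308 | 41 354 | 3 394 |
| Tree.remove_max | +∞ | 0.22 | | 97 | | 1 075 |
| | 78.9 | 6.8 | 2 655 | 1 159 | 177 755 | 19 527 |
| | 34.8 | 0.8 | 4 062 | 597 | 115 713 | 5 305 |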
![Page 22: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/22.jpg)
Omitting Sorts (cont’d)
Great speed-up (sometimes more than 10x)! However:
∀ (x y:S). x = y
∃ (x y:T). x ≠ y
Satisfiable with sorts (S={a}, T={b,c})…
Unsatisfiable without!
Omitting sort guards breaks soundness!
Possible workaround: type-check generated proof
When is it possible to skip type-checking?
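The counterexample can be checked by brute force over small finite models. The sketch below is only an illustration (the class and method names are hypothetical): with sorts, it searches for disjoint domains S and T such that all elements of S are equal and T contains two distinct elements; without sorts, both formulas range over one shared domain. The search is bounded by `maxSize`, so the unsorted result is evidence for small models only, although here the two unsorted formulas plainly contradict each other for every domain.

```java
/** Hypothetical bounded model search for the two formulas on the slide. */
final class SortGuardCounterexample {
    /** ALL x y in dom. x = y */
    static boolean allEqual(int[] dom) {
        for (int a : dom) for (int b : dom) if (a != b) return false;
        return true;
    }

    /** EX x y in dom. x ~= y */
    static boolean someDistinct(int[] dom) {
        for (int a : dom) for (int b : dom) if (a != b) return true;
        return false;
    }

    /** The domain {from, from+1, ..., from+size-1}. */
    static int[] range(int from, int size) {
        int[] d = new int[size];
        for (int i = 0; i < size; i++) d[i] = from + i;
        return d;
    }

    /** Sorted reading: S and T are disjoint, non-empty domains. */
    static boolean satisfiableWithSorts(int maxSize) {
        for (int s = 1; s <= maxSize; s++)
            for (int t = 1; t <= maxSize; t++)
                if (allEqual(range(0, s)) && someDistinct(range(s, t))) return true;
        return false;
    }

    /** Unsorted reading: both quantifiers range over the same non-empty domain. */
    static boolean satisfiableWithoutSorts(int maxSize) {
        for (int n = 1; n <= maxSize; n++) {
            int[] d = range(0, n);
            if (allEqual(d) && someDistinct(d)) return true;
        }
        return false;
    }
}
```

The sorted search succeeds with |S| = 1 and |T| = 2, matching S={a}, T={b,c}; the unsorted search finds no model.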
![Page 23: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/23.jpg)
Omitting Sorts Result
We proved the following
Theorem. Suppose that
i. Sorts are pairwise disjoint (no sub-sorting)
ii. Sorts have the same cardinality
Then omitting sort guards is sound and complete.
This justifies this useful optimization.
![Page 24: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/24.jpg)
Assumption Filtering
Provers get confused by too many assumptions
Lots of useless assumptions
• Hardest shown benchmark needs 12 out of 56
• Big benchmark: on average 33% necessary
Assumption filtering
• Try to eliminate irrelevant assumptions automatically
• Give a score to each assumption based on relevance
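A relevance score can be as simple as symbol overlap with the goal. The sketch below is an assumption-laden illustration rather than the heuristic implemented in Jahob: it scores each assumption by the fraction of its identifiers that also occur in the goal and keeps those at or above a threshold. The class name, the threshold, and the formula syntax are hypothetical, and bound variables are counted as ordinary symbols for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical symbol-overlap filter for assumptions. */
final class AssumptionFilterSketch {
    private static final Pattern IDENT = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList("ALL", "EX"));

    /** All identifiers of a formula, minus the quantifier keywords. */
    static Set<String> symbols(String formula) {
        Set<String> out = new HashSet<>();
        Matcher m = IDENT.matcher(formula);
        while (m.find()) {
            if (!KEYWORDS.contains(m.group())) out.add(m.group());
        }
        return out;
    }

    /** Fraction of the assumption's symbols that also occur in the goal. */
    static double score(String assumption, Set<String> goalSymbols) {
        Set<String> syms = symbols(assumption);
        if (syms.isEmpty()) return 0.0;
        int shared = 0;
        for (String s : syms) if (goalSymbols.contains(s)) shared++;
        return (double) shared / syms.size();
    }

    /** Keeps the assumptions whose score reaches the threshold, in their original order. */
    static List<String> filter(List<String> assumptions, String goal, double threshold) {
        Set<String> goalSymbols = symbols(goal);
        List<String> kept = new ArrayList<>();
        for (String a : assumptions) {
            if (score(a, goalSymbols) >= threshold) kept.add(a);
        }
        return kept;
    }
}
```

Dropping an assumption can only make a goal harder to prove, never make a false goal provable, so a filter of this kind stays sound and merely risks losing a proof.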
![Page 25: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/25.jpg)
Experimental results
| Benchmark | lines of code | lines of specification | # of methods | verif. time |
|---|---|---|---|---|
| Sets as imperative list | 60 | 24 | 9 | 18s |
| Relation as functional linked list | 76 | 26 | 9 | 12s |
| Relation as functional ordered trees | 186 | 38 | 10 | 178s |
| Relation as hash table (using f.list) | 69 | 53 | 10 | 119s |
![Page 26: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/26.jpg)
Verification effort
Decreased as we improved the system
• functional list was easy
• a few days for trees
• two hours for simple hash table
FOL: currently the most usable method for this kind of data structure
![Page 27: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/27.jpg)
Related work
Interactive provers: Isabelle, Coq, HOL, PVS, ACL2
First-order ATP
• Vampire – Voronkov [04]
• SPASS – Weidenbach [01]
• E – Schulz [IJCAR04]
Program checking
• ESC/Java2 – Kiniry, Chalin, Hurlin
• Krakatoa – Marché, Paulin-Mohring, Urbain [03]
• Spec# – Barnett, DeLine, Jacobs, Fähndrich, Leino, Schulte, Venter [05]
• Hob system: verify set implementations (we verify relations)
Shape analysis
• PALE – Møller and Schwartzbach [PLDI01]
• TVLA – Sagiv, Reps, and Wilhelm [TOPLAS02]
• Roles – Kuncak, Lam, and Rinard [POPL02]
![Page 28: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/28.jpg)
Multiple Provers - Screenshot
![Page 29: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/29.jpg)
Conclusion
Jahob verification system
Automation by translation HOL → FOL
• omitting sorts theorem gives speedup
• filtering automates selection of assumptions
Promising experimental results
• strong properties: correct implementation
  • do not crash
  • operations correctly update the content, clarifies behavior in case of duplicate keys, …
  • representation invariants preserved (ordering, treeness, each element is in appropriate bucket)
• relatively fast
• verification effort much smaller than using interactive provers
![Page 30: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/30.jpg)
Thank you
Formal Methods are the Future of Computer Science.
Always have been…
Always will be.
Questions ?
![Page 31: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/31.jpg)
Converting to GCL
Conditional statement: easy
[| if cond then tbranch else fbranch |] =
(Assume cond; [| tbranch |]) [] (Assume !cond; [| fbranch |])
where [] is non-deterministic choice
Procedure calls:
• Could inline (potentially exponential blowup)
• Desugaring (modularity):
[| r = CALL m(x, y, z) |] =
Assert (m's precondition);
Havoc r;
Havoc {vars modified by m};
Assume (m's postcondition)
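To make the desugaring concrete, here is a minimal sketch of guarded commands with a weakest-precondition function that works on formula strings. Everything in it is an assumption made for illustration: the interface `Gcl`, its factory methods, and the textual formula syntax are hypothetical, and assignments are not modeled. It uses the standard rules wp(assume F, Q) = F --> Q, wp(assert F, Q) = F & Q, wp(havoc x, Q) = ALL x. Q, wp(c1; c2, Q) = wp(c1, wp(c2, Q)), and wp of a non-deterministic choice as the conjunction of both branches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical guarded command: maps a postcondition to its weakest precondition. */
interface Gcl {
    String wp(String post);

    static Gcl assume(String f) { return q -> "(" + f + " --> " + q + ")"; }

    static Gcl assertThat(String f) { return q -> "(" + f + " & " + q + ")"; }

    static Gcl havoc(String x) { return q -> "(ALL " + x + ". " + q + ")"; }

    /** Sequential composition: wp is computed from the last command backwards. */
    static Gcl seq(Gcl... cs) {
        return q -> {
            String r = q;
            for (int i = cs.length - 1; i >= 0; i--) r = cs[i].wp(r);
            return r;
        };
    }

    /** Non-deterministic choice: both branches must establish the postcondition. */
    static Gcl choice(Gcl a, Gcl b) { return q -> "(" + a.wp(q) + " & " + b.wp(q) + ")"; }

    /** Desugaring of a conditional into a choice between two guarded branches. */
    static Gcl ifThenElse(String cond, Gcl t, Gcl f) {
        return choice(seq(assume(cond), t), seq(assume("~(" + cond + ")"), f));
    }

    /** Modular desugaring of a call: assert pre, havoc result and modified vars, assume post. */
    static Gcl call(String result, String pre, String post, String... modified) {
        List<Gcl> cs = new ArrayList<>();
        cs.add(assertThat(pre));
        cs.add(havoc(result));
        for (String m : modified) cs.add(havoc(m));
        cs.add(assume(post));
        return seq(cs.toArray(new Gcl[0]));
    }
}
```

For instance, `Gcl.ifThenElse("c", Gcl.assertThat("A"), Gcl.assertThat("B")).wp("Q")` yields `((c --> (A & Q)) & (~(c) --> (B & Q)))`, and a call contributes only its contract, never its body, which is what keeps the verification modular.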
![Page 32: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/32.jpg)
Converting to GCL (cont'd)
Loops: invariant required
[| while /*: invariant */ (condition) {lbody} |] =
assert invariant;        (invariant holds initially)
havoc vars(lbody);
assume invariant;        (no assumptions on variables except that the invariant holds)
((assume condition;      (condition holds)
[| lbody |];
assert invariant;        (invariant is preserved)
assume false)            (no need to verify anything more)
[]
(assume !condition))     (or condition does not hold and execution continues)
![Page 33: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/33.jpg)
Verification condition for remove ((((fieldRead Pair_data null) = null) & ((fieldRead FuncTree_data null) = null) & ((fieldRead
FuncTree_left null) = null) & ((fieldRead FuncTree_right null) = null) & (ALL (xObj::obj). (xObj : Object)) & ((Pair Int FuncTree) = {null}) & ((Array Int FuncTree) = {null}) & ((Array Int Pair) = {null}) & (null : Object_alloc) & (pointsto Pair Pair_data Object) & (pointsto FuncTree FuncTree_data Object) & (pointsto FuncTree FuncTree_left FuncTree) & (pointsto FuncTree FuncTree_right FuncTree) & comment ''unalloc_lonely'' (ALL (x::obj). ((x ~: Object_alloc) --> ((ALL (y::obj). ((fieldRead Pair_data y) ~= x)) & (ALL (y::obj). ((fieldRead FuncTree_data y) ~= x)) & (ALL (y::obj). ((fieldRead FuncTree_left y) ~= x)) & (ALL (y::obj). ((fieldRead FuncTree_right y) ~= x)) & ((fieldRead Pair_data x) = null) & ((fieldRead FuncTree_data x) = null) & ((fieldRead FuncTree_left x) = null) & ((fieldRead FuncTree_right x) = null)))) & comment ''ProcedurePrecondition'' (True & comment ''FuncTree_PrivateInv content definition'' (ALL (this::obj). (((this : Object_alloc) & (this : FuncTree) & ((this :: obj) ~= null)) --> ((fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (this :: obj)) = (({((fieldRead (FuncTree_key :: (obj => int)) (this :: obj)), (fieldRead (FuncTree_data :: (obj => obj)) (this :: obj)))} Un (fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (fieldRead (FuncTree_left :: (obj => obj)) (this :: obj)))) Un (fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (fieldRead (FuncTree_right :: (obj => obj)) (this :: obj))))))) & comment ''FuncTree_PrivateInv null implies empty'' (ALL (this::obj). (((this : Object_alloc) & (this : FuncTree) & ((this :: obj) = null)) --> ((fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (this :: obj)) = {}))) & comment ''FuncTree_PrivateInv no null data'' (ALL (this::obj). (((this : Object_alloc) & (this : FuncTree) & ((this :: obj) ~= null)) --> ((fieldRead (FuncTree_data :: (obj => obj)) (this :: obj)) ~= null))) & comment ''FuncTree_PrivateInv left children are smaller'' (ALL (this::obj). 
(((this : Object_alloc) & (this : FuncTree)) --> (ALL k. (ALL v. (((k, v) : (fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (fieldRead (FuncTree_left :: (obj => obj)) (this :: obj)))) --> (intless k (fieldRead (FuncTree_key :: (obj => int)) (this :: obj)))))))) & comment ''FuncTree_PrivateInv right children are bigger'' (ALL (this::obj). (((this : Object_alloc) & (this : FuncTree)) --> (ALL k. (ALL v. (((k, v) : (fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (fieldRead (FuncTree_right :: (obj => obj)) (this :: obj)))) --> ((fieldRead (FuncTree_key :: (obj => int)) (this :: obj)) < k))))))) & comment ''t_type'' (((t :: obj) : (FuncTree :: obj set)) & ((t :: obj) : (Object_alloc :: obj set)))) --> ((comment ''TrueBranch'' (((t :: obj) = null) :: bool) --> (comment ''ProcedureEndPostcondition'' ((((fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (null :: obj)) = ((fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (t :: obj)) - {p. (EX x y. ((p = (x, y)) & (x = (k :: int))))})) & (ALL (framedObj::obj). (((framedObj : Object_alloc) & (framedObj : FuncTree)) --> ((fieldRead FuncTree_content framedObj) = (fieldRead FuncTree_content framedObj))))) & comment ''FuncTree_PrivateInv content definition'' (ALL (this::obj). (((this : Object_alloc) & (this : FuncTree) & ((this :: obj) ~= null)) --> ((fieldRead (FuncTree_content :: (obj => ((int * obj)) set)) (this :: obj)) = (({((fieldRead (FuncTree_key :: (obj => int)) (this :: obj)), (fieldRead (FuncTree_data :: (obj => obj)) (this :: obj)))} Un (fieldRead (FuncTree_content :: (obj => …
And 200 more kilobytes…
Infeasible to prove directly
![Page 34: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/34.jpg)
Splitting heuristic
Verification condition is a big conjunction
• conjunctions in postcondition
• proving each invariant
• proving each branch in program
Solution: split VC into individual conjuncts
• Prove each conjunct separately
• Each conjunct has the form H1 /\ … /\ Hn ⇒ Gi
Tree.Remove has 230 such conjuncts
How do we prove them?
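A splitting pass of this kind can be sketched in a few lines. The code below is a hypothetical illustration, not the Jahob implementation: it assumes a tiny formula type with atoms, conjunctions, and implications, walks the goal side of the verification condition, collects the hypotheses met along the way, and emits one sequent of the form H1 /\ … /\ Hn ==> Gi per atomic goal.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical verification-condition splitter. */
final class VcSplitSketch {
    /** Formula: an atom, a conjunction, or an implication. */
    static final class F {
        final char op;        // 'a' = atom, '&' = conjunction, '>' = implication
        final String atom;    // only used when op == 'a'
        final F left, right;  // only used when op != 'a'

        private F(char op, String atom, F left, F right) {
            this.op = op; this.atom = atom; this.left = left; this.right = right;
        }
        static F atom(String s) { return new F('a', s, null, null); }
        static F and(F a, F b) { return new F('&', null, a, b); }
        static F implies(F a, F b) { return new F('>', null, a, b); }
    }

    /** Pretty-prints a formula that is used as a hypothesis. */
    static String show(F f) {
        if (f.op == 'a') return f.atom;
        String sep = f.op == '&' ? " /\\ " : " --> ";
        return "(" + show(f.left) + sep + show(f.right) + ")";
    }

    /** Splits a VC into one sequent per atomic goal. */
    static List<String> split(F vc) {
        List<String> out = new ArrayList<>();
        go(vc, new ArrayList<String>(), out);
        return out;
    }

    private static void go(F f, List<String> hyps, List<String> out) {
        if (f.op == '&') {                 // conjunction of goals: prove each one separately
            go(f.left, hyps, out);
            go(f.right, hyps, out);
        } else if (f.op == '>') {          // implication: the left side becomes a hypothesis
            List<String> h = new ArrayList<>(hyps);
            h.add(show(f.left));
            go(f.right, h, out);
        } else {                           // atomic goal: emit one sequent
            out.add(hyps.isEmpty() ? f.atom : String.join(" /\\ ", hyps) + " ==> " + f.atom);
        }
    }
}
```

Splitting `H1 --> (G1 /\ (H2 --> G2))` gives the two sequents `H1 ==> G1` and `H1 /\ H2 ==> G2`; each can then be sent to a prover independently.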
![Page 35: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/35.jpg)
Detupling (cont’d)
Complete rules:
![Page 36: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/36.jpg)
Handling of Fields (cont’d)
We dealt with field updates
• New function expressed in terms of old one
Base case: field variables
• Natural encoding in FOL using functions:
x = y.f ⇢ x = f(y)
![Page 37: Using First-Order Theorem Provers in Data Structure Verification](https://reader035.fdocuments.net/reader035/viewer/2022062410/56815223550346895dc06978/html5/thumbnails/37.jpg)
Future work
Verify more examples
• balanced trees
• fancy priority queues (binomial, Fibonacci, …)
• hash table with dynamic resizing
• hash function
• verify clients of data structures
Improve assumption filtering
• take rarity of symbols into account
• check for occurring polarity
• …