Formally Verifying a File System: A Successful Failure

Formally Verifying a File System: A Successful Failure CSCI-P515/P415 Spring 2008 Michael Adams ( [email protected] ) Joseph Near ([email protected] ) Aaron Kahn ([email protected] )


Page 1: Formally Verifying a File System:  A Successful Failure

Formally Verifying a File System: A Successful Failure

CSCI-P515/P415, Spring 2008

Michael Adams ([email protected])
Joseph Near ([email protected])
Aaron Kahn ([email protected])

Page 2: Formally Verifying a File System:  A Successful Failure

Overview

Motivation
High Level Design
Approach
Minor Difficulties (and their solutions)
Major (Fatal) Difficulty (and explanation)
The Proposed Solution
Recap/Summation

Page 3: Formally Verifying a File System:  A Successful Failure

Motivation

Our goal for this project was to attempt to formally verify a file system
  ◦ We were under the impression that this would be a straightforward task, and that as long as the abstraction was simple, there wouldn't be any major problems

Page 4: Formally Verifying a File System:  A Successful Failure

Limitations

Are doing:
  ◦ Can take a file number, and read/write to it
  ◦ Create/Delete files
Not doing:
  ◦ Directories
  ◦ File Names
  ◦ Permissions, Users, Groups, etc.
The stuff we're not doing can be added as an abstraction on top of what we are doing

Page 5: Formally Verifying a File System:  A Successful Failure

Design

Develop a B-tree Structure
  ◦ The B-tree is actually serialized onto a disk
    ◦ Disk represented as an array of bytes
Create the B-tree algorithms
  ◦ insert, delete, lookup
Write the File System (read file, write file, create file, etc.) algorithms in terms of the B-tree algorithms

Page 6: Formally Verifying a File System:  A Successful Failure

Process

Initially, we wrote the code in Scheme in order to have a fully working model of “live” code to test on, and then translated it into PVS
In PVS, the file system was modelled all the way down to a disk representation to allow for better simulation of the real problems of writing file systems
  ◦ This turned out to be essential to our learning the difficulties of actually verifying a file system

Page 7: Formally Verifying a File System:  A Successful Failure

Additional Structures

In addition to the B-tree, we found that these auxiliary structures were needed
  ◦ A free list
  ◦ Blocks that represent files themselves, but are not part of the B-tree
  ◦ Single block that holds all of the pointers to the root of the free list and the root of the B-tree (similar to a meta-data block)
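The layout described above can be sketched in a few lines. This is only an illustration under assumed details, not the project's Scheme or PVS code: the block size, the choice of block 0 as the meta-data block, the 4-byte big-endian pointers, and the names `format_disk` and `allocate` are all hypothetical.

```python
import struct

BLOCK_SIZE = 16   # assumed block size in bytes
NIL = 0           # block 0 is the meta-data block, so 0 doubles as "no block"

def format_disk(num_blocks):
    """Return a fresh disk: block 0 is meta-data, every other block is on the free list."""
    disk = bytearray(BLOCK_SIZE * num_blocks)
    # Chain blocks 1..num_blocks-1: each free block stores the number of the next one.
    for b in range(1, num_blocks):
        nxt = b + 1 if b + 1 < num_blocks else NIL
        struct.pack_into(">I", disk, b * BLOCK_SIZE, nxt)
    free_root = 1 if num_blocks > 1 else NIL
    # Meta-data block holds (free-list root, B-tree root); the tree starts out empty.
    struct.pack_into(">II", disk, 0, free_root, NIL)
    return disk

def allocate(disk):
    """Pop the head of the free list; return (block_number, new_disk)."""
    free_root, tree_root = struct.unpack_from(">II", disk, 0)
    if free_root == NIL:
        raise RuntimeError("disk full")
    (next_free,) = struct.unpack_from(">I", disk, free_root * BLOCK_SIZE)
    new_disk = bytearray(disk)  # copy, so the caller's disk is left untouched
    struct.pack_into(">II", new_disk, 0, next_free, tree_root)
    return free_root, new_disk
```

Returning a new disk rather than mutating the old one mirrors the state-passing style the talk describes for the PVS model.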

Page 8: Formally Verifying a File System:  A Successful Failure

Accomplishments

B-tree in Scheme
  ◦ Thoroughly tested
Were able to successfully translate our code into PVS
Made a number of discoveries in terms of tricks for proving the algorithms in PVS
  ◦ However, very late in the game, we discovered a fatal limitation of how we modelled things in PVS
    ◦ Have ideas for overcoming the problems in the future

Page 9: Formally Verifying a File System:  A Successful Failure

Minor Problems

In a large project, there are many minor problems that are surprisingly difficult to solve
These often require the development of a simple but non-obvious trick
We ran into and solved many of these; here is a sample of what we learned
  ◦ More detail included in report

Page 10: Formally Verifying a File System:  A Successful Failure

Search

search(array, start, stop, val)
Search through a sorted array for the first value greater than or equal to the argument; return the position of that value
If no element is greater than or equal to the argument, return the length of the array
Unexpectedly difficult to prove
Measured induction on stop – start
Ended up using max(0, stop – start)
Lesson: make sure the measure is well-founded; sometimes simply making it well-founded works
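A rough sketch of the search and its measure, written in Python rather than the project's Scheme or PVS. The recursive linear shape and the helper name `measure` are assumptions; only the signature and the measure `max(0, stop - start)` come from the slide.

```python
def measure(start, stop):
    """Termination measure: clamped at 0, so it is a natural number even when start > stop."""
    return max(0, stop - start)

def search(array, start, stop, val):
    """Position of the first element >= val in sorted array[start:stop], else len(array)."""
    if start >= stop:
        return len(array)
    if array[start] >= val:
        return start
    # The recursive call is on a strictly smaller measure, which is what the
    # termination argument needs.
    assert measure(start + 1, stop) < measure(start, stop)
    return search(array, start + 1, stop, val)
```

The unclamped `stop - start` can be negative when `start > stop`, so it does not map into the natural numbers; the clamped version does, which appears to be the point of the lesson above.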

Page 11: Formally Verifying a File System:  A Successful Failure

Well-formedness

Designed as part of our testing; believed to be an important part of the proof

Theory: algorithms are correct if they have the desired effect and the disk remains well-formed

Assuming a well-formed disk should give us a basis for proving correctness of our operations

Proved that a newly-formatted disk is well-formed

Partially proved that allocation preserves well-formedness

Page 12: Formally Verifying a File System:  A Successful Failure

Well-formedness

Realization: well-formedness is irrelevant!
Well-formedness is defined by the observer (in this case, lookup)
  ◦ lookup(key, insert(key, value, disk)) = value
If the observer can correctly interpret the data given to it, then that data is well-formed
Lesson: don't waste time proving things about well-formedness
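To make the observer property concrete, here is a deliberately tiny stand-in for the file system. It is an assumption-laden toy, not the project's B-tree: the disk is an append-only run of one-byte (key, value) records, and `insert` and `lookup` are hypothetical implementations that only share their names and argument order with the slide.

```python
def insert(key, value, disk):
    """Return a new disk with the record (key, value) appended; both are single bytes here."""
    return disk + bytes([key, value])

def lookup(key, disk):
    """Scan records from newest to oldest; return the first matching value, or None."""
    for i in range(len(disk) - 2, -1, -2):
        if disk[i] == key:
            return disk[i + 1]
    return None

def observer_property(key, value, disk):
    """The slide's equation: lookup(key, insert(key, value, disk)) = value."""
    return lookup(key, insert(key, value, disk)) == value
```

Nothing here states what a well-formed disk looks like. The equation only says that whatever `insert` writes, `lookup` reads back, and checking it on sample inputs is a test, not a proof.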

Page 13: Formally Verifying a File System:  A Successful Failure

Proving insert

Many uses of let due to state-passing style
  ◦ Exponential blowup of expression size
  ◦ Sequents become pages long!
Side effects make proofs difficult
  ◦ When an object is affected, the sequent clauses no longer apply, even if the change doesn't affect them
  ◦ User has to prove that the sequent clauses still apply

Page 14: Formally Verifying a File System:  A Successful Failure

Main Problem: Side Effects

State Passing Style
Good for modelling state
  ◦ Easy to implement, familiar
Bad for Proving!

Page 15: Formally Verifying a File System:  A Successful Failure

The problem with side effects

Effects Invalidate Assumptions
Given a property about a disk, we need to prove the same property about a modified disk
Example:
  ◦ If P(disk) then P(write_block(block, disk))
  ◦ Even if the effect does not affect P, we have to prove that P still holds
  ◦ This makes sense: it does not hold automatically!
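A small sketch of why the implication does not come for free. Everything concrete here is assumed for illustration: the 4-byte blocks, the three-argument `write_block(n, data, disk)` (the slide writes it with two arguments), and the property `P`, which says that block 0 begins with a made-up magic byte.

```python
BLOCK_SIZE = 4  # assumed block size in bytes

def read_block(n, disk):
    """Return the bytes of block n."""
    return disk[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE]

def write_block(n, data, disk):
    """State-passing write: return a new disk in which block n holds data."""
    assert len(data) == BLOCK_SIZE
    return disk[:n * BLOCK_SIZE] + data + disk[(n + 1) * BLOCK_SIZE:]

def P(disk):
    """Hypothetical property: block 0 starts with the magic byte 0x42."""
    return read_block(0, disk)[0] == 0x42

disk = bytes([0x42, 0, 0, 0, 1, 2, 3, 4])
untouched = write_block(1, b"\xff" * BLOCK_SIZE, disk)  # P survives this write
clobbered = write_block(0, bytes(BLOCK_SIZE), disk)     # P is broken by this one
```

Both results are just "some new disk" as far as a prover is concerned. Whether `P` carries over depends on which block was written, so each case needs its own argument.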

Page 16: Formally Verifying a File System:  A Successful Failure

Obvious solution: Hoare Logic

Substitution enforces separation of variables
  ◦ So P(x) => P(x) automatically as long as x isn't affected
Red herring: this only helps if we use Abstract Data Types
We serialize our ADT into a single disk object
  ◦ Side-effecting one part will side-effect all parts, even if we use Hoare Logic

Page 17: Formally Verifying a File System:  A Successful Failure

Naive Solution

Prove that side-effecting one part of the B-tree, free list, etc. doesn't affect assumptions about other parts of the disk
Possible, but Impractical
  ◦ For every algorithm
    ◦ For every effect
      ◦ For every clause of the sequent
        ◦ Must prove that the assumption still holds after the effect
  ◦ A few such basic proofs were accomplished
    ◦ But even they were long and easy to get lost in

Page 18: Formally Verifying a File System:  A Successful Failure

What we want from a better solution

We want to write ADT style code
We want to write ADT style proofs
We want to push a button and have
  ◦ Serialized style code
  ◦ Serialized style proofs
Is it Possible???

Page 19: Formally Verifying a File System:  A Successful Failure

What a solution would look like

Serialization Theorems
  ◦ Example: deserialize(serialize(n)) = n
  ◦ Fairly easy to prove
    ◦ Already done
    ◦ Even grind could do it
Proof that changing one value doesn't affect other values
  ◦ Hmm...
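The two kinds of theorem can be illustrated side by side. The encoding is an assumption (fixed-width, 4-byte, big-endian unsigned integers), as are the slot helpers `write_slot` and `read_slot`; the slide only names `serialize` and `deserialize`.

```python
WIDTH = 4  # assumed fixed width in bytes

def serialize(n):
    """Encode a non-negative integer below 2**32 as WIDTH big-endian bytes."""
    return n.to_bytes(WIDTH, "big")

def deserialize(b):
    """Decode WIDTH big-endian bytes back to an integer."""
    return int.from_bytes(b, "big")

def write_slot(i, n, store):
    """Return a new store with slot i holding the serialized form of n."""
    return store[:i * WIDTH] + serialize(n) + store[(i + 1) * WIDTH:]

def read_slot(i, store):
    """Deserialize the value held in slot i."""
    return deserialize(store[i * WIDTH:(i + 1) * WIDTH])
```

The round trip `deserialize(serialize(n)) == n` mentions a single value. The independence claim, `read_slot(j, write_slot(i, n, store)) == read_slot(j, store)` for `i != j`, mentions the whole store, and that is the kind of statement that multiplies across every effect and every assumption.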

Page 20: Formally Verifying a File System:  A Successful Failure

Proof of effect independence

Language Run-Time for ADT is already doing this
  ◦ Objects are serialized to memory
Language Run-Time Limitations
  ◦ Language vs Programmer control of serialization
  ◦ The Garbage collector
    ◦ Known Hard Problem
    ◦ Bad Idea on a Hard Disk

Page 21: Formally Verifying a File System:  A Successful Failure

How to avoid GC

We don't need general GC
Side-effect view:
  ◦ Values only “modified” if only reference
    ◦ Or not reachable from values used in theorems
ADT view:
  ◦ Values only “allocated” if we are “freeing” another value
Solution: ...

Page 22: Formally Verifying a File System:  A Successful Failure

Linear Types!!!

Page 23: Formally Verifying a File System:  A Successful Failure

What are Linear Types?

Objects must always have exactly one reference
  ◦ No duplication
  ◦ No erasure
No GC needed
  ◦ Look Ma, No Garbage!
  ◦ “Modifying” something is “de-alloc” plus “alloc”
Our algorithms already treat objects as linear
Just need to teach PVS to take advantage of that
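Python has no linear types, so the following is only a run-time imitation of the discipline, with a hypothetical `Linear` wrapper and a hypothetical `write_block`. It catches duplication (a second use of a handle) but cannot catch erasure (a handle that is never used), which a real linear type system would also reject.

```python
class Linear:
    """A handle that may be consumed only once (checked at run time, not by a type system)."""

    def __init__(self, value):
        self._value = value
        self._live = True

    def consume(self):
        """Give up the handle and return the value it held."""
        if not self._live:
            raise RuntimeError("linear value used twice")
        self._live = False
        return self._value

def write_block(n, data, disk):
    """Consume the old disk handle and return a handle to the updated disk."""
    blocks = disk.consume()   # "de-alloc": the old disk can no longer be observed
    blocks[n] = data          # so updating in place cannot invalidate anything
    return Linear(blocks)     # "alloc": the single reference to the new disk
```

Because the old handle is dead after the write, no surviving assumption can refer to the pre-write disk, which is the guarantee the talk hopes to exploit in proofs.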

Page 24: Formally Verifying a File System:  A Successful Failure

Linear Types vs. Monads

Lost the battle of representing state to monads
Maybe could win the war for formal proofs
Pros and Cons
  ◦ Monads are more General
    ◦ Non-determinism, environments, etc.
  ◦ Linear Types provide more guarantees
    ◦ A reference to a linearly typed object is guaranteed to be the only reference

Page 25: Formally Verifying a File System:  A Successful Failure

Recap

File Systems are Full of Bugs
  ◦ But it is critical that they be right
  ◦ Verification could fix this
We designed and implemented a File System
  ◦ B-Tree based
  ◦ Modelled all the way to “disk”
  ◦ Auxiliary structures needed
    ◦ Free List
    ◦ File Blocks
    ◦ Root File System Block

Page 26: Formally Verifying a File System:  A Successful Failure

Recap

We proved linear search
  ◦ Lesson: Make sure measures are well-founded
  ◦ Lesson: Make measures well-founded if they aren't
Well-formedness
  ◦ Red herring
  ◦ Actually defined by observers
Exponential blow-up due to let
  ◦ Possible Improvement in how PVS presents sequents

Page 27: Formally Verifying a File System:  A Successful Failure

Recap

Side effects are hard in an unexpected way
  ◦ Implementing side effects in PVS is easy
    ◦ Use State Passing Style (e.g. State monad)
  ◦ Proving side effects in a serialized common store is hard
    ◦ Must prove that every effect keeps the theorems true
Number of Proofs exploded beyond our ability

Page 28: Formally Verifying a File System:  A Successful Failure

Recap

Linear Types to the Rescue!!
  ◦ User writes ADT style proofs
  ◦ System converts them to serialized proofs
  ◦ Better than Monads
Need Theory for Linear Types in PVS

Page 29: Formally Verifying a File System:  A Successful Failure

Final Results

Ultimately had to declare failure
  ◦ Code is fragmentary
But learned more from failure than success
  ◦ Main deliverable is report and what not to do
  ◦ We have good ideas for how to make future attempts
... and we don't feel too bad because others have estimated verifying a file system to take 2-3 years to accomplish.
  ◦ A mini-challenge: build a verifiable filesystem. Rajeev Joshi, Gerard J. Holzmann