Bit Vector Decision Procedures
A Basis for Reasoning about Hardware & Software
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
– 2 –
Collaborators
  Sanjit Seshia, Bryan Brady (UC Berkeley)
  Daniel Kroening (ETH Zurich)
  Joel Ouaknine (Oxford University)
  Ofer Strichman (Technion)
  Randy Bryant (Carnegie Mellon)
3 continents, 4 countries, 5 institutions, 6 authors
– 3 –
Motivating Example #1
Do these functions produce identical results?
Strategy
  Represent and reason about bit-level program behavior
  Specific to machine word size, integer representations, and operations
int abs(int x) {
    int mask = x>>31;
    return (x ^ mask) + ~mask + 1;
}
int test_abs(int x) {
    return (x < 0) ? -x : x;
}
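As a rough illustration of the question on this slide, the sketch below models both functions on 32-bit unsigned words and compares them by enumeration. The names abs_bits, abs_ref, and count_abs_mismatches are hypothetical, and the model assumes two's complement with an arithmetic right shift; a decision procedure would establish the same fact symbolically rather than by testing.

```c
#include <stdint.h>

/* Bit-level model of abs(): the mask is all ones for negative inputs,
   which is what an arithmetic x>>31 yields on a 32-bit machine. */
uint32_t abs_bits(uint32_t x) {
    uint32_t mask = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return (x ^ mask) + ~mask + 1u;
}

/* Bit-level model of test_abs(): negate modulo 2^32 when the sign bit is set. */
uint32_t abs_ref(uint32_t x) {
    return (x & 0x80000000u) ? (0u - x) : x;
}

/* Count disagreements over the inputs 0, stride, 2*stride, ... plus a few
   boundary values. A stride of 1 checks every 32-bit input. */
unsigned long count_abs_mismatches(uint32_t stride) {
    static const uint32_t edge[] = {0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
    unsigned long bad = 0;
    uint32_t x = 0;
    if (stride == 0) stride = 1;
    for (unsigned i = 0; i < sizeof edge / sizeof edge[0]; i++)
        if (abs_bits(edge[i]) != abs_ref(edge[i])) bad++;
    do {
        if (abs_bits(x) != abs_ref(x)) bad++;
        x += stride;
    } while (x >= stride);   /* x < stride means the counter wrapped around */
    return bad;
}
```

Calling count_abs_mismatches(1) covers all 2^32 inputs; a larger stride gives a quick sample.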
– 4 –
Motivating Example #2
Is there an input string that causes value 234 to be written to address a4a3a2a1?
void fun() {
    char fmt[16];
    fgets(fmt, 16, stdin);
    fmt[15] = '\0';
    printf(fmt);
}
Answer
  Yes: "a1a2a3a4%230g%n"
  Depends on details of compilation
  But no exploit for buffer size less than 8
[Ganapathy, Seshia, Jha, Reps, Bryant, ICSE ’05]
– 5 –
Motivating Example #3
Is there a way to expand the program sketch to make it match the spec?
bit[W] popSpec(bit[W] x) {
    int cnt = 0;
    for (int i=0; i<W; i++) {
        if (x[i]) cnt++;
    }
    return cnt;
}
bit[W] popSketch(bit[W] x) {
    loop (??) {
        x = (x&??) + ((x>>??)&??);
    }
    return x;
}
Answer
  W=16:
    x = (x&0x5555) + ((x>>1)&0x5555);
    x = (x&0x3333) + ((x>>2)&0x3333);
    x = (x&0x0077) + ((x>>8)&0x0077);
    x = (x&0x000f) + ((x>>4)&0x000f);
[Solar-Lezama, et al., ASPLOS ’06]
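The W=16 answer can be checked against the spec by brute force, since there are only 65,536 inputs. The following is a minimal sketch in plain C rather than in the sketching language; pop_spec16, pop_sketch16, and count_pop_mismatches are hypothetical names.

```c
#include <stdint.h>

/* Reference: count set bits one at a time, as in popSpec with W = 16. */
unsigned pop_spec16(uint16_t x) {
    unsigned cnt = 0;
    for (int i = 0; i < 16; i++)
        if ((x >> i) & 1u) cnt++;
    return cnt;
}

/* The four synthesized steps, in the order given above. */
unsigned pop_sketch16(uint16_t x0) {
    unsigned x = x0;
    x = (x & 0x5555u) + ((x >> 1) & 0x5555u);   /* 2-bit partial counts */
    x = (x & 0x3333u) + ((x >> 2) & 0x3333u);   /* 4-bit partial counts */
    x = (x & 0x0077u) + ((x >> 8) & 0x0077u);   /* add upper byte's counts to lower byte's */
    x = (x & 0x000fu) + ((x >> 4) & 0x000fu);   /* add the two remaining nibbles */
    return x;
}

/* Number of 16-bit inputs on which the two functions disagree. */
unsigned long count_pop_mismatches(void) {
    unsigned long bad = 0;
    for (unsigned v = 0; v <= 0xFFFFu; v++)
        if (pop_spec16((uint16_t)v) != pop_sketch16((uint16_t)v)) bad++;
    return bad;
}
```

The masks 0x0077 in the third step are enough because each nibble count is at most 4 after the second step.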
– 6 –
Motivating Example #4
Is pipelined microprocessor identical to sequential reference model?
Strategy
  Automatically generate abstraction function from pipeline to program state [Burch & Dill, CAV ’94]
  Represent machine instructions, data, and state as bit vectors
    Compatible with hardware description language representation
[Figure: block diagrams of the Pipelined Microprocessor (PC, +4, Instr Mem, Reg. File, ALU, Control, with IF/ID, ID/EX, and EX/WB pipeline registers) and the Sequential Reference Model (Instr Mem, +4, Reg. File, ALU, Control)]
– 7 –
Task
Bit Vector Formulas
  Fixed width data words
  Arithmetic operations
    E.g., add/subtract/multiply/divide & comparisons
    Two’s complement, unsigned, …
  Bit-wise logical operations
    E.g., and/or/xor, shift/extract and equality
  Boolean connectives
Reason About Hardware & Software at Bit Level
  Formal verification
  Security analysis
  Test & program generation
    What function arguments will cause program to take specified branch?
– 8 –
Decision Procedures
  Core technology for formal reasoning
Boolean SAT
  Pure Boolean formula
SAT Modulo Theories (SMT)
  Support additional logic fragments
  Example theories
    Linear arithmetic over reals or integers
    Functions with equality
    Bit vectors
    Combinations of theories
[Figure: Formula → Decision Procedure → Satisfying solution, or Unsatisfiable (+ proof)]
– 9 –
Recent Progress in SAT Solving
[Figure: bar chart of run-time (sec.) for Grasp (2000), zChaff (2001), BerkMin (2002), zChaff (2003-04), Siege (2004), and SatEliteGTI (2005); bar labels 3600, 766, 147, 118, 81, 46]
– 10 –
BV Decision Procedures: Some History
B.C. (Before Chaff)
  String operations (concatenate, field extraction)
  Linear arithmetic with bounds checking
  Modular arithmetic
SAT-Based “Bit Blasting”
  Generate Boolean circuit based on bit-level behavior of operations
  Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker
  Handles arbitrary operations
  Effective in many applications
    CBMC [Clarke, Kroening, Lerda, TACAS ’04]
    Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]
    CVC-Lite [Dill, Barrett, Ganesh], Yices [deMoura, et al], STP
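To make "generate Boolean circuit based on bit-level behavior" concrete, here is a minimal sketch of a word-level addition expanded into one-bit full adders. It only evaluates the circuit; a real bit blaster would emit CNF clauses for each gate and hand them to a SAT solver. The names full_adder, blasted_add, and count_add_mismatches are hypothetical.

```c
#include <stdint.h>

/* One full adder written with Boolean connectives only; after bit blasting,
   each output would become a Boolean variable constrained by a few clauses. */
static void full_adder(int a, int b, int cin, int *sum, int *cout) {
    *sum  = a ^ b ^ cin;
    *cout = (a & b) | (cin & (a ^ b));
}

/* Ripple-carry circuit for addition modulo 2^width (width <= 32). */
uint32_t blasted_add(uint32_t x, uint32_t y, int width) {
    uint32_t result = 0;
    int carry = 0;
    for (int i = 0; i < width; i++) {
        int s, c;
        full_adder((int)((x >> i) & 1u), (int)((y >> i) & 1u), carry, &s, &c);
        result |= (uint32_t)s << i;
        carry = c;   /* the final carry is dropped: fixed-width arithmetic */
    }
    return result;
}

/* Compare the circuit with word-level addition on all 8-bit operand pairs. */
unsigned long count_add_mismatches(void) {
    unsigned long bad = 0;
    for (uint32_t x = 0; x < 256; x++)
        for (uint32_t y = 0; y < 256; y++)
            if (blasted_add(x, y, 8) != ((x + y) & 0xFFu)) bad++;
    return bad;
}
```

The circuit grows linearly with the word width for addition; multiplication and division expand into far larger circuits.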
– 11 –
Research Challenge
Is there a better way than bit blasting?
Requirements
  Provide same functionality as with bit blasting
  Find abstractions based on word-level structure
  Improve on performance of bit blasting
A New Approach
  [Bryant, Kroening, Ouaknine, Seshia, Strichman, Brady, TACAS ’07]
  Use bit blasting as core technique
  Apply to simplified versions of formula
  Successive approximations until formula is solved or shown unsatisfiable
– 12 –
Approximating Formula
Example Approximation Techniques
  Underapproximating
    Restrict word-level variables to smaller ranges of values
  Overapproximating
    Replace subformula with Boolean variable
Original formula: φ
Overapproximation φ⁺: more solutions
  If φ⁺ is unsatisfiable, then so is φ
Underapproximation φ⁻: fewer solutions
  A solution satisfying φ⁻ also satisfies φ
– 13 –
Starting Iterations
Initial Underapproximation φ₁⁻
  (Greatly) restrict ranges of word-level variables
  Intuition: Satisfiable formula often has small-domain solution
– 14 –
First Half of Iteration
SAT Result for φ₁⁻
  Satisfiable
    Then have found solution for φ
  Unsatisfiable
    Use UNSAT proof to generate overapproximation φ₁⁺ (described later)
[Figure: φ₁⁻: if SAT, then done; otherwise the UNSAT proof is used to generate overapproximation φ₁⁺]
– 15 –
Second Half of Iteration
SAT Result for φ₁⁺
  Unsatisfiable
    Then have shown φ unsatisfiable
  Satisfiable
    Solution indicates variable ranges that must be expanded
    Generate refined underapproximation φ₂⁻
[Figure: φ₁⁺: if UNSAT, then done; if SAT, use solution to generate refined underapproximation φ₂⁻]
– 16 –
Iterative Behavior
Underapproximations
  Successively more precise abstractions of φ
  Allow wider variable ranges
Overapproximations
  No predictable relation
  UNSAT proof not unique
[Figure: sequence φ₁⁻, φ₁⁺, φ₂⁻, φ₂⁺, …, φₖ⁻, φₖ⁺]
– 17 –
Overall Effect
Soundness
  Only terminate with solution on underapproximation
  Only terminate as UNSAT on overapproximation
Completeness
  Successive underapproximations approach φ
  Finite variable ranges guarantee termination
    In worst case, get φₖ⁻ = φ
[Figure: sequence φ₁⁻, φ₁⁺, φ₂⁻, φ₂⁺, …, φₖ⁻, φₖ⁺, with SAT exits from the underapproximations and UNSAT exits from the overapproximations]
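The completeness argument can be illustrated with a much-simplified sketch. It keeps only the underapproximation half of the loop: a brute-force search stands in for the SAT call, and the range is widened by one bit per round instead of being guided by a proof-derived overapproximation. All names (formula_fn, phi_product, phi_parity, solve_restricted, solve_by_widening) are hypothetical, and the formulas are toy examples over two 8-bit words.

```c
#include <stdint.h>

typedef int (*formula_fn)(uint8_t x, uint8_t y);   /* hypothetical formula type */

/* Example formulas over two 8-bit words (arithmetic modulo 256). */
int phi_product(uint8_t x, uint8_t y) {
    return x > 1 && y > 1 && (uint8_t)(x * y) == 35;
}
int phi_parity(uint8_t x, uint8_t y) {
    return (uint8_t)(2 * x + 4 * y) == 1;   /* an even value never equals 1 */
}

/* Stand-in for the SAT call on an underapproximation: search only the
   values representable in `bits` bits. */
static int solve_restricted(formula_fn phi, int bits, uint8_t *x, uint8_t *y) {
    unsigned limit = 1u << bits;
    for (unsigned a = 0; a < limit; a++)
        for (unsigned b = 0; b < limit; b++)
            if (phi((uint8_t)a, (uint8_t)b)) {
                *x = (uint8_t)a;
                *y = (uint8_t)b;
                return 1;
            }
    return 0;
}

/* Returns the width at which a solution was found, or 0 if the formula is
   unsatisfiable. At 8 bits the underapproximation coincides with the
   original formula, so answering UNSAT there is sound. */
int solve_by_widening(formula_fn phi, uint8_t *x, uint8_t *y) {
    for (int bits = 1; bits <= 8; bits++)
        if (solve_restricted(phi, bits, x, y)) return bits;
    return 0;
}
```

In this toy version an unsatisfiable formula always costs the full-width search; the overapproximation step described on these slides is what lets the actual procedure stop early with UNSAT.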
– 18 –
Generating Overapproximation
Given
  Underapproximation φ₁⁻
  Bit-blasted translation of φ₁⁻ into Boolean formula
  Proof that Boolean formula unsatisfiable
Generate
  Overapproximation φ₁⁺
  If φ₁⁺ satisfiable, must lead to refined underapproximation
    Generate φ₂⁻ such that φ₁⁻ ⇒ φ₂⁻ ⇒ φ
[Figure: φ₁⁻ → (UNSAT proof: generate overapproximation) → φ₁⁺ → φ₂⁻]
– 19 –
Bit-Vector Formula Structure
  DAG representation to allow shared subformulas
[Figure: DAG for φ built with ∧, ∨, and ¬ nodes over the Boolean variable a and the predicates x = y, w & 0xFFFF = x, x % 26 = v, and a comparison of x + 2 z with 1]
– 20 –
Structure of Underapproximation
  Linear complexity translation to CNF
    Each word-level variable encoded as set of Boolean variables
    Additional Boolean variables represent subformula values
[Figure: the DAG for φ conjoined (∧) with range constraints on w, x, y, z, giving φ⁻]
– 21 –
Encoding Range Constraints
Explicit
  View as additional predicates in formula
Implicit
  Reduce number of variables in encoding
    Constraint      Encoding
    0 ≤ w < 8       0 0 0 ··· 0 w2 w1 w0
    −4 ≤ x < 4      xs xs xs ··· xs xs x1 x0
  Yields smaller SAT encodings
[Figure: range constraints 0 ≤ w < 8 and −4 ≤ x < 4 attached to variables w and x]
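The implicit encoding can be read as zero extension for w and sign extension for x: three Boolean variables determine the whole word, so the range constraint needs no extra clauses. A small sketch, with hypothetical function names, that decodes the three free bits into word values:

```c
#include <stdint.h>

/* 0 <= w < 8: upper bits are fixed to 0, so only w2 w1 w0 are free. */
uint32_t decode_unsigned3(unsigned bits3) {
    return bits3 & 7u;
}

/* -4 <= x < 4: x1 x0 are free and every higher bit repeats the sign bit xs. */
int32_t decode_signed3(unsigned bits3) {
    int32_t low  = (int32_t)(bits3 & 3u);          /* x1 x0 */
    int32_t sign = (int32_t)((bits3 >> 2) & 1u);   /* xs */
    return low - 4 * sign;   /* value of the sign-extended two's complement word */
}

/* Returns 1 if every assignment to the three bits lands inside both ranges. */
int ranges_hold(void) {
    for (unsigned b = 0; b < 8; b++) {
        int32_t v = decode_signed3(b);
        if (decode_unsigned3(b) >= 8u) return 0;
        if (v < -4 || v >= 4) return 0;
    }
    return 1;
}
```

Widening a range during refinement then amounts to freeing one more bit position instead of rewriting a predicate.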
– 22 –
UNSAT Proof
  Subset of clauses that is unsatisfiable
  Clause variables define portion of DAG
  Subgraph that cannot be satisfied with given range constraints
[Figure: the DAG for φ conjoined (∧) with range constraints on w, x, y, z, with the portion covered by the UNSAT proof highlighted]
– 23 –
Generated Overapproximation
  Identify subformulas containing no variables from UNSAT proof
  Replace by fresh Boolean variables
  Remove range constraints on word-level variables
  Creates overapproximation
    Ignores correlations between values of subformulas
[Figure: DAG for φ₁⁺, keeping a, x = y, and the comparison of x + 2 z with 1, with the other subformulas replaced by fresh Boolean variables b1 and b2]
– 24 –
Refinement Property
Claim
  φ₁⁺ has no solutions that satisfy φ₁⁻’s range constraints
    Because φ₁⁺ contains portion of φ₁⁻ that was shown to be unsatisfiable under range constraints
[Figure: DAG for φ₁⁺ conjoined (∧) with the range constraints on w, x, y, z is UNSAT]
– 25 –
Refinement Property (Cont.)
Consequence
  Solving φ₁⁺ will expand range of some variables
  Leading to more exact underapproximation φ₂⁻
[Figure: DAG for φ₁⁺ with fresh Boolean variables b1 and b2, without range constraints]
– 26 –
Effect of Iteration
Each Complete Iteration
  Expands ranges of some word-level variables
  Creates refined underapproximation
[Figure: φ₁⁻ → (UNSAT proof: generate overapproximation) → φ₁⁺ → (SAT: use solution to generate refined underapproximation) → φ₂⁻]
– 27 –
Approximation Methods
So Far
  Range constraints
    Underapproximate by constraining values of word-level variables
  Subformula elimination
    Overapproximate by assuming subformula value arbitrary
General Requirements
  Systematic under- and over-approximations
  Way to connect from one to another
Goal: Devise Additional Approximation Strategies
– 28 –
Function Approximation Example
Motivation
  Multiplication (and division) are difficult cases for SAT
§: Prohibited
  Gives underapproximation
  Restricts values of (possibly intermediate) terms
§: f(x, y)
  Overapproximate as uninterpreted function f
  Value constrained only by functional consistency
Approximation of x * y:
              x = 0    x = 1    x = else
  y = 0         0        0        0
  y = 1         0        1        x
  y = else      0        y        §
Results: UCLID BV vs. Bit-blasting
  UCLID always better than bit blasting
  Generally better than other available procedures
  SAT time is the dominating factor
[results on 2.8 GHz Xeon, 2 GB RAM]
– 30 –
UCLID BV run-time analysis
  wi: Maximum word-level variable size
  si: Maximum word-level variable instantiation
  Generated abstractions are small
  Few iterations of refinement loop needed
– 31 –
Why This Work is Worthwhile
Realistic Semantic Model for Hardware and Software
  Captures all details of actual operation
    Detects errors related to overflow and other artifacts of finite representation
    Allows mixing of integer and bit-level operations
  Can capture many abstractions that are currently applied manually
SAT-Based Methods Are Only Logical Choice
  Bit blasting is only way to capture full set of operations
  SAT solvers are good & getting better
Abstraction / Refinement Allows Better Scaling
  Take advantage of cases where formula easily satisfied or disproven
– 32 –
Future Work
Lots of Refinement & Tuning
  Selecting under- and over-approximations
  Iterating within under- or over-approximation
    E.g., attempt to control variable ranges when overapproximating
  Reusing portions of bit-blasted formulas
    Take advantage of incremental SAT
Additional Abstractions
  More general use of functional abstraction
    Subsume use of uninterpreted functions in current verification methods