Syntax-Guided Synthesis
Rajeev Alur
Joint work with R. Bodik, G. Juniwal, M. Martin, M. Raghothaman, S. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, A. Udupa
Program Verification
Does a program P meet its specification φ, where φ is written as a logical formula?
Motivation: Correctness of systems, finding bugs
Program verification is hard!
Formalizing a structured program into logical formulas
Using tools (SMT solvers) to verify whether the formalized program meets its specification.
SMT-LIB: common standards and a library of benchmarks for SMT solvers.
Program Synthesis
Automatically synthesize a program P that satisfies a given specification φ
Can potentially have greater impact than program verification
Program synthesis is hard!
Let’s provide a syntactic template for the program: Syntax-Guided Synthesis (SyGuS)
Work on special cases already exists (e.g. Sketch 2008)
Let’s build a common standard and benchmarks for SyGuS solvers (SYNTH-LIB)
Talk Outline
Background: SMT Solvers
Formalization of SyGuS
Solution Strategies
Conclusions + SyGuS Competition
What is SMT?
Satisfiability Modulo Theories
Magnus Madsen
Recall SAT
The Boolean SATisfiability Problem:
• A=TRUE, =FALSE, =FALSE
literal or negated literal
Recall SAT
• SAT is NP-complete (all known algorithms take exponential time in the worst case)
• Many SAT solvers exist: DPLL (1962), Chaff (2001), MiniSAT (2004)
• Some do remarkably well.
What is an SMT instance?
A logical formula built using:
– negation, conjunction and disjunction
– theory-specific operators, e.g. from the theory of integers or the theory of bitwise operators
Q: Why not encode every formula in SAT?
A: Theory solvers have very efficient algorithms
Graph Problems:
• Shortest-Path
• Minimum Spanning Tree
Optimization:
• Max-Flow
• Linear Programming
(just to name a few)
Q: But then, why not get rid of the SAT solver?
A: SAT solvers have been studied for a long time
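SMT Solver: SAT solver + Theory solver
Formula: x ≥ 3 ∧ (x ≤ 0 ∨ y ≥ 0)
Boolean abstraction given to the SAT solver: a ∧ (b ∨ c)
1. SAT solver: YES, proposes a ∧ b
2. Theory solver checks x ≥ 3 ∧ x ≤ 0: NO, so a clause is added to rule this assignment out
3. SAT solver: YES, proposes a ∧ c
4. Theory solver checks x ≥ 3 ∧ y ≥ 0: YES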
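The interaction between the SAT solver and the theory solver can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the names ATOMS, sat_step, theory_step and smt_solve are hypothetical, the "SAT solver" is plain enumeration of truth assignments, the "theory solver" only understands integer bounds of the form var >= k or var <= k, and a failed assignment is blocked as a whole rather than by a learned clause. Real SMT solvers are far more refined.

```python
import math
from itertools import product

# Theory atom behind each Boolean variable: (integer variable, operator, constant).
ATOMS = {"a": ("x", ">=", 3), "b": ("x", "<=", 0), "c": ("y", ">=", 0)}

def skeleton(m):
    """Boolean abstraction a AND (b OR c) of the formula."""
    return m["a"] and (m["b"] or m["c"])

def sat_step(blocked):
    """Toy SAT solver: first assignment satisfying the skeleton that is not blocked."""
    names = sorted(ATOMS)
    for values in product([True, False], repeat=len(names)):
        m = dict(zip(names, values))
        if skeleton(m) and m not in blocked:
            return m
    return None

def theory_step(m):
    """Toy theory solver: are the integer bounds implied by m consistent?"""
    lo, hi = {}, {}
    for name, truth in m.items():
        var, op, k = ATOMS[name]
        if (op == ">=") == truth:
            # var >= k holds, or var <= k is false (so var >= k + 1)
            bound = k if op == ">=" else k + 1
            lo[var] = max(lo.get(var, -math.inf), bound)
        else:
            # var <= k holds, or var >= k is false (so var <= k - 1)
            bound = k if op == "<=" else k - 1
            hi[var] = min(hi.get(var, math.inf), bound)
    return all(lo.get(v, -math.inf) <= hi.get(v, math.inf)
               for v in set(lo) | set(hi))

def smt_solve():
    """Alternate SAT and theory steps; block assignments the theory rejects."""
    blocked = []
    while True:
        m = sat_step(blocked)
        if m is None:
            return None, blocked      # Boolean skeleton exhausted: unsatisfiable
        if theory_step(m):
            return m, blocked         # theory-consistent assignment found
        blocked.append(m)             # plays the role of "add clause"

model, rejected = smt_solve()
```

In this sketch the two assignments that set both a and b to true are rejected by the theory step (x ≥ 3 and x ≤ 0 conflict) before an assignment with a and c true is accepted.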
Theories
Theory of:
– Difference Arithmetic
– Linear Arithmetic
– Arrays
– Bit Vectors
– Algebraic Datatypes
– Uninterpreted Functions
C. Barrett & S. A. Seshia ICCAD 2009 Tutorial 13
Equivalence Checking of Program Fragments
int fun1(int y) {
    int x, z;
    z = y;
    y = x;
    x = z;
    return x*x;
}
int fun2(int y) {
    return y*y;
}
What if we use SAT to check equivalence?
SMT formula (satisfiable iff the programs are non-equivalent):
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
Using SAT to check equivalence (w/ MiniSAT):
• 32 bits for y: did not finish in over 5 hours
• 16 bits for y: 37 sec.
• 8 bits for y: 0.5 sec.
Replacing multiplication with an uninterpreted function sq gives the SMT formula:
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = sq(x1)) ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
Using EUF solver: 0.01 sec
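The formula is unsatisfiable because x1 = z = y, so ret1 and ret2 apply the same function to equal arguments; an EUF solver needs only this congruence step. For contrast, here is a minimal sketch of what a bit-level check has to cover when nothing is abstracted: it enumerates every pair of inputs at a given bit width. The function names are hypothetical, the uninitialised local x of fun1 is modelled as an extra input, and wrap-around arithmetic at the chosen width is an assumption. This is explicit enumeration, not how MiniSAT searches; it only illustrates why cost grows quickly with the number of bits.

```python
from itertools import product

def fun1(y, x, mask):
    # Mirrors the C fragment: z = y; y = x; x = z; return x*x (truncated to the bit width).
    z = y
    y = x
    x = z
    return (x * x) & mask

def fun2(y, mask):
    return (y * y) & mask

def find_difference(bits):
    """Return an input (y, x) on which the fragments differ, or None if there is none."""
    mask = (1 << bits) - 1
    for y, x in product(range(1 << bits), repeat=2):
        if fun1(y, x, mask) != fun2(y, mask):
            return (y, x)
    return None

result_4_bits = find_difference(4)   # 2**8 input pairs
result_8_bits = find_difference(8)   # 2**16 input pairs
```

Each extra bit doubles the range of each input, so the number of pairs checked grows as 2^(2·bits).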
Verification vs. Synthesis
Verification:
• Program Verification: Does P meet spec φ?
• SMT: Is φ satisfiable?
• SMT-LIB/SMT-COMP: standard API, solver competition
Synthesis:
• Program Synthesis: Find P that meets spec φ
• Syntax-Guided Synthesis
• Plan for SyGuS-Comp
Talk Outline
Formalization of SyGuS
Solution Strategies
Conclusions + SyGuS Competition
Syntax-Guided Synthesis (SyGuS) Problem
Fix a background theory T: fixes types and operations
Function to be synthesized: name f along with its type
Inputs to SyGuS problem:
• Specification φ (semantic constraint): typed formula using symbols in T + symbol f
• Set E of expressions given by a context-free grammar (syntactic constraint): set of candidate expressions that use symbols in T
Computational problem: Output e in E such that φ[f/e] is valid (in theory T)
SyGuS Example
Theory QF-LIA:
• Types: Integers and Booleans
• Logical connectives, Conditionals, and Linear arithmetic
• Quantifier-free formulas
Function to be synthesized: f (int x, int y) : int
Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)
Candidate Implementations: Linear expressions
LinExp := x | y | Const | LinExp + LinExp | LinExp - LinExp
No solution exists: the specification forces f(x,y) = max(x,y), which is not a linear expression in x and y.
SyGuS Example
Theory QF-LIA
Function to be synthesized: f (int x, int y) : int
Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)
Candidate Implementations: Conditional expressions with comparisons
Term := x | y | Const | If-Then-Else (Cond, Term, Term)
Cond := Term <= Term | Cond & Cond | ~ Cond | (Cond)
Possible solution: If-Then-Else (x ≤ y, y, x)
… Solving SyGuS is hard!
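To make the specification concrete, here is a minimal sketch that evaluates it on candidate implementations over a small grid of integers. The helper names (spec, first_violation) and the bound of 5 are assumptions made for illustration. A bounded check like this can refute a candidate but cannot prove validity; a real SyGuS solver would ask an SMT solver instead.

```python
from itertools import product

def spec(f, x, y):
    """(x <= f(x,y)) & (y <= f(x,y)) & (f(x,y) = x | f(x,y) = y)"""
    r = f(x, y)
    return x <= r and y <= r and (r == x or r == y)

def first_violation(f, bound=5):
    """First (x, y) in [-bound, bound]^2 where f breaks the spec, else None."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if not spec(f, x, y):
            return (x, y)
    return None

# A few members of the linear grammar LinExp.
linear_candidates = {
    "x": lambda x, y: x,
    "y": lambda x, y: y,
    "x + y": lambda x, y: x + y,
    "x - y": lambda x, y: x - y,
    "x + 1": lambda x, y: x + 1,
}

# The conditional solution If-Then-Else (x <= y, y, x).
ite_candidate = lambda x, y: y if x <= y else x

linear_violations = {name: first_violation(f) for name, f in linear_candidates.items()}
ite_violation = first_violation(ite_candidate)
```

Every sampled linear candidate is refuted by some input in the grid, while the conditional expression passes on all of them.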
Talk Outline
Solution Strategies
Conclusions + SyGuS Competition
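Solving SyGuS as Active Learning: Counter-Example Guided Inductive Synthesis
[Diagram: starting from initial examples I, the Learning Algorithm sends a Candidate Expression to the Verification Oracle. The oracle either reports Success or returns a Counterexample to the learner. The learner reports Fail when it cannot produce a candidate.]
Concept class: Set E of expressions
Examples: Concrete input values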
CEGIS Example
Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)
Set E: All expressions built from x, y, 0, 1, Comparison, +, If-Then-Else
• Examples = { }. Candidate: f(x,y) = x. Counterexample from the Verification Oracle: (x=0, y=1)
• Examples = {(x=0, y=1)}. Candidate: f(x,y) = y. Counterexample from the Verification Oracle: (x=1, y=0)
• Examples = {(x=0, y=1), (x=1, y=0), (x=0, y=0), (x=1, y=1)}. Candidate: ITE (x ≤ y, y, x). Success
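The CEGIS loop from this example can be sketched as follows. This is a simplified illustration, not the talk's implementation: the concept class is a short hard-coded list (CANDIDATES, a hypothetical name) standing in for the grammar, and the verification oracle is a bounded exhaustive check over [-3, 3]², an assumption replacing the SMT query a real solver would make. The counterexamples it finds therefore differ from the ones on the slide.

```python
from itertools import product

def spec(f, x, y):
    r = f(x, y)
    return x <= r and y <= r and (r == x or r == y)

# Tiny stand-in for the set E, ordered roughly by size.
CANDIDATES = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x + y", lambda x, y: x + y),
    ("ITE(x <= y, y, x)", lambda x, y: y if x <= y else x),
]

def learn(examples):
    """Learning algorithm: first candidate meeting the spec on all examples."""
    for name, f in CANDIDATES:
        if all(spec(f, x, y) for x, y in examples):
            return name, f
    return None

def verify(f, bound=3):
    """Verification oracle (bounded stand-in): a counterexample or None."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if not spec(f, x, y):
            return (x, y)
    return None

def cegis():
    examples = []
    while True:
        candidate = learn(examples)
        if candidate is None:
            return None, examples          # Fail: nothing in E fits the examples
        name, f = candidate
        counterexample = verify(f)
        if counterexample is None:
            return name, examples          # Success
        examples.append(counterexample)    # refine and try again

solution, examples_used = cegis()
```

Each counterexample rules out the candidate that produced it, so with a finite candidate list the loop always terminates.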
SyGuS Solutions
CEGIS approach (Solar-Lezama, Seshia et al)
Related work: Similar strategies for solving quantified formulas and invariant generation
Coming up: Learning strategies based on:
• Enumerative (search with pruning): Udupa et al (PLDI’13)
• Symbolic (solving constraints): Gulwani et al (PLDI’11)
• Stochastic (probabilistic walk): Schkufza et al (ASPLOS’13)
Enumerative Learning
Find an expression consistent with a given set of concrete examples
Enumerate expressions in increasing size, and evaluate each expression on all concrete inputs to check consistency
Key optimization for efficient pruning of search space (examples):
• Expressions e1 and e2 are equivalent if e1(a,b) = e2(a,b) on all concrete values (x=a, y=b) in Examples
• Only one representative among equivalent subexpressions needs to be considered for building larger expressions
Fast and robust for learning expressions with ~15 nodes
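The pruning idea can be sketched as below: each expression is summarised by its signature (its tuple of values on the examples), and only the first expression found with a given signature is kept for building larger ones. This is an illustrative sketch for the max specification only; the grammar is restricted to x, y, 0, 1, + and ITE with a <= condition, and all names (LEAVES, meets_spec, enumerate_consistent) are hypothetical.

```python
from itertools import product

LEAVES = [("x", lambda x, y: x), ("y", lambda x, y: y),
          ("0", lambda x, y: 0), ("1", lambda x, y: 1)]

def meets_spec(examples, values):
    """Spec for max, checked pointwise on the candidate's values."""
    return all(x <= r and y <= r and r in (x, y)
               for (x, y), r in zip(examples, values))

def enumerate_consistent(examples, max_size=6):
    seen = set()      # signatures that already have a representative
    by_size = {}      # size (node count) -> list of (signature, text)
    for size in range(1, max_size + 1):
        fresh = []
        if size == 1:
            fresh = [(tuple(fn(x, y) for x, y in examples), text)
                     for text, fn in LEAVES]
        else:
            # a + b: one node for +, the rest split between a and b
            for sa in range(1, size - 1):
                for (va, ta), (vb, tb) in product(by_size[sa], by_size[size - 1 - sa]):
                    fresh.append((tuple(p + q for p, q in zip(va, vb)),
                                  f"({ta} + {tb})"))
            # ITE(a <= b, c, d): two nodes for ITE and <=
            for sa, sb, sc in product(range(1, size), repeat=3):
                sd = size - 2 - sa - sb - sc
                if sd < 1:
                    continue
                for (va, ta), (vb, tb), (vc, tc), (vd, td) in product(
                        by_size[sa], by_size[sb], by_size[sc], by_size[sd]):
                    sig = tuple(c if a <= b else d
                                for a, b, c, d in zip(va, vb, vc, vd))
                    fresh.append((sig, f"ITE({ta} <= {tb}, {tc}, {td})"))
        by_size[size] = []
        for sig, text in fresh:
            if sig in seen:
                continue                  # equivalent on the examples: prune
            seen.add(sig)
            by_size[size].append((sig, text))
            if meets_spec(examples, sig):
                return text
    return None

four_examples = [(0, 1), (1, 0), (0, 0), (1, 1)]
found = enumerate_consistent(four_examples)
found_with_two = enumerate_consistent([(0, 1), (1, 0)])
```

With only two examples the constant 1 is already consistent, which is why the verification oracle in a CEGIS loop has to supply further counterexamples.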
Symbolic Learning
Use a constraint solver for both the synthesis and verification steps
Each production in the grammar is thought of as a component. Input and output ports of every component are typed.
A well-typed loop-free program comprising these components corresponds to an expression DAG from the grammar.
Components (inputs → output):
• ITE: (Cond, Term, Term) → Term
• >=: (Term, Term) → Cond
• +: (Term, Term) → Term
• x, y, 0, 1: Term (no inputs)
Symbolic Learning
Start with a library consisting of some number of occurrences of each component, for example:
n1: x, n2: x, n3: y, n4: y, n5: 0, n6: 1, n7: +, n8: +, n9: >=, n10: ITE
Synthesis constraints:
• Shape is a DAG, types are consistent
• Spec φ[f/e] is satisfied on every concrete input value in Examples
Use an SMT solver (Z3) to find a satisfying solution.
If synthesis fails, try increasing the number of occurrences of components in the library in an outer loop.
Symbolic Learning - example
Iteration 1: candidate x. Learned counter-example: <x = -1, y = 0>
Iteration 2: candidate is an ITE expression (diagram nodes: ITE, ≥, y, x, x, x). Learned counter-example: <x = 0, y = -1>
Iteration 3: candidate is an ITE expression (diagram nodes: ITE, ≥, x, y, x, y). Learned counter-example: none
Stochastic Learning
Idea: Use the Metropolis-Hastings method to find the desired expression e by a probabilistic walk on a graph where nodes are expressions and edges capture single edits.
In simple words:
• Let En be the set of expressions of size n (n picked randomly).
• For every expression e in En, set Score(e) between 0 and 1 (“extent to which e meets the spec φ”).
• Score(e) = exp(-0.5 Wrong(e)), where Wrong(e) = number of examples in I for which ¬φ[f/e] holds.
• Score(e) is large when Wrong(e) is small. Expressions e with Wrong(e) = 0 are more likely to be chosen in the limit than any other expression.
• The initial candidate expression e is sampled uniformly from En.
• When Score(e) = 1, return e.
• Otherwise, pick a node v in the parse tree of e uniformly at random, and replace the subtree rooted at v with a subtree of the same size, sampled uniformly.
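Stochastic Learning
[Diagram: an expression tree e whose root + has the operand z and a subtree + over y and x; the mutated tree e’ replaces that subtree with a subtree of the same size, - over 1 and z.]
With probability min{ 1, Score(e’)/Score(e) }, replace e with e’.
Repeat until finding e’ such that Score(e’) = 1.
An outer loop is responsible for updating the expression size n.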
Stochastic Learning - example
Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)
Set E: All expressions built from x, y, 0, 1, Comparison, +, If-Then-Else
Suppose n = 6; there are 768 expressions of size 6.
e = ITE(x ≤ 0, y, x) is picked with probability 1/768.
The condition x ≤ 0 is mutated to y ≤ 0, giving e’ = ITE(y ≤ 0, y, x), with probability 1/6 × 1/48.
Suppose the set of concrete examples is {(-1,-4), (-1,3), (-1,2), (1,1), (1,2)}.
Then Score(e) = exp(-0.5 × 2) and Score(e’) = exp(-0.5 × 4).
So e is replaced with e’ with probability Score(e’)/Score(e) = exp(-0.5 × 2).
Note that for e’’ = ITE(x ≤ y, y, x) we have Score(e’’) = 1.
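The numbers in this example can be checked with a short script. It is a sketch of the scoring and acceptance rule only, not of the full random walk; the function names (wrong, score, acceptance) are hypothetical, and the three expressions are written directly as Python lambdas rather than parse trees.

```python
import math

EXAMPLES = [(-1, -4), (-1, 3), (-1, 2), (1, 1), (1, 2)]

def spec(r, x, y):
    """(x <= r) & (y <= r) & (r = x | r = y), with r = f(x, y)."""
    return x <= r and y <= r and (r == x or r == y)

def wrong(f, examples=EXAMPLES):
    """Number of examples on which f violates the spec."""
    return sum(1 for x, y in examples if not spec(f(x, y), x, y))

def score(f, examples=EXAMPLES):
    return math.exp(-0.5 * wrong(f, examples))

def acceptance(current, proposal):
    """Metropolis-Hastings acceptance probability min{1, Score(e')/Score(e)}."""
    return min(1.0, score(proposal) / score(current))

e = lambda x, y: y if x <= 0 else x          # ITE(x <= 0, y, x)
e_prime = lambda x, y: y if y <= 0 else x    # ITE(y <= 0, y, x)
e_second = lambda x, y: y if x <= y else x   # ITE(x <= y, y, x)

p_accept = acceptance(e, e_prime)            # equals exp(-0.5 * 2)
```

Here e is wrong on (-1,-4) and (1,2), while e’ is wrong on every example except (1,1), so a move from e to e’ is accepted with probability exp(-1), and a move in the opposite direction is always accepted.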
Benchmarks and Implementation
Prototype implementation of Enumerative/Symbolic/Stochastic CEGIS
Benchmarks:
• Bit-manipulation programs from Hacker’s Delight
• Integer arithmetic: find max, search in sorted array
• Challenge problems such as computing Morton’s number
Multiple variants of each benchmark by varying grammar
Results are not conclusive as implementations are unoptimized, but they offer a first opportunity to compare solution strategies
Evaluation: Integer Benchmarks
[Figure: Relative Performance of Integer Benchmarks. Compares Enumerative, Stochastic (median) and Symbolic on array_search_2.sl, array_search_3.sl, array_search_4.sl, array_search_5.sl, max2.sl and max3.sl. Vertical axis: approximate time in sec., logarithmic scale from 0.01 to 1000.]
Evaluation 3: Hacker’s Delight Benchmarks
[Figure: Relative Performance on a Sample of Hacker's Delight Benchmarks. Compares Enumerative, Stochastic (median) and Symbolic on hd-01-d0-prog.sl, hd-01-d5-prog.sl, hd-02-d0-prog.sl, hd-03-d0-prog.sl, hd-03-d1-prog.sl, hd-03-d5-prog.sl, hd-05-d1-prog.sl, hd-06-d0-prog.sl, hd-07-d1-prog.sl, hd-09-d1-prog.sl, hd-10-d1-prog.sl, hd-11-d0-prog.sl, hd-11-d1-prog.sl, hd-11-d5-prog.sl, hd-13-d0-prog.sl, hd-13-d5-prog.sl, hd-14-d0-prog.sl, hd-14-d1-prog.sl, hd-14-d5-prog.sl, hd-15-d0-prog.sl, hd-15-d1-prog.sl, hd-15-d5-prog.sl, hd-17-d0-prog.sl, hd-17-d1-prog.sl, hd-17-d5-prog.sl, hd-18-d1-prog.sl, hd-18-d5-prog.sl, hd-19-d1-prog.sl, hd-20-d0-prog.sl and hd-20-d5-prog.sl. Vertical axis: approximate time in sec., logarithmic scale from 0.01 to 1000.]
Evaluation Summary
• Enumerative CEGIS has best performance, and solves many benchmarks within seconds. Potential problem: synthesis of complex constants
• Symbolic CEGIS is unable to find answers on most benchmarks. Caveat: Sketch succeeds on many of these
• Choice of grammar has impact on synthesis time. When E is set of all possible expressions, solvers struggle
• None of the solvers succeed on some benchmarks: Morton constants, search in integer arrays of size > 4
Bottom line: Improving solvers is a great opportunity for research!
Plan for SyGuS-Comp
Proposed competition of SyGuS solvers at FLoC, July 2014
Organizers: Alur, Fisman (Penn) and Singh, Solar-Lezama (MIT)
Website: excape.cis.upenn.edu/Synth-Comp.html
Mailing list: [email protected]
Call for participation:
• Join discussion to finalize synth-lib format and competition format
• Contribute benchmarks
• Build a SyGuS solver