Syntax-Guided Synthesis

Syntax-Guided Synthesis Rajeev Alur Joint work with R.Bodik, G.Juniwal, M.Martin, M.Raghothaman, S.Seshia, R.Singh, A.Solar-Lezama, E.Torlak, A.Udupa 1


Transcript of Syntax-Guided Synthesis

Page 1: Syntax-Guided Synthesis

Syntax-Guided Synthesis

Rajeev Alur

Joint work with R.Bodik, G.Juniwal, M.Martin, M.Raghothaman, S.Seshia, R.Singh, A.Solar-Lezama, E.Torlak, A.Udupa


Page 2: Syntax-Guided Synthesis

Program Verification

Does a program P meet its specification φ, where φ is written as a logical formula?

Motivation: Correctness of systems, finding bugs

Program verification is hard!

Formalizing a structured program into logical formulas

Using tools (SMT solvers) to verify whether the formalized program meets its specification.

SMT-LIB: a common standard and a library of benchmarks for SMT solvers.

Page 3: Syntax-Guided Synthesis

Program Synthesis

Automatically synthesize a program P that satisfies a given specification φ

Can potentially have greater impact than program verification

Program synthesis is hard!

Let’s provide a syntactic template for the program – Syntax-Guided Synthesis (SyGuS)

Work on special cases already exists (e.g. Sketch 2008)

Let’s build a common standard and benchmarks for SyGuS solvers (SYNTH-LIB)


Page 4: Syntax-Guided Synthesis

Talk Outline

Background: SMT Solvers

Formalization of SyGuS

Solution Strategies

Conclusions + SyGuS Competition


Page 5: Syntax-Guided Synthesis

What is SMT?

Satisfiability Modulo Theories

+

Magnus Madsen

Page 6: Syntax-Guided Synthesis

Recall SAT

The Boolean SATisfiability Problem:

• A=TRUE, =FALSE, =FALSE

literal or negated literal


Page 7: Syntax-Guided Synthesis

Recall SAT

• SAT is NP-complete (all known algorithms take exponential time in the worst case)

• Many SAT solvers exist

– DPLL (1962)

– Chaff (2001)

– MiniSAT (2004)

• Some do remarkably well.

Page 8: Syntax-Guided Synthesis

What is an SMT instance?

A logical formula built using

– negation, conjunction and disjunction

– theory-specific operators, e.g. from the theory of integers or the theory of bitwise operators

Page 9: Syntax-Guided Synthesis

Q: Why not encode every formula in SAT?

A: Theory solvers have very efficient algorithms

Graph Problems:

• Shortest-Path

• Minimum Spanning Tree

Optimization:

• Max-Flow

• Linear Programming

(just to name a few)

Page 10: Syntax-Guided Synthesis

Q: But then, why not get rid of the SAT solver?

A: SAT solvers have been studied for a long time

Page 11: Syntax-Guided Synthesis

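SMT Solver = SAT + Theory

Formula: x ≥ 3 ∧ (x ≤ 0 ∨ y ≥ 0)

Boolean abstraction given to the SAT solver: a ∧ (b ∨ c)

SAT answers YES with a ∧ b; the theory solver checks x ≥ 3 ∧ x ≤ 0 and answers NO, so a clause is added to rule this assignment out

SAT answers YES with a ∧ c; the theory solver checks x ≥ 3 ∧ y ≥ 0 and answers YES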
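The interplay between the SAT solver and the theory solver on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ATOMS`, `skeleton`, `theory_consistent` and `lazy_smt` are made up here, the "SAT solver" is plain enumeration of truth assignments, the "theory solver" only handles integer bounds, and the blocking clause simply excludes the whole refuted assignment.

```python
from itertools import product

# Hypothetical encoding: Boolean atom -> (variable, operator, constant).
ATOMS = {"a": ("x", ">=", 3), "b": ("x", "<=", 0), "c": ("y", ">=", 0)}

def skeleton(m):
    """Boolean abstraction a AND (b OR c) of the formula."""
    return m["a"] and (m["b"] or m["c"])

def theory_consistent(m):
    """Check a conjunction of integer bound literals by tracking intervals."""
    lo, hi = {}, {}
    for name, value in m.items():
        var, op, k = ATOMS[name]
        if (op == ">=") == value:           # literal acts as a lower bound
            bound = k if value else k + 1   # not(x <= k) means x >= k + 1
            lo[var] = max(lo.get(var, bound), bound)
        else:                               # literal acts as an upper bound
            bound = k if value else k - 1   # not(x >= k) means x <= k - 1
            hi[var] = min(hi.get(var, bound), bound)
    return all(lo[v] <= hi[v] for v in lo if v in hi)

def candidates():
    """Enumerate all truth assignments to the atoms (stand-in for a SAT solver)."""
    for bits in product([True, False], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))

def lazy_smt():
    blocked = []                            # assignments refuted by the theory
    while True:
        model = next((m for m in candidates()
                      if skeleton(m) and m not in blocked), None)
        if model is None:
            return None                     # unsatisfiable
        if theory_consistent(model):
            return model                    # satisfiable modulo the theory
        blocked.append(model)               # "add clause" and ask SAT again

print(lazy_smt())
```

On this formula the loop rejects the assignments that make both x ≥ 3 and x ≤ 0 true, then accepts one in which x ≥ 3 and y ≥ 0 hold. Real solvers add much smaller blocking clauses than the full assignment used here.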

Page 12: Syntax-Guided Synthesis

Theories

Theory of:

– Difference Arithmetic

– Linear Arithmetic

– Arrays

– Bit Vectors

– Algebraic Datatypes

– Uninterpreted Functions

Page 13: Syntax-Guided Synthesis

C. Barrett & S. A. Seshia ICCAD 2009 Tutorial

Equivalence Checking of Program Fragments

int fun1(int y) {
    int x, z;
    z = y;
    y = x;
    x = z;
    return x*x;
}

int fun2(int y) {
    return y*y;
}

What if we use SAT to check equivalence?

SMT formula φ: satisfiable iff the programs are non-equivalent

(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)

Page 14: Syntax-Guided Synthesis


Using SAT to check equivalence (w/ Minisat):

32 bits for y: Did not finish in over 5 hours

16 bits for y: 37 sec.

8 bits for y: 0.5 sec.

Page 15: Syntax-Guided Synthesis


SMT formula φ’

(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = sq(x1)) ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)

Using EUF solver: 0.01 sec
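Why the abstracted formula is so easy can be illustrated with a tiny congruence-closure check. This is a sketch, not how a production EUF solver is written: terms are plain strings, applications of the uninterpreted function are tuples such as `("sq", "x1")`, and the helper names `find` and `union` are introduced here for illustration. If the abstracted formula is unsatisfiable, the original one is too, so the programs are equivalent.

```python
# Union-find over terms; parent maps each term to its representative chain.
parent = {}

def find(t):
    parent.setdefault(t, t)
    while parent[t] != t:
        parent[t] = parent[parent[t]]   # path halving
        t = parent[t]
    return t

def union(a, b):
    parent[find(a)] = find(b)

# Equalities of the abstracted formula; sq is treated as uninterpreted.
equalities = [("z", "y"), ("y1", "x"), ("x1", "z"),
              ("ret1", ("sq", "x1")), ("ret2", ("sq", "y"))]
disequality = ("ret1", "ret2")

for a, b in equalities:
    union(a, b)

# Congruence rule: if s == t then sq(s) == sq(t). Repeat until nothing changes.
apps = [t for pair in equalities for t in pair if isinstance(t, tuple)]
changed = True
while changed:
    changed = False
    for s in apps:
        for t in apps:
            if find(s) != find(t) and find(s[1]) == find(t[1]):
                union(s, t)
                changed = True

# The formula is unsatisfiable iff the disequality relates two equal terms.
programs_equivalent = find(disequality[0]) == find(disequality[1])
print(programs_equivalent)
```

The check never looks at individual bits of y: it only uses x1 = z = y and the fact that a function maps equal arguments to equal results.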

Page 16: Syntax-Guided Synthesis

Verification vs. Synthesis

Verification:

Program Verification: Does P meet spec φ?

SMT: Is φ satisfiable?

SMT-LIB/SMT-COMP: Standard API, solver competition

Synthesis:

Program Synthesis: Find P that meets spec φ

Syntax-Guided Synthesis

Plan for SyGuS-comp

Page 17: Syntax-Guided Synthesis

Talk Outline

Formalization of SyGuS

Solution Strategies

Conclusions + SyGuS Competition


Page 18: Syntax-Guided Synthesis

Syntax-Guided Synthesis (SyGuS) Problem

Fix a background theory T: fixes types and operations

Function to be synthesized: name f along with its type

Inputs to SyGuS problem:

– Specification φ (semantic constraint): typed formula using symbols in T + symbol f

– Set E of expressions given by a context-free grammar (syntactic constraint): set of candidate expressions that use symbols in T

Computational problem: Output e in E such that φ[f/e] is valid (in theory T)

Page 19: Syntax-Guided Synthesis

SyGuS Example

Theory QF-LIA

Types: Integers and Booleans

Logical connectives, Conditionals, and Linear arithmetic

Quantifier-free formulas

Function to be synthesized: f (int x, int y) : int

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Candidate Implementations: Linear expressions

LinExp := x | y | Const | LinExp + LinExp | LinExp - LinExp

No solution exists: the specification forces f(x,y) = max(x,y), which is not a linear expression
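A short derivation makes the impossibility concrete. Every expression generated by LinExp denotes a function of the form a·x + b·y + c with integer constants; the coefficient names a, b, c are introduced here only for illustration. Evaluating the specification at four inputs gives a contradiction:

```latex
\begin{align*}
f(x,y) &= a x + b y + c \\
f(0,0) = 0 &\;\Rightarrow\; c = 0 \\
f(1,0) = 1 &\;\Rightarrow\; a = 1 \\
f(0,1) = 1 &\;\Rightarrow\; b = 1 \\
f(1,1) = 1 &\;\text{ but }\; a + b + c = 2
\end{align*}
```

Each left-hand value is forced by the specification: f must be at least both arguments and equal to one of them.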

Page 20: Syntax-Guided Synthesis

SyGuS Example

Theory QF-LIA

Function to be synthesized: f (int x, int y) : int

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Candidate Implementations: Conditional expressions with comparisons

Term := x | y | Const | If-Then-Else (Cond, Term, Term)

Cond := Term <= Term | Cond & Cond | ~ Cond | (Cond)

Possible solution: If-Then-Else (x ≤ y, y, x)

Solving SyGuS is hard!
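As a sanity check, the possible solution can be tested against the specification on a finite range of inputs. This is a minimal sketch: the function names and the search bound of 20 are arbitrary choices made here, and a bounded search is only evidence, not the validity proof an SMT solver would give.

```python
from itertools import product

def spec(x, y, fxy):
    """Specification from the slide, with fxy standing for f(x, y)."""
    return x <= fxy and y <= fxy and (fxy == x or fxy == y)

def candidate(x, y):
    """Possible solution from the slide: If-Then-Else(x <= y, y, x)."""
    return y if x <= y else x

def counterexample(f, bound=20):
    """Return the first input in [-bound, bound]^2 violating the spec, or None."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if not spec(x, y, f(x, y)):
            return (x, y)
    return None

print(counterexample(candidate))             # no violation found in the range
print(counterexample(lambda x, y: x + y))    # a linear candidate is refuted
```

The second call shows the other direction: a linear candidate such as x + y is refuted by a concrete input.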

Page 21: Syntax-Guided Synthesis

Talk Outline

Solution Strategies

Conclusions + SyGuS Competition


Page 22: Syntax-Guided Synthesis

Solving SyGuS as Active Learning: Counter-Example Guided Inductive Synthesis

[Diagram: starting from initial examples I, the Learning Algorithm either fails or sends a Candidate Expression to the Verification Oracle; the oracle either reports Success or returns a Counterexample to the Learning Algorithm]

Concept class: Set E of expressions

Examples: Concrete input values

Page 23: Syntax-Guided Synthesis

CEGIS Example

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Set E: All expressions built from x,y,0,1, Comparison, +, If-Then-Else

Learning Algorithm: Examples = { }, Candidate f(x,y) = x

Verification Oracle: Counterexample (x=0, y=1)

Page 24: Syntax-Guided Synthesis

CEGIS Example

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Set E: All expressions built from x,y,0,1, Comparison, +, If-Then-Else

Learning Algorithm: Examples = {(x=0, y=1)}, Candidate f(x,y) = y

Verification Oracle: Counterexample (x=1, y=0)

Page 25: Syntax-Guided Synthesis

CEGIS Example

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Set E: All expressions built from x,y,0,1, Comparison, +, If-Then-Else

Learning Algorithm: Examples = {(x=0, y=1), (x=1, y=0), (x=0, y=0), (x=1, y=1)}, Candidate ITE (x ≤ y, y, x)

Verification Oracle: Success
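The loop traced on these slides can be sketched as follows. Everything here is a simplification chosen for illustration: the learner walks a fixed candidate list (the four leaves, then every ITE(a <= b, c, d) over leaves) instead of the full set E, and the verification oracle is a bounded search rather than an SMT query. All function names are hypothetical.

```python
from itertools import product

def spec(x, y, out):
    return x <= out and y <= out and (out == x or out == y)

LEAVES = {"x": lambda x, y: x, "y": lambda x, y: y,
          "0": lambda x, y: 0, "1": lambda x, y: 1}

def candidates():
    """Leaves first, then ITE(a <= b, c, d) over leaves: a small slice of E."""
    yield from LEAVES.items()
    for a, b, c, d in product(LEAVES, repeat=4):
        fa, fb, fc, fd = (LEAVES[n] for n in (a, b, c, d))
        yield (f"ITE({a} <= {b}, {c}, {d})",
               lambda x, y, fa=fa, fb=fb, fc=fc, fd=fd:
                   fc(x, y) if fa(x, y) <= fb(x, y) else fd(x, y))

def learn(examples):
    """Learning algorithm: first candidate meeting the spec on all examples."""
    for name, f in candidates():
        if all(spec(x, y, f(x, y)) for x, y in examples):
            return name, f
    return None

def verify(f, bound=10):
    """Verification oracle (bounded stand-in): a counterexample or None."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if not spec(x, y, f(x, y)):
            return (x, y)
    return None

def cegis():
    examples = []
    while True:
        learned = learn(examples)
        if learned is None:
            return None, examples        # Fail: no candidate fits the examples
        name, f = learned
        cex = verify(f)
        if cex is None:
            return name, examples        # Success
        examples.append(cex)             # refine with the counterexample

print(cegis())
```

With this oracle the loop proposes x, then y, then ITE(x <= y, y, x), mirroring the slides; the concrete counterexamples differ because they depend on how the oracle searches.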

Page 26: Syntax-Guided Synthesis

SyGuS Solutions

CEGIS approach (Solar-Lezama, Seshia et al)

Related work: Similar strategies for solving quantified formulas and invariant generation

Coming up: Learning strategies based on:

Enumerative (search with pruning): Udupa et al (PLDI’13)

Symbolic (solving constraints): Gulwani et al (PLDI’11)

Stochastic (probabilistic walk): Schkufza et al (ASPLOS’13)

Page 27: Syntax-Guided Synthesis

Enumerative Learning

Find an expression consistent with a given set of concrete examples

Enumerate expressions in increasing size, and evaluate each expression on all concrete inputs to check consistency

Key optimization for efficient pruning of search space:

Expressions e1 and e2 are equivalent (with respect to the examples) if e1(a,b)=e2(a,b) on all concrete values (x=a,y=b) in Examples

Only one representative among equivalent subexpressions needs to be considered for building larger expressions

Fast and robust for learning expressions with ~ 15 nodes
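The pruning idea can be sketched with a bottom-up enumerator that keeps one expression per vector of outputs on the examples. The grammar used here (leaves x, y, 0, 1, sums, and ITE with a single <= comparison), the size measure (one per node) and all names are assumptions made for this illustration; the prototype discussed in the talk may organise the search differently.

```python
from itertools import product

def spec(x, y, out):
    return x <= out and y <= out and (out == x or out == y)

def splits(total, parts):
    """All tuples of `parts` positive sizes that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in splits(total - first, parts - 1):
            yield (first,) + rest

def enumerate_consistent(examples, max_size=8):
    """Return the first (smallest) expression meeting the spec on the examples."""
    n = len(examples)
    leaves = [("x", tuple(x for x, _ in examples)),
              ("y", tuple(y for _, y in examples)),
              ("0", (0,) * n), ("1", (1,) * n)]
    seen, by_size = set(), {}            # seen: output vectors already covered
    for size in range(1, max_size + 1):
        new = list(leaves) if size == 1 else []
        for s1, s2 in splits(size - 1, 2):            # a + b
            for (ta, va), (tb, vb) in product(by_size[s1], by_size[s2]):
                new.append((f"({ta} + {tb})",
                            tuple(a + b for a, b in zip(va, vb))))
        for sizes in splits(size - 2, 4):             # ITE(a <= b, c, d)
            pools = [by_size[s] for s in sizes]
            for (ta, va), (tb, vb), (tc, vc), (td, vd) in product(*pools):
                vals = tuple(c if a <= b else d
                             for a, b, c, d in zip(va, vb, vc, vd))
                new.append((f"ITE({ta} <= {tb}, {tc}, {td})", vals))
        by_size[size] = []
        for text, vals in new:
            if vals in seen:
                continue                 # equivalent on the examples: prune
            seen.add(vals)
            if all(spec(x, y, v) for (x, y), v in zip(examples, vals)):
                return text
            by_size[size].append((text, vals))
    return None

print(enumerate_consistent([(0, 1), (1, 0), (0, 0), (1, 1)]))
```

On the four examples from the CEGIS trace, expressions such as x + 0 or ITE(x <= x, y, x) are discarded because they behave like an already stored expression, so only a handful of representatives are carried to the next size.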

Page 28: Syntax-Guided Synthesis

Symbolic Learning

Use a constraint solver for both the synthesis and verification steps

Each production in the grammar is thought of as a component. Input and Output ports of every component are typed.

A well-typed loop-free program comprising these components corresponds to an expression DAG from the grammar.

Components (input ports → output port):

ITE: Cond, Term, Term → Term

>=: Term, Term → Cond

+: Term, Term → Term

x, y, 0, 1: → Term

Page 29: Syntax-Guided Synthesis

Symbolic Learning

Start with a library consisting of some number of occurrences of each component.

Library: x (n1), x (n2), y (n3), y (n4), 0 (n5), 1 (n6), + (n7), + (n8), >= (n9), ITE (n10)

Synthesis Constraints:

Shape is a DAG, Types are consistent

Spec φ[f/e] is satisfied on every concrete input value in Examples

Use an SMT solver (Z3) to find a satisfying solution.

If synthesis fails, try increasing the number of occurrences of components in the library in an outer loop

Page 30: Syntax-Guided Synthesis

Symbolic Learning - example

Iteration 1:

Candidate: x

Learned counter-example: <x= -1, y=0>

Page 31: Syntax-Guided Synthesis

Symbolic Learning - example

Iteration 2:

Candidate: [expression tree built from ITE, ≥ and the leaves x, y, x, x]

Learned counter-example: <x= 0, y=-1>

Page 32: Syntax-Guided Synthesis

Symbolic Learning - example

Iteration 3:

Candidate: [expression tree built from ITE, ≥ and the leaves x, y, x, y]

Learned counter-example: none

Page 33: Syntax-Guided Synthesis

Stochastic Learning

Idea: Use the Metropolis-Hastings method to find the desired expression e by a probabilistic walk on a graph where nodes are expressions and edges capture single edits.

In simple words:

Let En be the set of expressions of size n (n picked randomly).

For every expression e in En, set Score(e) between 0 and 1 (“Extent to which e meets the spec φ”)

Score(e) = exp( - 0.5 Wrong(e)), where Wrong(e) = number of examples in I for which φ[f/e] fails

Score(e) is large when Wrong(e) is small. Expressions e with Wrong(e) = 0 are more likely to be chosen in the limit than any other expression

Page 34: Syntax-Guided Synthesis

Stochastic Learning

Initial candidate expression e sampled uniformly from En

When Score(e) = 1, return e

Pick node v in parse tree of e uniformly at random. Replace subtree rooted at v with subtree of same size, sampled uniformly, to obtain e’

[Figure: e is + over leaf z and subtree + (y, x); e’ is + over leaf z and subtree - (1, z)]

With probability min{ 1, Score(e’)/Score(e) }, replace e with e’

Repeat until finding e’ such that Score(e’) = 1.

Outer loop responsible for updating expression size n
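The walk described above can be sketched on a deliberately small search space. Assumptions made here, none of which come from the talk: candidates have the fixed shape ITE(a <= b, c, d) with leaves drawn from x, y, 0, 1; a mutation resamples one leaf uniformly (a special case of same-size subtree replacement); the examples are the five inputs used in the stochastic learning example; and the step budget is arbitrary.

```python
import math
import random

EXAMPLES = [(-1, -4), (-1, 3), (-1, 2), (1, 1), (1, 2)]
LEAVES = ["x", "y", "0", "1"]

def leaf_value(name, x, y):
    return {"x": x, "y": y, "0": 0, "1": 1}[name]

def evaluate(e, x, y):
    """e = (a, b, c, d) encodes ITE(a <= b, c, d) over leaf names."""
    a, b, c, d = (leaf_value(n, x, y) for n in e)
    return c if a <= b else d

def wrong(e):
    """Number of examples on which the max specification fails for e."""
    count = 0
    for x, y in EXAMPLES:
        out = evaluate(e, x, y)
        if not (x <= out and y <= out and (out == x or out == y)):
            count += 1
    return count

def score(e):
    return math.exp(-0.5 * wrong(e))

def metropolis_hastings(rng, max_steps=100_000):
    e = tuple(rng.choice(LEAVES) for _ in range(4))   # uniform initial candidate
    for _ in range(max_steps):
        if wrong(e) == 0:
            return e                                  # Score(e) = 1
        i = rng.randrange(4)                          # pick a leaf position
        proposal = e[:i] + (rng.choice(LEAVES),) + e[i + 1:]
        if rng.random() < min(1.0, score(proposal) / score(e)):
            e = proposal                              # accept the mutation
    return None                                       # budget exhausted

print(metropolis_hastings(random.Random(0)))
```

Because a worse proposal is still accepted with positive probability, the walk can leave a poor candidate that a purely greedy search would get stuck on.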

Page 35: Syntax-Guided Synthesis

Stochastic Learning - example

Specification: (x ≤ f(x,y)) & (y ≤ f(x,y)) & (f(x,y) = x | f(x,y) = y)

Set E: All expressions built from x,y,0,1, Comparison, +, If-Then-Else

Suppose n = 6; there are 768 expressions of size 6

e = ITE(x ≤ 0,y,x) is picked with probability 1/768

The condition x ≤ 0 is mutated to y ≤ 0, giving e’ = ITE(y ≤ 0,y,x), with probability 1/6 × 1/48

Suppose the set of concrete examples is: {(-1,-4),(-1, 3),(-1, 2),(1,1),(1,2)}

Then Score(e)=exp(-0.5×2), and Score(e’)=exp(-0.5×4)

So e is replaced with e’ with probability Score(e’)/Score(e) = exp(-0.5×2)

Note that for e’’ = ITE(x ≤ y,y,x) we have Score(e’’)=1.
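The scores in this example can be checked by hand. Evaluating both candidates on the five examples and comparing against the specification (which requires the maximum of x and y) gives:

```latex
\begin{align*}
\mathrm{ITE}(x \le 0,\, y,\, x) &\text{ fails on } (-1,-4),\ (1,2)
  &&\Rightarrow\ \mathrm{Wrong}(e) = 2 \\
\mathrm{ITE}(y \le 0,\, y,\, x) &\text{ fails on } (-1,-4),\ (-1,3),\ (-1,2),\ (1,2)
  &&\Rightarrow\ \mathrm{Wrong}(e') = 4 \\
\frac{\mathrm{Score}(e')}{\mathrm{Score}(e)}
  &= \frac{\exp(-0.5 \cdot 4)}{\exp(-0.5 \cdot 2)} = \exp(-1) \approx 0.37
\end{align*}
```

So the move to the worse candidate is accepted a little more than one time in three.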

Page 36: Syntax-Guided Synthesis

Benchmarks and Implementation

Prototype implementation of Enumerative/Symbolic/Stochastic CEGIS

Benchmarks:

Bit-manipulation programs from Hacker’s Delight

Integer arithmetic: Find max, search in sorted array

Challenge problems such as computing Morton’s number

Multiple variants of each benchmark by varying grammar

Results are not conclusive as implementations are unoptimized, but they offer a first opportunity to compare solution strategies

Page 37: Syntax-Guided Synthesis

Evaluation: Integer Benchmarks

[Chart: Relative Performance of Integer Benchmarks. Approximate time in sec. (log scale, 0.01 to 1000) for Enumerative, Stochastic (median) and Symbolic on array_search_2.sl, array_search_3.sl, array_search_4.sl, array_search_5.sl, max2.sl and max3.sl]

Page 38: Syntax-Guided Synthesis

Evaluation 3: Hacker’s Delight Benchmarks

[Chart: Relative Performance on a Sample of Hacker's Delight Benchmarks. Approximate time in sec. (log scale, 0.01 to 1000) for Enumerative, Stochastic (median) and Symbolic on hd-01-d0-prog.sl, hd-01-d5-prog.sl, hd-02-d0-prog.sl, hd-03-d0-prog.sl, hd-03-d1-prog.sl, hd-03-d5-prog.sl, hd-05-d1-prog.sl, hd-06-d0-prog.sl, hd-07-d1-prog.sl, hd-09-d1-prog.sl, hd-10-d1-prog.sl, hd-11-d0-prog.sl, hd-11-d1-prog.sl, hd-11-d5-prog.sl, hd-13-d0-prog.sl, hd-13-d5-prog.sl, hd-14-d0-prog.sl, hd-14-d1-prog.sl, hd-14-d5-prog.sl, hd-15-d0-prog.sl, hd-15-d1-prog.sl, hd-15-d5-prog.sl, hd-17-d0-prog.sl, hd-17-d1-prog.sl, hd-17-d5-prog.sl, hd-18-d1-prog.sl, hd-18-d5-prog.sl, hd-19-d1-prog.sl, hd-20-d0-prog.sl and hd-20-d5-prog.sl]

Page 39: Syntax-Guided Synthesis

Evaluation Summary

Enumerative CEGIS has best performance, and solves many benchmarks within seconds

Potential problem: Synthesis of complex constants

Symbolic CEGIS is unable to find answers on most benchmarks

Caveat: Sketch succeeds on many of these

Choice of grammar has impact on synthesis time

When E is set of all possible expressions, solvers struggle

None of the solvers succeed on some benchmarks

Morton constants, Search in integer arrays of size > 4

Bottom line: Improving solvers is a great opportunity for research!

Page 40: Syntax-Guided Synthesis

Plan for SyGuS-Comp

Proposed competition of SyGuS solvers at FLoC, July 2014

Organizers: Alur, Fisman (Penn) and Singh, Solar-Lezama (MIT)

Website: excape.cis.upenn.edu/Synth-Comp.html

Mailing list: [email protected]

Call for participation:

Join discussion to finalize synth-lib format and competition format

Contribute benchmarks

Build a SyGuS solver