Research Article A Unified Framework for DPLL( T...

14
Hindawi Publishing Corporation Journal of Applied Mathematics Volume 2013, Article ID 964682, 13 pages http://dx.doi.org/10.1155/2013/964682 Research Article A Unified Framework for DPLL(T) + Certificates Min Zhou, 1,2,3,4 Fei He, 1,2,3 Bow-Yaw Wang, 5 Ming Gu, 1,2,3 and Jiaguang Sun 1,2,3 1 Tsinghua National Laboratory for Information Science and Technology (TNList), Beijing 100084, China 2 School of Soſtware, Tsinghua University, Beijing 100084, China 3 Key Laboratory for Information System Security, MOE, Beijing 100084, China 4 Department of Computer Science and Technologies, Tsinghua University, Beijing 100084, China 5 Institute of Information Science, Academia Sinica, Taipei 115, Taiwan Correspondence should be addressed to Min Zhou; [email protected] Received 6 February 2013; Accepted 8 April 2013 Academic Editor: Xiaoyu Song Copyright © 2013 Min Zhou et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Satisfiability Modulo eories (SMT) techniques are widely used nowadays. SMT solvers are typically used as verification backends. When an SMT solver is invoked, it is quite important to ensure the correctness of its results. To address this problem, we propose a unified certificate framework based on DPLL(T), including a uniform certificate format, a unified certificate generation procedure, and a unified certificate checking procedure. e certificate format is shown to be simple, clean, and extensible to different background theories. e certificate generation procedure is well adapted to most DPLL(T)-based SMT solvers. e soundness and completeness for DPLL(T) + certificates were established. e certificate checking procedure is straightforward and efficient. Experimental results show that the overhead for certificates generation is only 10%, which outperforms other methods, and the certificate checking procedure is quite time saving. 1. Introduction 1.1. Background and Motivation. Satisfiability Modulo e- ories (SMT) techniques are getting increasingly popular in many verification applications. An SMT solver takes as input an arbitrary formula given in a specific fragment of first order logic with interpreted symbols. It returns a judgment telling whether the formula is satisfiable. By satisfiable, it means that there exists at least one interpretation under which the formula evaluates to true. SMT solvers are typically used as verification backends. During a verification process, SMT solver may be invoked many times. Each time the SMT solver is given a complex logical formula and is expected to give a correct judg- ment. However, SMT solvers employ many sophisticated data structures and tricky algorithms, which make their implementations prone to error. For instance, there are some solvers disqualified in SMT-COMP due to their incorrect judgments [1]. We therefore strongly believe that SMT solvers should be supported with a formal assessment of their correctness. Ensuring the correctness of an SMT solver is never easy. Generally, there are two approaches: to verify the SMT solver or to certify their judgments. For the former approach, it proves directly that the solver is correct. So, its judgements are naturally correct. ough rigor in logic, this method is not generally practical because verifying an SMT solver requires great workload. Not only the algorithm but also the implementation should be verified in a formal way. To the best of our knowledge, no state-of-the-art SMT solver has been completely verified. More importantly, this approach is solver dependent. Even if a solver is completely verified, the whole verification has to be done again when the solver is modified. e rationale of the latter approach is as follows: instead of proving the correctness of the solver, we prove its judg- ment. For this purpose, the solver is required to return a piece of evidence to support its judgment. is evidence is called certificate in this paper. en, the correctness for the judgment is ensured by the certificate. If so, a rigorous and detailed proof can be constructed by referring to the certifi- cate. Obviously, checking a certificate is much easier than

Transcript of Research Article A Unified Framework for DPLL( T...

Hindawi Publishing CorporationJournal of Applied MathematicsVolume 2013 Article ID 964682 13 pageshttpdxdoiorg1011552013964682

Research ArticleA Unified Framework for DPLL(T) + Certificates

Min Zhou1234 Fei He123 Bow-Yaw Wang5 Ming Gu123 and Jiaguang Sun123

1 Tsinghua National Laboratory for Information Science and Technology (TNList) Beijing 100084 China2 School of Software Tsinghua University Beijing 100084 China3 Key Laboratory for Information System Security MOE Beijing 100084 China4Department of Computer Science and Technologies Tsinghua University Beijing 100084 China5 Institute of Information Science Academia Sinica Taipei 115 Taiwan

Correspondence should be addressed to Min Zhou zhoumin03gmailcom

Received 6 February 2013 Accepted 8 April 2013

Academic Editor Xiaoyu Song

Copyright copy 2013 Min Zhou et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

SatisfiabilityModuloTheories (SMT) techniques are widely used nowadays SMT solvers are typically used as verification backendsWhen an SMT solver is invoked it is quite important to ensure the correctness of its results To address this problem we propose aunified certificate framework based on DPLL(T) including a uniform certificate format a unified certificate generation procedureand a unified certificate checking procedure The certificate format is shown to be simple clean and extensible to differentbackground theories The certificate generation procedure is well adapted to most DPLL(T)-based SMT solvers The soundnessand completeness for DPLL(T) + certificates were established The certificate checking procedure is straightforward and efficientExperimental results show that the overhead for certificates generation is only 10 which outperforms other methods and thecertificate checking procedure is quite time saving

1 Introduction

11 Background and Motivation Satisfiability Modulo The-ories (SMT) techniques are getting increasingly popular inmany verification applications An SMT solver takes as inputan arbitrary formula given in a specific fragment of first orderlogic with interpreted symbols It returns a judgment tellingwhether the formula is satisfiable By satisfiable it meansthat there exists at least one interpretation under which theformula evaluates to true

SMT solvers are typically used as verification backendsDuring a verification process SMT solver may be invokedmany times Each time the SMT solver is given a complexlogical formula and is expected to give a correct judg-ment However SMT solvers employ many sophisticateddata structures and tricky algorithms which make theirimplementations prone to error For instance there are somesolvers disqualified in SMT-COMP due to their incorrectjudgments [1]We therefore strongly believe that SMT solversshould be supported with a formal assessment of theircorrectness

Ensuring the correctness of an SMT solver is never easyGenerally there are two approaches to verify the SMT solveror to certify their judgments For the former approach itproves directly that the solver is correct So its judgementsare naturally correct Though rigor in logic this methodis not generally practical because verifying an SMT solverrequires great workload Not only the algorithm but also theimplementation should be verified in a formal way To thebest of our knowledge no state-of-the-art SMT solver hasbeen completely verified More importantly this approach issolver dependent Even if a solver is completely verified thewhole verification has to be done again when the solver ismodified

The rationale of the latter approach is as follows insteadof proving the correctness of the solver we prove its judg-ment For this purpose the solver is required to return apiece of evidence to support its judgment This evidence iscalled certificate in this paper Then the correctness for thejudgment is ensured by the certificate If so a rigorous anddetailed proof can be constructed by referring to the certifi-cate Obviously checking a certificate is much easier than

2 Journal of Applied Mathematics

SMT solverbased

Certificate generation

Certificate CNFCertificate items

Certificate checker

Lemmachecker

Lemmachecker middot middot middot Lemma

checker

DPLL(T)

Figure 1 The road map of our approach

verifying a solver If the certificates are presented in a properformat its checking algorithm could have low complexity Forinstance linear time is required to check certain certificatesfor unsatisfiable SAT problems The certificate approach issolver independent We need not to look into the solvers theonly requirement for solvers is to generate certificates If thecertificate format is unified the certificate checking can be auniform procedure This benefit is vital since we need not todevelop a certificate checker for each SMT solver

There are some SMT solvers that support certificategeneration such as CVC3 [2] and Z3 [3] However their cer-tificate formats are quite different As a result the generationand checking procedures are tool specific Although SMTsolvers differ in the algorithms we believe that their formatsof certificates could be unified The basic idea comes fromSAT In an SAT community people consider to generate cer-tificates along with the DPLL framework and the certificates(for unsatisfiability instances) are unified as chains of linearregular resolutions which can lead from the initial clauses toan empty clause [4 5] With this uniform format the gen-eration and checking procedures for certificates can also beunified

This paper considers the DPLL(T) framework which isextended from DPLL and has been widely used in state-of-the-art SMT solvers A uniform format of SMT certificate isdefined in this paperThe roadmap of our approach is shownin Figure 1 Note that the size of certificate can be exponentialor even largerwith respect to the input problem [6] To reducestorage space the certificate format is defined carefully ina concise way Upon this format a unified procedure forcertificates generation is proposed This procedure can beeasily integrated intoDPLL(T)-based SMT solversMoreovera certificate checking procedure is also proposed in this paperThe checking procedure is easy fast and memory saving Tomake it extensible theory lemmas are checked by individuallemma checkers Experimental results show that the averageoverhead for the certificates generation is about 10

12 Related Work Propositional satisfiability problem (SAT)is a well-known decision problem It was proven to be NP-complete by Cook [7] in 1971 All known algorithms for SAThave exponential complexity State-of-the-art SAT solvers arebased on DPLL [8] algorithm The basic idea is to exploreall valuations by depth-first search in a branch and boundmanner To reduce unnecessary search efforts an optimi-zation called clause learning is used which learns a clausewhenever it recovers from a conflictThis optimization signif-icantly increases the performance andmakes SAT algorithms

practical During the clause learning process each learnedclause is a resolvent of linear regular resolution from existingclauses [4]

SMT is an extension of SAT As the name implies theinput formula may contain theory-specific predicates andfunctions For instance while only boolean variables areallowed in SAT 119909 ge 119910 and 119910 ge 119891(119886) + 119909 or 119909 lt 0 is a well-formedSMT formula The background theory T of SMT is usuallya composed theory from several individual theories In thispaper we focus on quantifier-free SMT problems for whichthe popular algorithm is DPLL(T) [9] The DPLL(T) algo-rithm is extended from DPLL Since the DPLL(T) algorithmis described as a transition system solving an SMT problemis actually seeking a path to the success or fail state whilerespecting the guard conditions on transitions Althoughthere are other heuristics and optimizations currently it isalmost standard to develop SMT solvers based on DPLL(T)

Certificates are informally a set of evidence that can beused to construct a rigorous proof for the judgment Forsatisfiable cases the certificate could be a model What ismore interesting is the certificates for unsatisfiable casesDeduction steps from the input problem to a conflict shouldbe reproducible by referring to the certificate In this paperwhen saying certificates we refer to those for unsatisfiablecases

Many of the state-of-the-art SAT solvers can generatecertificates but their logical formats are very distinct fromeach other For example the certificate generated by zChaff[10] and MiniSat [11] are quite different PicoSAT [12] gen-erates concise and well-defined certificates A tool calledTraceCheck can do certificate checking [12] In PicoSATrsquos cer-tificates there are initial clauses and resolvents Resolvents areobtained from linear regular resolution If all the clauses forma directional acyclic graph and the empty clause is derivedthen it implies that the initial clause set is unsatisfiable Asshown in [4 5] clause learning in DPLL corresponds to alinear regular resolution So this certificate format is alsoadoptable by other DPLL-based solvers

For general SMT problems quantifiers may be used Fur-thermore theory-specific inference rules should be consid-ered when deciding their satisfiability Various logical frame-works are developed to provide flexible languages to writetheir proof such reference could be found in [13ndash16] Ifwe focus on quantifier-free problems the proof format andproof checking could be simplified In practice solvers suchas CVC3 and Z3 can generate certificates but their formatsare also quite different because they use different proofrules Proof rules are a set of inference rules with respect to

Journal of Applied Mathematics 3

the background theory T To be better suited for certificategeneration and checking the certificate format should notdepend heavily on theory-specific proof rules

The overhead for generating certificates should be assmall as possible An approach with 30 overhead is pre-sented in [17] which uses the Edinburgh Logical Frameworkwith Side Condition (LFSC) In our approach we focus onquantifier-free problems so that the overhead is further re-duced Certificate generation is related to unsatisfiable corecomputation In [18] a simple and flexible way of computingsmall unsatisfiable core in SMT was proposedThey computeby the propositional abstraction of SMT problem and invokean existing unsatisfiable core extractor for SAT Althoughthe idea might be similar unsatisfiable core computationconcerns which clauses lead to unsatisfiability whereas cer-tificate generation concerns how It is convenient to generateunsatisfiable core from certificates but not vice versa

To check certificates people can either check the certifi-cate directly just as in the SAT case or translate them intoinputs for some theorem provers such as HOL Light [19] orIsabelleHOL [20] There is also research in checking certif-icates using simple term rewriting [21] A known fact is thatif certificates are checked by translating them to proof itemsfor theorem provers it could take much longer time thancertificate finding [20]

13 Contribution In thispaper wepresent a certificate frame-work for quantifier-free formulas Compared to other worksour approach has the following advantages

(i) The certificate format is simple clean and extensibleto different underlying theories

(ii) The certificate generation procedure is well adaptedto most DPLL(T)-based SMT solvers We also imple-ment it in our solver

(iii) On average the certificate generation induces about10 overhead which is much less than other ap-proaches

(iv) The certificate checking procedure is simplified It hasbetter performance

The rest of this paper is organized as follows In Section 2we recall the original DPLL(T) algorithm The certificateformat is defined in Section 3 The framework of DPLL(T) +certificate is described in Section 4 where the formal rulesare defined in Section 41 necessary implementation issuesare discussed in Section 42 the properties are discussedin Section 43 and a couple of examples are shown inSection 44 Section 5 discusses the certificate checking mat-ters Experimentsrsquo results are shown in Section 6 FinallySection 7 concludes the paper

2 The DPLL(T) Algorithm

The DPLL(T) algorithm was proposed in [9] The input isa quantifier-free first order formula in conjunctive normalform (CNF) from the background theory T DPLL(T) wasgiven as a transition system Most features and optimizationsof SMT algorithms can be formalized in that way

In first order logic any CNF formula can be viewed as aset of clauses Each clause is disjunction of literals and eachliteral is a positive or negative form of an atom A formulais satisfiable if there exists an interpretation under which theformula evaluates to true

In DPLL(T) formula satisfiability is considered in somebackground theory T (all interpreted symbols should befrom T) We use ldquo⊨Trdquo to denote theory entailment in TGiven sets of formulas Σ and Γ if any interpretation thatsatisfies all formulas in Σ also satisfies all formulas Γ we writeΣ⊨TΓ Similarly ldquo⊨rdquo denotes propositional entailment whichconsider each atomic formula syntactically as a propositionalliteral

Each state is a 2-tuple ldquo119872 119865rdquo where 119872 is a stack ofthe currently asserted literals and 119865 the set of clauses to besatisfied The transition relations are given by the followingrules

(i) Decide

Precondition

(1) 119897 or not119897 occurs in a clause of 119865

(2) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897d

119865 (1)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862

(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 997891997888rarr 119872 119897 119865 119862 or 119897 (2)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897

(2) 119897 or not119897 occurs in 119865

(3) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897 119865 (3)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there is some clause 1198621015840

or 1198971015840 such that

(a) 119865 119862⊨T 1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840(b) 1198971015840 is undefined in 119872

(c) 1198971015840 or not119897

1015840 occurs in 119865 or in 119872 119897d119873

119872 119897d119873 119865 119862 997891997888rarr 119872 119897

1015840 119865 119862 (4)

4 Journal of Applied Mathematics

(v) T-Learn

Precondition(1) each atom of 119862 occurs in 119865 or in 119872(2) 119865⊨T119862

119872 119865 997891997888rarr 119872 119865 119862 (5)

(vi) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 997891997888rarr 119872 119865 (6)

(vii) Restart

Precondition

T

119872 119865 997891997888rarr 0 119865 (7)

(viii) Fail

Precondition(1) 119872 ⊨ not119862(2) 119872 contains no decision literals

119872 119865 119862 997891997888rarr ⟨Fail⟩ (8)

In DPLL(T) algorithm there is an assumption that thereexists a T-solver that can check the consistency of conjunc-tions of literals given in T This algorithm will terminateunder weak assumption and is proven sound and complete[9]

3 The Certificate Format

If a formula is satisfiable the certificate is simply an inter-pretation under which the formula evaluates to true On theother hand if it is unsatisfiable the certificate could be muchmore complicated It should be proven that ldquoall attempts tofind amodel failedrdquoMore precisely each branch of the searchtreemust be tried Technically speaking this closed search treecan be presented as a refutation procedure which containssequences of resolution procedures that produce the emptyclause [12] It is imaginable that if the search tree is quite largeso is the set of refutation clauses

Remember that in first order logic (FOL) constants areconsidered as nullary functions Then a term is either avariable or a function with arguments which are also termsFor example 119909 119891(119909 119910) and 120601(119904 120595(119905)) are valid terms Anatom (or atomic formula) is a predicate with arguments beingterms Note that nullary predicates are actually propositionalvariables

Definition 1 (T-atom T-literal) Given a background theoryT a T-atom is an atom whose functions and predicates are inthe signature of T A T-literal is the positive or negative formof a T-atom

For example given T the theory of Equality with Unin-terpreted Functions if 119901 is a propositional variable then 119901119909 = 119891(119910) are T-atoms and 119901 not119901 119909 = 119891(119910) and 119909 = 119891(119910) areT-literals

A proof rule is an inference rule which has several T-literals as premises and one T-literal as the conclusion Forproof rules whose conclusion is conjunction of 119899 T-literalswe can split it to 119899 proof rules and one for each) For example119909 = 119910 implies that 119891(119909) = 119891(119910) is a proof rule (themonotonicity of equality)This rule holds for all T-terms 119909 119910

and all function 119891 Each proof rule can be instantiated to atheory lemma For instance if 119904 119905 are two specific T-termsand 119892 is a specific function then (119904 = 119905) rarr (119892(119904) = 119892(119905)) is atheory lemma

Definition 2 (clause item) A clause item is one of the follow-ing three forms

(i) Init 1198971

or sdot sdot sdot or 119897119899 an initial clause 119897

1or sdot sdot sdot or 119897

119899 where all

119897119894are T-literals

(ii) Res 1198621 119862

119899 a clause obtained from a linear

resolution of1198621 119862

119899 where all119862

119894are clauses items

(iii) Lemma 1198971

and sdot sdot sdot and 119897119899

rarr 1198971015840 a theory lemma where all

119897119894and 1198971015840 are T-literals The defined clause is not119897

1or sdot sdot sdot or

not119897119899

or 1198971015840

A resolution chain 1198621 1198622 119862

119899is called a linear reg-

ular resolution chain if there exists a sequence of clauses1198631 1198632 119863

119899minus1such that (1) 119863

1is the resolvent of 119862

1and

1198622 (2) for 2 le 119894 lt 119899 119863

119894is the resolvent of 119863

119894minus1and 119862

119894+1 (3)

all resolutions are applied on different T-atomsA clause item gives the syntax form for a clause For

each clause item 119862 if it is well defined the concrete formof 119862 (given as disjunction of T-literals) can be calculatedIt is called a concrete clause denoted by ⟦119862⟧ We do notdistinguish 119862 and ⟦119862⟧ when no ambiguity is caused

Definition 3 (certificate item) A certificate item is either ofthe following

(i) Define ClauseItem

(ii) Forget ClauseItem

Definition 4 (certificate) A certificate 119875 with respect to anSMT instance is a sequence of certificate items The size of 119875denoted by |119875| is the number of certificate items containedin 119875 119875[119894] is the 119894th element in 119875 for 119894 = 1 2 |119875|

Definition 5 (context) Given a certificate119875 and a number 0 le

119894 le |119875| the context with respect to 119894 denoted by lfloor119875rfloor is a setof clause items such that

lfloor119875rfloor119894

=

0 if 119894 = 0

lfloor119875rfloor119894minus1

cup 119862 if 119894 gt 0 and 119875 [119894 minus 1] is Define119862

lfloor119875rfloor119894minus1

119862 if 119894 gt 0 and 119875 [119894 minus 1] is Forget119862

(9)

Specially denote lfloor119875rfloor|119875|

by lfloor119875rfloor

Journal of Applied Mathematics 5

Table 1 A certificate example

Clause item ⟦119862⟧ Lemma1198621Define Init not119897

1or not1198972

not1198971

or not1198972

1198622Define Init 119897

11198971

1198623Define Lemma 119897

1rarr 1198972

not1198971

or 1198972

The property of ldquo=rdquo1198624Define Res 119862

1 1198623 1198622

◻ The empty clause

Example 6 Assume that 1198971

119909 = 119910 1198972

119891(119909) = 119891(119910) and1198621

not1198971

or not1198972 1198622

1198971 A certificate for the unsatisfiability of

the clause set 1198621 1198622 is presented in Table 1 In this example

1198621 1198622 is intuitively unsatisfiable Note that 119862

2entails 119897

1

which implies 1198972in TEUF while 119862

1 1198622

⊨ not1198972

Definition 7 (well-formed certificate) A certificate 119875 is wellformed if for all 1 le 119894 le |119875|

(i) if 119875[119894] is a Define item then one of the followingconditions hold(a) it is of Init type(b) it is of Res 119862

1 119862

119899 type where 119862

119896isin lfloor119875rfloor

119894minus1

for all 1 le 119896 le 119899 and there should be a linearregular resolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧

to 119875[119894](c) it is of Lemma type and the concrete clause is

valid in theory T that is ⊨T⟦119875[119894]⟧(ii) otherwise 119875[119894] is a Forget item and the referred clause

must be defined in lfloor119875rfloor119894minus1

4 Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this sec-tion We first describe the abstract rule and implementationissues then prove some important properties and finally givea couple of examples

41 The CDPLL(T) Algorithm We describe here a certifi-cate generation procedure that can be well adapted to theDPLL(T) algorithm The extension of DPLL(T) with certifi-cate generation is called CDPLL(T)

Definition 8 (certificate refinement) The certificate obtainedby appending a certificate item 119862 to the end of the certificate119875 is denoted by 119875 ⊲ 119862

We use a transition system to model the CDPLL(T)algorithm Each state is represented as a 3-tuple ldquo119872 119865 oplus 119875rdquowhere119872 is the literal stack119865 the clause set to be satisfied and119875 the current certificateThe transition rules inCDPLL(T) arefollows

(i) Decide

Precondition(1) 119897 or not119897 occurs in a clause of 119865(2) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897d

119865 oplus 119875 (10)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897 119865 119862 or 119897 oplus 119875 (11)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897(2) 119897 or not119897 occurs in 119865(3) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897 119865 oplus 119875 (12)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there exists a clause 1198621015840or 1198971015840 such that (a) 119865 119862⊨T119862

1015840or 1198971015840

and 119872 ⊨ not1198621015840 (b) 119897

1015840 is undefined in 119872 and (c) 1198971015840 or

not1198971015840 occurs in 119865 or in 119872 119897

d119873

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

1015840or 1198971015840

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840 119865 119862 oplus (119875 ⊲ Res (119877)) (13)

(v) T-Learn(I) this rule models the clause learning proce-dure

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) 119865 ⊨ 119862

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Res (119877)) (14)

(vi) T-Learn(II) this rule models the learning of theorylemma

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) ⊨T119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Lemma (119862)) (15)

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

2 Journal of Applied Mathematics

SMT solverbased

Certificate generation

Certificate CNFCertificate items

Certificate checker

Lemmachecker

Lemmachecker middot middot middot Lemma

checker

DPLL(T)

Figure 1 The road map of our approach

verifying a solver If the certificates are presented in a properformat its checking algorithm could have low complexity Forinstance linear time is required to check certain certificatesfor unsatisfiable SAT problems The certificate approach issolver independent We need not to look into the solvers theonly requirement for solvers is to generate certificates If thecertificate format is unified the certificate checking can be auniform procedure This benefit is vital since we need not todevelop a certificate checker for each SMT solver

There are some SMT solvers that support certificategeneration such as CVC3 [2] and Z3 [3] However their cer-tificate formats are quite different As a result the generationand checking procedures are tool specific Although SMTsolvers differ in the algorithms we believe that their formatsof certificates could be unified The basic idea comes fromSAT In an SAT community people consider to generate cer-tificates along with the DPLL framework and the certificates(for unsatisfiability instances) are unified as chains of linearregular resolutions which can lead from the initial clauses toan empty clause [4 5] With this uniform format the gen-eration and checking procedures for certificates can also beunified

This paper considers the DPLL(T) framework which isextended from DPLL and has been widely used in state-of-the-art SMT solvers A uniform format of SMT certificate isdefined in this paperThe roadmap of our approach is shownin Figure 1 Note that the size of certificate can be exponentialor even largerwith respect to the input problem [6] To reducestorage space the certificate format is defined carefully ina concise way Upon this format a unified procedure forcertificates generation is proposed This procedure can beeasily integrated intoDPLL(T)-based SMT solversMoreovera certificate checking procedure is also proposed in this paperThe checking procedure is easy fast and memory saving Tomake it extensible theory lemmas are checked by individuallemma checkers Experimental results show that the averageoverhead for the certificates generation is about 10

12 Related Work Propositional satisfiability problem (SAT)is a well-known decision problem It was proven to be NP-complete by Cook [7] in 1971 All known algorithms for SAThave exponential complexity State-of-the-art SAT solvers arebased on DPLL [8] algorithm The basic idea is to exploreall valuations by depth-first search in a branch and boundmanner To reduce unnecessary search efforts an optimi-zation called clause learning is used which learns a clausewhenever it recovers from a conflictThis optimization signif-icantly increases the performance andmakes SAT algorithms

practical During the clause learning process each learnedclause is a resolvent of linear regular resolution from existingclauses [4]

SMT is an extension of SAT As the name implies theinput formula may contain theory-specific predicates andfunctions For instance while only boolean variables areallowed in SAT 119909 ge 119910 and 119910 ge 119891(119886) + 119909 or 119909 lt 0 is a well-formedSMT formula The background theory T of SMT is usuallya composed theory from several individual theories In thispaper we focus on quantifier-free SMT problems for whichthe popular algorithm is DPLL(T) [9] The DPLL(T) algo-rithm is extended from DPLL Since the DPLL(T) algorithmis described as a transition system solving an SMT problemis actually seeking a path to the success or fail state whilerespecting the guard conditions on transitions Althoughthere are other heuristics and optimizations currently it isalmost standard to develop SMT solvers based on DPLL(T)

Certificates are informally a set of evidence that can beused to construct a rigorous proof for the judgment Forsatisfiable cases the certificate could be a model What ismore interesting is the certificates for unsatisfiable casesDeduction steps from the input problem to a conflict shouldbe reproducible by referring to the certificate In this paperwhen saying certificates we refer to those for unsatisfiablecases

Many of the state-of-the-art SAT solvers can generatecertificates but their logical formats are very distinct fromeach other For example the certificate generated by zChaff[10] and MiniSat [11] are quite different PicoSAT [12] gen-erates concise and well-defined certificates A tool calledTraceCheck can do certificate checking [12] In PicoSATrsquos cer-tificates there are initial clauses and resolvents Resolvents areobtained from linear regular resolution If all the clauses forma directional acyclic graph and the empty clause is derivedthen it implies that the initial clause set is unsatisfiable Asshown in [4 5] clause learning in DPLL corresponds to alinear regular resolution So this certificate format is alsoadoptable by other DPLL-based solvers

For general SMT problems quantifiers may be used Fur-thermore theory-specific inference rules should be consid-ered when deciding their satisfiability Various logical frame-works are developed to provide flexible languages to writetheir proof such reference could be found in [13ndash16] Ifwe focus on quantifier-free problems the proof format andproof checking could be simplified In practice solvers suchas CVC3 and Z3 can generate certificates but their formatsare also quite different because they use different proofrules Proof rules are a set of inference rules with respect to

Journal of Applied Mathematics 3

the background theory T To be better suited for certificategeneration and checking the certificate format should notdepend heavily on theory-specific proof rules

The overhead for generating certificates should be assmall as possible An approach with 30 overhead is pre-sented in [17] which uses the Edinburgh Logical Frameworkwith Side Condition (LFSC) In our approach we focus onquantifier-free problems so that the overhead is further re-duced Certificate generation is related to unsatisfiable corecomputation In [18] a simple and flexible way of computingsmall unsatisfiable core in SMT was proposedThey computeby the propositional abstraction of SMT problem and invokean existing unsatisfiable core extractor for SAT Althoughthe idea might be similar unsatisfiable core computationconcerns which clauses lead to unsatisfiability whereas cer-tificate generation concerns how It is convenient to generateunsatisfiable core from certificates but not vice versa

To check certificates people can either check the certifi-cate directly just as in the SAT case or translate them intoinputs for some theorem provers such as HOL Light [19] orIsabelleHOL [20] There is also research in checking certif-icates using simple term rewriting [21] A known fact is thatif certificates are checked by translating them to proof itemsfor theorem provers it could take much longer time thancertificate finding [20]

13 Contribution In thispaper wepresent a certificate frame-work for quantifier-free formulas Compared to other worksour approach has the following advantages

(i) The certificate format is simple clean and extensibleto different underlying theories

(ii) The certificate generation procedure is well adaptedto most DPLL(T)-based SMT solvers We also imple-ment it in our solver

(iii) On average the certificate generation induces about10 overhead which is much less than other ap-proaches

(iv) The certificate checking procedure is simplified It hasbetter performance

The rest of this paper is organized as follows In Section 2we recall the original DPLL(T) algorithm The certificateformat is defined in Section 3 The framework of DPLL(T) +certificate is described in Section 4 where the formal rulesare defined in Section 41 necessary implementation issuesare discussed in Section 42 the properties are discussedin Section 43 and a couple of examples are shown inSection 44 Section 5 discusses the certificate checking mat-ters Experimentsrsquo results are shown in Section 6 FinallySection 7 concludes the paper

2 The DPLL(T) Algorithm

The DPLL(T) algorithm was proposed in [9] The input isa quantifier-free first order formula in conjunctive normalform (CNF) from the background theory T DPLL(T) wasgiven as a transition system Most features and optimizationsof SMT algorithms can be formalized in that way

In first order logic any CNF formula can be viewed as aset of clauses Each clause is disjunction of literals and eachliteral is a positive or negative form of an atom A formulais satisfiable if there exists an interpretation under which theformula evaluates to true

In DPLL(T) formula satisfiability is considered in somebackground theory T (all interpreted symbols should befrom T) We use ldquo⊨Trdquo to denote theory entailment in TGiven sets of formulas Σ and Γ if any interpretation thatsatisfies all formulas in Σ also satisfies all formulas Γ we writeΣ⊨TΓ Similarly ldquo⊨rdquo denotes propositional entailment whichconsider each atomic formula syntactically as a propositionalliteral

Each state is a 2-tuple ldquo119872 119865rdquo where 119872 is a stack ofthe currently asserted literals and 119865 the set of clauses to besatisfied The transition relations are given by the followingrules

(i) Decide

Precondition

(1) 119897 or not119897 occurs in a clause of 119865

(2) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897d

119865 (1)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862

(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 997891997888rarr 119872 119897 119865 119862 or 119897 (2)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897

(2) 119897 or not119897 occurs in 119865

(3) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897 119865 (3)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there is some clause 1198621015840

or 1198971015840 such that

(a) 119865 119862⊨T 1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840(b) 1198971015840 is undefined in 119872

(c) 1198971015840 or not119897

1015840 occurs in 119865 or in 119872 119897d119873

119872 119897d119873 119865 119862 997891997888rarr 119872 119897

1015840 119865 119862 (4)

4 Journal of Applied Mathematics

(v) T-Learn

Precondition(1) each atom of 119862 occurs in 119865 or in 119872(2) 119865⊨T119862

119872 119865 997891997888rarr 119872 119865 119862 (5)

(vi) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 997891997888rarr 119872 119865 (6)

(vii) Restart

Precondition

T

119872 119865 997891997888rarr 0 119865 (7)

(viii) Fail

Precondition(1) 119872 ⊨ not119862(2) 119872 contains no decision literals

119872 119865 119862 997891997888rarr ⟨Fail⟩ (8)

In DPLL(T) algorithm there is an assumption that thereexists a T-solver that can check the consistency of conjunc-tions of literals given in T This algorithm will terminateunder weak assumption and is proven sound and complete[9]

3 The Certificate Format

If a formula is satisfiable the certificate is simply an inter-pretation under which the formula evaluates to true On theother hand if it is unsatisfiable the certificate could be muchmore complicated It should be proven that ldquoall attempts tofind amodel failedrdquoMore precisely each branch of the searchtreemust be tried Technically speaking this closed search treecan be presented as a refutation procedure which containssequences of resolution procedures that produce the emptyclause [12] It is imaginable that if the search tree is quite largeso is the set of refutation clauses

Remember that in first order logic (FOL) constants areconsidered as nullary functions Then a term is either avariable or a function with arguments which are also termsFor example 119909 119891(119909 119910) and 120601(119904 120595(119905)) are valid terms Anatom (or atomic formula) is a predicate with arguments beingterms Note that nullary predicates are actually propositionalvariables

Definition 1 (T-atom T-literal) Given a background theoryT a T-atom is an atom whose functions and predicates are inthe signature of T A T-literal is the positive or negative formof a T-atom

For example given T the theory of Equality with Unin-terpreted Functions if 119901 is a propositional variable then 119901119909 = 119891(119910) are T-atoms and 119901 not119901 119909 = 119891(119910) and 119909 = 119891(119910) areT-literals

A proof rule is an inference rule which has several T-literals as premises and one T-literal as the conclusion Forproof rules whose conclusion is conjunction of 119899 T-literalswe can split it to 119899 proof rules and one for each) For example119909 = 119910 implies that 119891(119909) = 119891(119910) is a proof rule (themonotonicity of equality)This rule holds for all T-terms 119909 119910

and all function 119891 Each proof rule can be instantiated to atheory lemma For instance if 119904 119905 are two specific T-termsand 119892 is a specific function then (119904 = 119905) rarr (119892(119904) = 119892(119905)) is atheory lemma

Definition 2 (clause item) A clause item is one of the follow-ing three forms

(i) Init 1198971

or sdot sdot sdot or 119897119899 an initial clause 119897

1or sdot sdot sdot or 119897

119899 where all

119897119894are T-literals

(ii) Res 1198621 119862

119899 a clause obtained from a linear

resolution of1198621 119862

119899 where all119862

119894are clauses items

(iii) Lemma 1198971

and sdot sdot sdot and 119897119899

rarr 1198971015840 a theory lemma where all

119897119894and 1198971015840 are T-literals The defined clause is not119897

1or sdot sdot sdot or

not119897119899

or 1198971015840

A resolution chain 1198621 1198622 119862

119899is called a linear reg-

ular resolution chain if there exists a sequence of clauses1198631 1198632 119863

119899minus1such that (1) 119863

1is the resolvent of 119862

1and

1198622 (2) for 2 le 119894 lt 119899 119863

119894is the resolvent of 119863

119894minus1and 119862

119894+1 (3)

all resolutions are applied on different T-atomsA clause item gives the syntax form for a clause For

each clause item 119862 if it is well defined the concrete formof 119862 (given as disjunction of T-literals) can be calculatedIt is called a concrete clause denoted by ⟦119862⟧ We do notdistinguish 119862 and ⟦119862⟧ when no ambiguity is caused

Definition 3 (certificate item) A certificate item is either ofthe following

(i) Define ClauseItem

(ii) Forget ClauseItem

Definition 4 (certificate) A certificate 119875 with respect to anSMT instance is a sequence of certificate items The size of 119875denoted by |119875| is the number of certificate items containedin 119875 119875[119894] is the 119894th element in 119875 for 119894 = 1 2 |119875|

Definition 5 (context) Given a certificate119875 and a number 0 le

119894 le |119875| the context with respect to 119894 denoted by lfloor119875rfloor is a setof clause items such that

lfloor119875rfloor119894

=

0 if 119894 = 0

lfloor119875rfloor119894minus1

cup 119862 if 119894 gt 0 and 119875 [119894 minus 1] is Define119862

lfloor119875rfloor119894minus1

119862 if 119894 gt 0 and 119875 [119894 minus 1] is Forget119862

(9)

Specially denote lfloor119875rfloor|119875|

by lfloor119875rfloor

Journal of Applied Mathematics 5

Table 1 A certificate example

Clause item ⟦119862⟧ Lemma1198621Define Init not119897

1or not1198972

not1198971

or not1198972

1198622Define Init 119897

11198971

1198623Define Lemma 119897

1rarr 1198972

not1198971

or 1198972

The property of ldquo=rdquo1198624Define Res 119862

1 1198623 1198622

◻ The empty clause

Example 6 Assume that 1198971

119909 = 119910 1198972

119891(119909) = 119891(119910) and1198621

not1198971

or not1198972 1198622

1198971 A certificate for the unsatisfiability of

the clause set 1198621 1198622 is presented in Table 1 In this example

1198621 1198622 is intuitively unsatisfiable Note that 119862

2entails 119897

1

which implies 1198972in TEUF while 119862

1 1198622

⊨ not1198972

Definition 7 (well-formed certificate) A certificate 119875 is wellformed if for all 1 le 119894 le |119875|

(i) if 119875[119894] is a Define item then one of the followingconditions hold(a) it is of Init type(b) it is of Res 119862

1 119862

119899 type where 119862

119896isin lfloor119875rfloor

119894minus1

for all 1 le 119896 le 119899 and there should be a linearregular resolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧

to 119875[119894](c) it is of Lemma type and the concrete clause is

valid in theory T that is ⊨T⟦119875[119894]⟧(ii) otherwise 119875[119894] is a Forget item and the referred clause

must be defined in lfloor119875rfloor119894minus1

4 Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this sec-tion We first describe the abstract rule and implementationissues then prove some important properties and finally givea couple of examples

41 The CDPLL(T) Algorithm We describe here a certifi-cate generation procedure that can be well adapted to theDPLL(T) algorithm The extension of DPLL(T) with certifi-cate generation is called CDPLL(T)

Definition 8 (certificate refinement) The certificate obtainedby appending a certificate item 119862 to the end of the certificate119875 is denoted by 119875 ⊲ 119862

We use a transition system to model the CDPLL(T)algorithm Each state is represented as a 3-tuple ldquo119872 119865 oplus 119875rdquowhere119872 is the literal stack119865 the clause set to be satisfied and119875 the current certificateThe transition rules inCDPLL(T) arefollows

(i) Decide

Precondition(1) 119897 or not119897 occurs in a clause of 119865(2) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897d

119865 oplus 119875 (10)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897 119865 119862 or 119897 oplus 119875 (11)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897(2) 119897 or not119897 occurs in 119865(3) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897 119865 oplus 119875 (12)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there exists a clause 1198621015840or 1198971015840 such that (a) 119865 119862⊨T119862

1015840or 1198971015840

and 119872 ⊨ not1198621015840 (b) 119897

1015840 is undefined in 119872 and (c) 1198971015840 or

not1198971015840 occurs in 119865 or in 119872 119897

d119873

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

1015840or 1198971015840

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840 119865 119862 oplus (119875 ⊲ Res (119877)) (13)

(v) T-Learn(I) this rule models the clause learning proce-dure

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) 119865 ⊨ 119862

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Res (119877)) (14)

(vi) T-Learn(II) this rule models the learning of theorylemma

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) ⊨T119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Lemma (119862)) (15)

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 3

the background theory T To be better suited for certificategeneration and checking the certificate format should notdepend heavily on theory-specific proof rules

The overhead for generating certificates should be assmall as possible An approach with 30 overhead is pre-sented in [17] which uses the Edinburgh Logical Frameworkwith Side Condition (LFSC) In our approach we focus onquantifier-free problems so that the overhead is further re-duced Certificate generation is related to unsatisfiable corecomputation In [18] a simple and flexible way of computingsmall unsatisfiable core in SMT was proposedThey computeby the propositional abstraction of SMT problem and invokean existing unsatisfiable core extractor for SAT Althoughthe idea might be similar unsatisfiable core computationconcerns which clauses lead to unsatisfiability whereas cer-tificate generation concerns how It is convenient to generateunsatisfiable core from certificates but not vice versa

To check certificates people can either check the certifi-cate directly just as in the SAT case or translate them intoinputs for some theorem provers such as HOL Light [19] orIsabelleHOL [20] There is also research in checking certif-icates using simple term rewriting [21] A known fact is thatif certificates are checked by translating them to proof itemsfor theorem provers it could take much longer time thancertificate finding [20]

13 Contribution In thispaper wepresent a certificate frame-work for quantifier-free formulas Compared to other worksour approach has the following advantages

(i) The certificate format is simple clean and extensibleto different underlying theories

(ii) The certificate generation procedure is well adaptedto most DPLL(T)-based SMT solvers We also imple-ment it in our solver

(iii) On average the certificate generation induces about10 overhead which is much less than other ap-proaches

(iv) The certificate checking procedure is simplified It hasbetter performance

The rest of this paper is organized as follows In Section 2we recall the original DPLL(T) algorithm The certificateformat is defined in Section 3 The framework of DPLL(T) +certificate is described in Section 4 where the formal rulesare defined in Section 41 necessary implementation issuesare discussed in Section 42 the properties are discussedin Section 43 and a couple of examples are shown inSection 44 Section 5 discusses the certificate checking mat-ters Experimentsrsquo results are shown in Section 6 FinallySection 7 concludes the paper

2 The DPLL(T) Algorithm

The DPLL(T) algorithm was proposed in [9] The input isa quantifier-free first order formula in conjunctive normalform (CNF) from the background theory T DPLL(T) wasgiven as a transition system Most features and optimizationsof SMT algorithms can be formalized in that way

In first order logic any CNF formula can be viewed as aset of clauses Each clause is disjunction of literals and eachliteral is a positive or negative form of an atom A formulais satisfiable if there exists an interpretation under which theformula evaluates to true

In DPLL(T) formula satisfiability is considered in somebackground theory T (all interpreted symbols should befrom T) We use ldquo⊨Trdquo to denote theory entailment in TGiven sets of formulas Σ and Γ if any interpretation thatsatisfies all formulas in Σ also satisfies all formulas Γ we writeΣ⊨TΓ Similarly ldquo⊨rdquo denotes propositional entailment whichconsider each atomic formula syntactically as a propositionalliteral

Each state is a 2-tuple ldquo119872 119865rdquo where 119872 is a stack ofthe currently asserted literals and 119865 the set of clauses to besatisfied The transition relations are given by the followingrules

(i) Decide

Precondition

(1) 119897 or not119897 occurs in a clause of 119865

(2) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897d

119865 (1)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862

(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 997891997888rarr 119872 119897 119865 119862 or 119897 (2)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897

(2) 119897 or not119897 occurs in 119865

(3) 119897 is undefined in 119872

119872 119865 997891997888rarr 119872 119897 119865 (3)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there is some clause 1198621015840

or 1198971015840 such that

(a) 119865 119862⊨T 1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840(b) 1198971015840 is undefined in 119872

(c) 1198971015840 or not119897

1015840 occurs in 119865 or in 119872 119897d119873

119872 119897d119873 119865 119862 997891997888rarr 119872 119897

1015840 119865 119862 (4)

4 Journal of Applied Mathematics

(v) T-Learn

Precondition(1) each atom of 119862 occurs in 119865 or in 119872(2) 119865⊨T119862

119872 119865 997891997888rarr 119872 119865 119862 (5)

(vi) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 997891997888rarr 119872 119865 (6)

(vii) Restart

Precondition

T

119872 119865 997891997888rarr 0 119865 (7)

(viii) Fail

Precondition(1) 119872 ⊨ not119862(2) 119872 contains no decision literals

119872 119865 119862 997891997888rarr ⟨Fail⟩ (8)

In DPLL(T) algorithm there is an assumption that thereexists a T-solver that can check the consistency of conjunc-tions of literals given in T This algorithm will terminateunder weak assumption and is proven sound and complete[9]

3 The Certificate Format

If a formula is satisfiable the certificate is simply an inter-pretation under which the formula evaluates to true On theother hand if it is unsatisfiable the certificate could be muchmore complicated It should be proven that ldquoall attempts tofind amodel failedrdquoMore precisely each branch of the searchtreemust be tried Technically speaking this closed search treecan be presented as a refutation procedure which containssequences of resolution procedures that produce the emptyclause [12] It is imaginable that if the search tree is quite largeso is the set of refutation clauses

Remember that in first order logic (FOL) constants areconsidered as nullary functions Then a term is either avariable or a function with arguments which are also termsFor example 119909 119891(119909 119910) and 120601(119904 120595(119905)) are valid terms Anatom (or atomic formula) is a predicate with arguments beingterms Note that nullary predicates are actually propositionalvariables

Definition 1 (T-atom T-literal) Given a background theoryT a T-atom is an atom whose functions and predicates are inthe signature of T A T-literal is the positive or negative formof a T-atom

For example given T the theory of Equality with Unin-terpreted Functions if 119901 is a propositional variable then 119901119909 = 119891(119910) are T-atoms and 119901 not119901 119909 = 119891(119910) and 119909 = 119891(119910) areT-literals

A proof rule is an inference rule which has several T-literals as premises and one T-literal as the conclusion Forproof rules whose conclusion is conjunction of 119899 T-literalswe can split it to 119899 proof rules and one for each) For example119909 = 119910 implies that 119891(119909) = 119891(119910) is a proof rule (themonotonicity of equality)This rule holds for all T-terms 119909 119910

and all function 119891 Each proof rule can be instantiated to atheory lemma For instance if 119904 119905 are two specific T-termsand 119892 is a specific function then (119904 = 119905) rarr (119892(119904) = 119892(119905)) is atheory lemma

Definition 2 (clause item) A clause item is one of the follow-ing three forms

(i) Init 1198971

or sdot sdot sdot or 119897119899 an initial clause 119897

1or sdot sdot sdot or 119897

119899 where all

119897119894are T-literals

(ii) Res 1198621 119862

119899 a clause obtained from a linear

resolution of1198621 119862

119899 where all119862

119894are clauses items

(iii) Lemma 1198971

and sdot sdot sdot and 119897119899

rarr 1198971015840 a theory lemma where all

119897119894and 1198971015840 are T-literals The defined clause is not119897

1or sdot sdot sdot or

not119897119899

or 1198971015840

A resolution chain 1198621 1198622 119862

119899is called a linear reg-

ular resolution chain if there exists a sequence of clauses1198631 1198632 119863

119899minus1such that (1) 119863

1is the resolvent of 119862

1and

1198622 (2) for 2 le 119894 lt 119899 119863

119894is the resolvent of 119863

119894minus1and 119862

119894+1 (3)

all resolutions are applied on different T-atomsA clause item gives the syntax form for a clause For

each clause item 119862 if it is well defined the concrete formof 119862 (given as disjunction of T-literals) can be calculatedIt is called a concrete clause denoted by ⟦119862⟧ We do notdistinguish 119862 and ⟦119862⟧ when no ambiguity is caused

Definition 3 (certificate item) A certificate item is either ofthe following

(i) Define ClauseItem

(ii) Forget ClauseItem

Definition 4 (certificate) A certificate 119875 with respect to anSMT instance is a sequence of certificate items The size of 119875denoted by |119875| is the number of certificate items containedin 119875 119875[119894] is the 119894th element in 119875 for 119894 = 1 2 |119875|

Definition 5 (context) Given a certificate119875 and a number 0 le

119894 le |119875| the context with respect to 119894 denoted by lfloor119875rfloor is a setof clause items such that

lfloor119875rfloor119894

=

0 if 119894 = 0

lfloor119875rfloor119894minus1

cup 119862 if 119894 gt 0 and 119875 [119894 minus 1] is Define119862

lfloor119875rfloor119894minus1

119862 if 119894 gt 0 and 119875 [119894 minus 1] is Forget119862

(9)

Specially denote lfloor119875rfloor|119875|

by lfloor119875rfloor

Journal of Applied Mathematics 5

Table 1 A certificate example

Clause item ⟦119862⟧ Lemma1198621Define Init not119897

1or not1198972

not1198971

or not1198972

1198622Define Init 119897

11198971

1198623Define Lemma 119897

1rarr 1198972

not1198971

or 1198972

The property of ldquo=rdquo1198624Define Res 119862

1 1198623 1198622

◻ The empty clause

Example 6 Assume that 1198971

119909 = 119910 1198972

119891(119909) = 119891(119910) and1198621

not1198971

or not1198972 1198622

1198971 A certificate for the unsatisfiability of

the clause set 1198621 1198622 is presented in Table 1 In this example

1198621 1198622 is intuitively unsatisfiable Note that 119862

2entails 119897

1

which implies 1198972in TEUF while 119862

1 1198622

⊨ not1198972

Definition 7 (well-formed certificate) A certificate 119875 is wellformed if for all 1 le 119894 le |119875|

(i) if 119875[119894] is a Define item then one of the followingconditions hold(a) it is of Init type(b) it is of Res 119862

1 119862

119899 type where 119862

119896isin lfloor119875rfloor

119894minus1

for all 1 le 119896 le 119899 and there should be a linearregular resolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧

to 119875[119894](c) it is of Lemma type and the concrete clause is

valid in theory T that is ⊨T⟦119875[119894]⟧(ii) otherwise 119875[119894] is a Forget item and the referred clause

must be defined in lfloor119875rfloor119894minus1

4 Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this sec-tion We first describe the abstract rule and implementationissues then prove some important properties and finally givea couple of examples

41 The CDPLL(T) Algorithm We describe here a certifi-cate generation procedure that can be well adapted to theDPLL(T) algorithm The extension of DPLL(T) with certifi-cate generation is called CDPLL(T)

Definition 8 (certificate refinement) The certificate obtainedby appending a certificate item 119862 to the end of the certificate119875 is denoted by 119875 ⊲ 119862

We use a transition system to model the CDPLL(T)algorithm Each state is represented as a 3-tuple ldquo119872 119865 oplus 119875rdquowhere119872 is the literal stack119865 the clause set to be satisfied and119875 the current certificateThe transition rules inCDPLL(T) arefollows

(i) Decide

Precondition(1) 119897 or not119897 occurs in a clause of 119865(2) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897d

119865 oplus 119875 (10)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897 119865 119862 or 119897 oplus 119875 (11)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897(2) 119897 or not119897 occurs in 119865(3) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897 119865 oplus 119875 (12)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there exists a clause 1198621015840or 1198971015840 such that (a) 119865 119862⊨T119862

1015840or 1198971015840

and 119872 ⊨ not1198621015840 (b) 119897

1015840 is undefined in 119872 and (c) 1198971015840 or

not1198971015840 occurs in 119865 or in 119872 119897

d119873

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

1015840or 1198971015840

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840 119865 119862 oplus (119875 ⊲ Res (119877)) (13)

(v) T-Learn(I) this rule models the clause learning proce-dure

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) 119865 ⊨ 119862

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Res (119877)) (14)

(vi) T-Learn(II) this rule models the learning of theorylemma

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) ⊨T119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Lemma (119862)) (15)

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

4 Journal of Applied Mathematics

(v) T-Learn

Precondition(1) each atom of 119862 occurs in 119865 or in 119872(2) 119865⊨T119862

119872 119865 997891997888rarr 119872 119865 119862 (5)

(vi) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 997891997888rarr 119872 119865 (6)

(vii) Restart

Precondition

T

119872 119865 997891997888rarr 0 119865 (7)

(viii) Fail

Precondition(1) 119872 ⊨ not119862(2) 119872 contains no decision literals

119872 119865 119862 997891997888rarr ⟨Fail⟩ (8)

In DPLL(T) algorithm there is an assumption that thereexists a T-solver that can check the consistency of conjunc-tions of literals given in T This algorithm will terminateunder weak assumption and is proven sound and complete[9]

3 The Certificate Format

If a formula is satisfiable the certificate is simply an inter-pretation under which the formula evaluates to true On theother hand if it is unsatisfiable the certificate could be muchmore complicated It should be proven that ldquoall attempts tofind amodel failedrdquoMore precisely each branch of the searchtreemust be tried Technically speaking this closed search treecan be presented as a refutation procedure which containssequences of resolution procedures that produce the emptyclause [12] It is imaginable that if the search tree is quite largeso is the set of refutation clauses

Remember that in first order logic (FOL) constants areconsidered as nullary functions Then a term is either avariable or a function with arguments which are also termsFor example 119909 119891(119909 119910) and 120601(119904 120595(119905)) are valid terms Anatom (or atomic formula) is a predicate with arguments beingterms Note that nullary predicates are actually propositionalvariables

Definition 1 (T-atom T-literal) Given a background theoryT a T-atom is an atom whose functions and predicates are inthe signature of T A T-literal is the positive or negative formof a T-atom

For example given T the theory of Equality with Unin-terpreted Functions if 119901 is a propositional variable then 119901119909 = 119891(119910) are T-atoms and 119901 not119901 119909 = 119891(119910) and 119909 = 119891(119910) areT-literals

A proof rule is an inference rule which has several T-literals as premises and one T-literal as the conclusion Forproof rules whose conclusion is conjunction of 119899 T-literalswe can split it to 119899 proof rules and one for each) For example119909 = 119910 implies that 119891(119909) = 119891(119910) is a proof rule (themonotonicity of equality)This rule holds for all T-terms 119909 119910

and all function 119891 Each proof rule can be instantiated to atheory lemma For instance if 119904 119905 are two specific T-termsand 119892 is a specific function then (119904 = 119905) rarr (119892(119904) = 119892(119905)) is atheory lemma

Definition 2 (clause item) A clause item is one of the follow-ing three forms

(i) Init 1198971

or sdot sdot sdot or 119897119899 an initial clause 119897

1or sdot sdot sdot or 119897

119899 where all

119897119894are T-literals

(ii) Res 1198621 119862

119899 a clause obtained from a linear

resolution of1198621 119862

119899 where all119862

119894are clauses items

(iii) Lemma 1198971

and sdot sdot sdot and 119897119899

rarr 1198971015840 a theory lemma where all

119897119894and 1198971015840 are T-literals The defined clause is not119897

1or sdot sdot sdot or

not119897119899

or 1198971015840

A resolution chain 1198621 1198622 119862

119899is called a linear reg-

ular resolution chain if there exists a sequence of clauses1198631 1198632 119863

119899minus1such that (1) 119863

1is the resolvent of 119862

1and

1198622 (2) for 2 le 119894 lt 119899 119863

119894is the resolvent of 119863

119894minus1and 119862

119894+1 (3)

all resolutions are applied on different T-atomsA clause item gives the syntax form for a clause For

each clause item 119862 if it is well defined the concrete formof 119862 (given as disjunction of T-literals) can be calculatedIt is called a concrete clause denoted by ⟦119862⟧ We do notdistinguish 119862 and ⟦119862⟧ when no ambiguity is caused

Definition 3 (certificate item) A certificate item is either ofthe following

(i) Define ClauseItem

(ii) Forget ClauseItem

Definition 4 (certificate) A certificate 119875 with respect to anSMT instance is a sequence of certificate items The size of 119875denoted by |119875| is the number of certificate items containedin 119875 119875[119894] is the 119894th element in 119875 for 119894 = 1 2 |119875|

Definition 5 (context) Given a certificate119875 and a number 0 le

119894 le |119875| the context with respect to 119894 denoted by lfloor119875rfloor is a setof clause items such that

lfloor119875rfloor119894

=

0 if 119894 = 0

lfloor119875rfloor119894minus1

cup 119862 if 119894 gt 0 and 119875 [119894 minus 1] is Define119862

lfloor119875rfloor119894minus1

119862 if 119894 gt 0 and 119875 [119894 minus 1] is Forget119862

(9)

Specially denote lfloor119875rfloor|119875|

by lfloor119875rfloor

Journal of Applied Mathematics 5

Table 1 A certificate example

Clause item ⟦119862⟧ Lemma1198621Define Init not119897

1or not1198972

not1198971

or not1198972

1198622Define Init 119897

11198971

1198623Define Lemma 119897

1rarr 1198972

not1198971

or 1198972

The property of ldquo=rdquo1198624Define Res 119862

1 1198623 1198622

◻ The empty clause

Example 6 Assume that 1198971

119909 = 119910 1198972

119891(119909) = 119891(119910) and1198621

not1198971

or not1198972 1198622

1198971 A certificate for the unsatisfiability of

the clause set 1198621 1198622 is presented in Table 1 In this example

1198621 1198622 is intuitively unsatisfiable Note that 119862

2entails 119897

1

which implies 1198972in TEUF while 119862

1 1198622

⊨ not1198972

Definition 7 (well-formed certificate) A certificate 119875 is wellformed if for all 1 le 119894 le |119875|

(i) if 119875[119894] is a Define item then one of the followingconditions hold(a) it is of Init type(b) it is of Res 119862

1 119862

119899 type where 119862

119896isin lfloor119875rfloor

119894minus1

for all 1 le 119896 le 119899 and there should be a linearregular resolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧

to 119875[119894](c) it is of Lemma type and the concrete clause is

valid in theory T that is ⊨T⟦119875[119894]⟧(ii) otherwise 119875[119894] is a Forget item and the referred clause

must be defined in lfloor119875rfloor119894minus1

4 Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this sec-tion We first describe the abstract rule and implementationissues then prove some important properties and finally givea couple of examples

41 The CDPLL(T) Algorithm We describe here a certifi-cate generation procedure that can be well adapted to theDPLL(T) algorithm The extension of DPLL(T) with certifi-cate generation is called CDPLL(T)

Definition 8 (certificate refinement) The certificate obtainedby appending a certificate item 119862 to the end of the certificate119875 is denoted by 119875 ⊲ 119862

We use a transition system to model the CDPLL(T)algorithm Each state is represented as a 3-tuple ldquo119872 119865 oplus 119875rdquowhere119872 is the literal stack119865 the clause set to be satisfied and119875 the current certificateThe transition rules inCDPLL(T) arefollows

(i) Decide

Precondition(1) 119897 or not119897 occurs in a clause of 119865(2) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897d

119865 oplus 119875 (10)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897 119865 119862 or 119897 oplus 119875 (11)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897(2) 119897 or not119897 occurs in 119865(3) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897 119865 oplus 119875 (12)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there exists a clause 1198621015840or 1198971015840 such that (a) 119865 119862⊨T119862

1015840or 1198971015840

and 119872 ⊨ not1198621015840 (b) 119897

1015840 is undefined in 119872 and (c) 1198971015840 or

not1198971015840 occurs in 119865 or in 119872 119897

d119873

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

1015840or 1198971015840

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840 119865 119862 oplus (119875 ⊲ Res (119877)) (13)

(v) T-Learn(I) this rule models the clause learning proce-dure

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) 119865 ⊨ 119862

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Res (119877)) (14)

(vi) T-Learn(II) this rule models the learning of theorylemma

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) ⊨T119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Lemma (119862)) (15)

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 5

Table 1 A certificate example

Clause item ⟦119862⟧ Lemma1198621Define Init not119897

1or not1198972

not1198971

or not1198972

1198622Define Init 119897

11198971

1198623Define Lemma 119897

1rarr 1198972

not1198971

or 1198972

The property of ldquo=rdquo1198624Define Res 119862

1 1198623 1198622

◻ The empty clause

Example 6 Assume that 1198971

119909 = 119910 1198972

119891(119909) = 119891(119910) and1198621

not1198971

or not1198972 1198622

1198971 A certificate for the unsatisfiability of

the clause set 1198621 1198622 is presented in Table 1 In this example

1198621 1198622 is intuitively unsatisfiable Note that 119862

2entails 119897

1

which implies 1198972in TEUF while 119862

1 1198622

⊨ not1198972

Definition 7 (well-formed certificate) A certificate 119875 is wellformed if for all 1 le 119894 le |119875|

(i) if 119875[119894] is a Define item then one of the followingconditions hold(a) it is of Init type(b) it is of Res 119862

1 119862

119899 type where 119862

119896isin lfloor119875rfloor

119894minus1

for all 1 le 119896 le 119899 and there should be a linearregular resolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧

to 119875[119894](c) it is of Lemma type and the concrete clause is

valid in theory T that is ⊨T⟦119875[119894]⟧(ii) otherwise 119875[119894] is a Forget item and the referred clause

must be defined in lfloor119875rfloor119894minus1

4 Certificate Generation with DPLL(T)

The certificate generation algorithm is described in this sec-tion We first describe the abstract rule and implementationissues then prove some important properties and finally givea couple of examples

41 The CDPLL(T) Algorithm We describe here a certifi-cate generation procedure that can be well adapted to theDPLL(T) algorithm The extension of DPLL(T) with certifi-cate generation is called CDPLL(T)

Definition 8 (certificate refinement) The certificate obtainedby appending a certificate item 119862 to the end of the certificate119875 is denoted by 119875 ⊲ 119862

We use a transition system to model the CDPLL(T)algorithm Each state is represented as a 3-tuple ldquo119872 119865 oplus 119875rdquowhere119872 is the literal stack119865 the clause set to be satisfied and119875 the current certificateThe transition rules inCDPLL(T) arefollows

(i) Decide

Precondition(1) 119897 or not119897 occurs in a clause of 119865(2) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897d

119865 oplus 119875 (10)

(ii) UnitPropagate

Precondition

(1) 119872 ⊨ not119862(2) 119897 is undefined in 119872

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897 119865 119862 or 119897 oplus 119875 (11)

(iii) TheoryPropagate

Precondition

(1) 119872⊨T119897(2) 119897 or not119897 occurs in 119865(3) 119897 is undefined in 119872

119872 119865 oplus 119875 997891997888rarr 119872 119897 119865 oplus 119875 (12)

(iv) T-Backjump

Precondition

(1) 119872 119897d119873 ⊨ not119862

(2) there exists a clause 1198621015840or 1198971015840 such that (a) 119865 119862⊨T119862

1015840or 1198971015840

and 119872 ⊨ not1198621015840 (b) 119897

1015840 is undefined in 119872 and (c) 1198971015840 or

not1198971015840 occurs in 119865 or in 119872 119897

d119873

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

1015840or 1198971015840

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840 119865 119862 oplus (119875 ⊲ Res (119877)) (13)

(v) T-Learn(I) this rule models the clause learning proce-dure

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) 119865 ⊨ 119862

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to 119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Res (119877)) (14)

(vi) T-Learn(II) this rule models the learning of theorylemma

Precondition

(1) Each atom of 119862 occurs in 119865 or in 119872

(2) ⊨T119862

119872 119865 oplus 119875 997891997888rarr 119872 119865 119862 oplus (119875 ⊲ Lemma (119862)) (15)

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

6 Journal of Applied Mathematics

(vii) T-Forget

Precondition

119865⊨T119862

119872 119865 119862 oplus 119875 997891997888rarr 119872 119865 oplus (119875 ⊲ Forget (119862)) (16)

(viii) Restart

Precondition

T

119872 119865 oplus 119875 997891997888rarr 0 119865 oplus 119875 (17)

(ix) Fail

Precondition

(1) 119872 ⊨ not119862

(2) 119872 contains no decision literals

(3) 119877 = 1198621 1198622 119862

119899 is a set of clause items such that

119862119894

isin lfloor119875rfloor for all 119894 and there exists a linear regularresolution from ⟦119862

1⟧ ⟦1198622⟧ ⟦119862

119899⟧ to ◻

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (18)

The initial state is 119872 119865 oplus 1198750where 119875

0is the certificate

that defines all initial clauses When a final state ⟨Fail⟩ oplus 119875

is reached 119875 is the obtained certificate for the unsatisfiabilityjudgment

The major differences between CDPLL(T) and DPLL(T)are as follows (1) Certificate generation is added Notice themodifications to rules T-Backjump T-Learn T-Forget andT-Fail (2) In order to generate well-formed certificates theoriginal T-Learn rule is split into 2 cases one for the learningof resolvents (corresponds to clause learning) and the otherfor the learning of theory lemmas (corresponds to deductionsinT)The first three rules are unchanged except the certificatepart

42 Implementation of CDPLL(T) In a real implementationof CDPLL(T) additional information need to be collectedto construct the certificate items In this section we discussthese details

Lemma 9 For any reachable state 119872 119865 oplus 119875 in CDPLL(T)and any clause 119862 isin 119865 there exists a clause item 119863 isin lfloor119875rfloor suchthat ⟦119863⟧ = 119862

Proof We prove that by induction Initially this propertyholds because all clauses in 119865

0are initial clauses and 119875

0

consists of all initial clause items (by definition) At each validtransition step the clause set 119865 and the certificate 119875 may bemodified However each rule ensures that whenever a clause119862 is added to 119865 there is always a clause item 119863 such that⟦119863⟧ = 119862 added to 119875 for example in the rule T-Learn(I)Furthermore whenever a clause item is removed from 119875

the corresponding clause is removed in 119865 for example in therule T-Forget Thus the property holds on all valid trace ofCDPLL(T)

421 Record Reasons For any literal 119897 in 119872 it is either addedby the Decide rule or forced by a clause or theory lemmaFor the latter case we need to record a clause item which isa deduction of the lemma and from which the literal 119897 canbe enforced We call this item the reason for literal 119897 beingin 119872

Only rules UnitPropagate TheoryPropagate and T-Backjump can enforce new literal to 119872 We discuss in thefollowing the reasons recorded by these three rules

(i) UnitPropagate the reason is a certificate item 119863 suchthat ⟦119863⟧ = 119862 or 119897 isin 119865 By Lemma 9 such item alwaysexists

(ii) TheoryPropagate assume that 119872 = 1198971

and 1198972

and sdot sdot sdot and 119897119898

and119872⊨T119897 then the reason is a certificate item119863 suchthat ⟦119863⟧ = not119897

1or not1198972

or not sdot sdot sdot or not119897119898

or 119897(iii) T-Backjump the reason for 119897

1015840 is a certificate item119863 such that ⟦119863⟧ = 119862

1015840or 1198971015840 We will show in the

following that such certificate item always exists whenT-Backjump is applicable

For convenience a subscript is used to denote the reason forexample 119897

(119863)denotes the literal 119897 asserted by the reason ⟦119863⟧

422 Learn Clauses In rules T-Backjump and Fail oncethere is a conflict a backjump clause needs be learned Fur-thermore if there are some branches that have not beenexplored T-Backjump is applied otherwise Fail is appliedEach clause learning procedure introduces some new certifi-cate items

We use implication graph to analyze the clause learningprocess Remember that in DPLL all literals are propositionalatoms the implication graph is simply a direct acyclic graph(DAG) where each edge corresponds to a deduction step InCDPLL(T) the implication graph has two kinds of deduc-tions propositional deduction (which is similar to DPLL)and theory deduction Each deduction step in the implicationgraph is labelled with a reason 119862This graph can be viewed asa generalized implication graph Without causing ambiguitywe still call it an implication graph

In the implication graph of DPLL each cut containing theconflict literals corresponds to a learned clause (by resolvingall the associated clauses of edges in this cut) [4] No matterwhich criteria (eg 1-UIP) is used the rule for clause learn-ing is applicable For CDPLL(T) when considering the gen-eralized implication graphs the clause learning rule is alsoapplicable

There are two situations where T-Backjump can be ap-plied Given a state 119872 119865 oplus 119875 the first situation is thatthere exists a clause 119862 isin 119865 which is falsified by 119872 that is⊭ 119872 119862 Then 119862 is called the conflict clause As in the rulesome backjump clause 119862

1015840or 1198971015840 should be learned by analyzing

the conflict Usually1198621015840or1198971015840 is a resolvent of a linear resolution

A certificate item with resolution type will be learned

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 7

not1198971

1198971

1198970119901

1198622

1198620

1198623

Cut1 Cut2

Figure 2 Example of a generalized implication graph

The other situation is that 119872 itself becomes T-inconsist-ent Then a subset 119897

1 1198972 119897119898

of 119872 is unsatisfiable that is⊨Tnot1198971

or not1198972

or sdot sdot sdot or not119897119898 This is also a conflict clause It may

not belong to 119865 but it can definitely be learned from 119865 alongwith some theory lemmas The fact that 119897

1 119897119898leads to

conflict is represented in the generalized implication graphFurthermore the clause 119862

1015840or 1198971015840 that is required by the rule T-

Backjump can also be learned by generalized conflict clauseanalysis

In both situations the backjump clauses are learned byclause learning The certificate for unsatisfiability is actuallya well-organized collection of these clause items If Fail isapplicable no literal in119872 is decision variable then the learntclause is the empty clause which shows 119865⊨T◻

Example 10 A generalized implication graph is shown inFigure 2 where the literals are as follows 119897

0 119909 = 119910 119897

1

119891(119909) = 119891(119910) and 119901 is a propositional literal Clauses are asfollows

1198620

not1198970

or not1198971

= not (119909 = 119910) or (119891 (119909) = 119891 (119910))

1198621

1198970

or 119901 = 119909 = 119910 or 119901

1198622

1198970

or not119901 = 119909 = 119910 or not119901

(19)

Assume the current assignment 119872 = 119901 1198970 not1198971 Then

there is a conflict between not1198971and 1198971 In all there are 3

deductions in the implication graph 119901 rarr 1198970forced by 119862

2

1198970

rarr not1198971by 1198620and 1198970

rarr 1198971by the theory lemma 119862

3

Among those reasons solid lines correspond to existingclauses in119865 while dashed lines correspond to theory lemmas1198623is actually a theory lemma 119897

0rarr 1198971(119909 = 119910 rarr 119891(119909) =

119891(119910))In the clause learning procedure we start from the

conflict (not(1198971

and not1198971) = not119897

1or 1198971) and trace back (apply a

sequence of resolutions) If the 1st UIP schema [4] is usedit is possible to learn not119901 or not119897

0(corresponds to cut

1and

cut2 resp) The resolution steps for learning not119901 are shown

in Figure 3 Actually not119901 is the resolvent of 1198623 1198620 1198622 which

are exactly the reasons labeled on the edges from the conflictto the cut Among those 119862

0 1198622are normal clauses and 119862

3is

a theory lemmaBased on the above discussions the related rules can be

interpreted more precisely as follows

1198623 not1198970 or 1198971not1198970

1198971

1198970not119901

1198622 not119901 or 11989701198620 not1198970 or not1198971

Figure 3 Resolution steps for learning not119901

(i) TheoryPropagate

119872 119865 oplus 119875 997891997888rarr 119872 119897(119862)

119865 oplus 119875 (20)

The precondition requires that 119872⊨T119897 Let 1198721015840 be a subset of

119872 which entails 119897 (possibly 1198721015840

= 119872) Let 119862 = not1198721015840

or 119897 then⊨T119862 Furthermore 119872 119862⊨T119897 which means 119862 is a unit clauseunder the assignment 119872 Thus 119862 is the reason of 119897

(ii) UnitPropagate

119872 119865 119862 or 119897 oplus 119875 997891997888rarr 119872 119897(119862or119897)

119865 119862 or 119897 oplus 119875 (21)

The precondition requires that 119872 ⊨ not119862 thus 119862 or 119897 is a unitclause under 119872 So the reason of 119897 is 119862 or 119897

(iii) T-Backjump

119872 119897d119873 119865 119862 oplus 119875 997891997888rarr 119872 119897

1015840

(1198621015840or1198971015840)

119865 119862 oplus (119875 ⊲ Res (119877))

(22)

The precondition requires that there exits some clause 1198621015840

or 1198971015840

such that 119865 119862⊨T1198621015840

or 1198971015840 and 119872 ⊨ not119862

1015840 The clause 1198621015840

or 1198971015840 is

then used as the reason for 1198971015840 As the clause learning is always

applicable [9] this clause always exists If the assignment119872 119897

d119873 is T-consistent then there must be some conflict

clause 119862 isin 119865 As explained above the clause learning willthen be applied and the learned clause will be 119862

1015840or 1198971015840 On the

other hand when 119872 119897d119873 is T-inconsistent clause learning

is also applicable in the generalized implication graph andgenerates a candidate 119862

1015840or 1198971015840 In both cases 119862

1015840or 1198971015840 can be

learned by applying linear regular resolutions on a clauseset 119877 [4] Moreover such 119877 is the union of a subset of reasonsin 119872 119897

d119873 and a group of theory lemmas

(iv) Fail

119872 119865 119862 oplus 119875 997891997888rarr ⟨Fail⟩ oplus (119875 ⊲ Res (119877)) (23)

The Fail rule is similar to T-Backjump except that the learnedclause is the empty clause

43 Properties of CDPLL(T) The soundness and complete-ness of CDPLL(T) are proven in this subsection

Lemma11 Awell-formed certificatemodified byT-BackjumpT-Learn(I) T-Learn(II) Fail or T-Forget is still wellformed

Proof Each of the 4 rules appends a new item to the cer-tificate Assume that the certificate 119875 is modified to 119875

1015840=

119875 ⊲ 119868 where 119868 is the appended certificate item The well-formed property of 119875

1015840 is checked by case studying the rulesapplied

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

8 Journal of Applied Mathematics

(i) For T-Backjump a certificate item of resolution typeis appended According to the precondition thecertificate items referred by 119868 are already defined inlfloor119875rfloor So 119875

1015840 is still well formed

(ii) For T-Learn(I) similar to the previous case The de-pendency of appended certificate item is satisfied

(iii) For T-Learn(II) a theory lemma item 119862 is appendedIt is required that ⊨T119862 Thus 119875

1015840 is still well formed

(iv) For Fail similar to the T-Backjump case

(v) For T-Forget the rule requires that the forgottenclause is in the clause set By Lemma 9 we know thatthere is a certificate item in 119875 which defines 119862 Thusthe well formedness is ensured

With the help of this lemma it can be proven thatCDPLL(T) procedure always generates well-formed certifi-cates

Lemma 12 Given any reachable state 119872 119865 oplus 119875 in anyCDPLL(T) procedure 119875 is a well-formed certificate

Proof First of all the initial certificate 1198750is well formed

because it only contains initial items Secondly the certificateis only modified in T-Backjump T-Learn(I) T-Learn(II)Fail and T-Forget By Lemma 11 these rules will preservewell formedness So following any path of CDPLL(T) willalways generate a well-formed certificate That completes theproof

Theorem 13 (soundness) Given a clause set 1198650 for any

certificate 119875 if 119872119909

119865119909

oplus 119875119909is reachable in a CDPLL(T)

procedure then 119872119909

119865119909is also reachable in a DPLL(T)

procedure Specially if ⟨Fail⟩ oplus 119875 is reachable in CDPLL(T)then ⟨Fail⟩ is also reachable in DPLL(T)

Proof For each transition step in CDPLL(T)

119872 119865 oplus 119875997888rarrCDPLL(T)1198721015840

1198651015840

oplus 1198751015840 (24)

it holds that

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (25)

So given a CDPLL(T) trace a trace in DPLL(T) can beobtained by removing the certificate part If 119872

119909 119865119909

oplus 119875119909

is reachable in CDPLL(T) so is 119872119909

119865119909in DPLL(T)

Theorem 14 (completeness) Given a clause set 1198650 if the state

119872119909

119865119909is reachable in a DPLL(T) procedure then there

must be a certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable

in a DPLL(T) procedure Furthermore if ⟨Fail⟩ is reachablein a DPLL(T) procedure then ⟨Fail⟩ oplus 119875

119909is reachable in a

CDPLL(T) and ◻ isin lfloor119875119909rfloor

Proof For rules Decide TheoryPropagate UnitPropagate T-Learn(II) T-Forget and Restart the preconditions are equalto those in CDPLL(T) So if there is a transition in DPLL(T)

labelled with one of these rules there could also be the sametransition in CDPLL(T)

For other rules T-Backjump T-Learn(I) and Fail weneed to prove that if 119872

119909 119865119909is reachable then there is some

certificate 119875119909such that 119872

119909 119865119909

oplus 119875119909is reachable We prove

that by induction Initially this condition holds because theinitial state 119872

0 1198650corresponds to 119872

0 1198650

oplus 1198750 Inductively

if

119872 119865997888rarrDPLL(T)1198721015840

1198651015840 (26)

is a possible transition in DPLL(T) and there is some 119875 suchthat 119872 119865 oplus 119875 is reachable in CDPLL(T)

(i) If it is labelled with T-Learn(I) because the learnedclause is from clause learning and clause learningcorresponds to linear regular resolution It is alwayspossible that necessary theory lemmas in the impli-cation graph are learned at first by Learn(II) and getto a state 119872 119865 oplus 119875

10158401015840 on which the preconditions ofLearn(I) are satisfied Then 119872

1015840 1198651015840

oplus 1198751015840 is reached

(ii) If it is labelled with T-Backjump or Fail the situationis similar

Our approach can be well adapted to any DPLL(T)-basedSMT solver It is sound and complete regardless of the theorylearning scheme used or the order of the decision procedureIt works well as long as clause learning is based on thegeneralized implication graph

44 Examples A couple of examples are discussed in thissubsection The first example is given as a CNF formulaon propositional variables In this case the SMT problemreduces to an SAT problem No theory deduction is involvedin this example so we can get a general idea of the structureof certificates

Example 1 Consider the following clause set

1198621

= not1198971

or not1198972

or not1198973

1198622

= not1198971

or not1198972

or 1198973

1198623

= not1198971

or 1198972

1198624

= 1198971

or not1198972

1198625

= 1198971

or 1198972

(27)

Let 1198650

= 1198621 1198622 1198623 1198624 1198625 assume that 119875

0defines

everything in 1198650as initial clause then a possible CDPLL(T)

procedure is shown in Table 2 where ldquosimrdquo means this field isthe same as that in the previous row

In step 4 1198622

= not1198971

or not1198972

or 1198973is falsified By analysing

the implication graph a clause is learned from 1198621 1198622 1198623

Amonglsquothe referred clauses1198622is a falsified clause and119862

1 1198623

are in the reasons In step 7 1198624is falsified but there is no

decision variable By analysing the implication graph we canfind a resolution from 119862

4 1198625 1198626to the empty clause Among

the referred clauses 1198624is a falsified clause and 119862

5 1198626are in

the reasons

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 9

Table 2 The CDPLL(T) procedure for Example 1

No 119872 119865 oplusP Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 119897

d1

10038171003817100381710038171003817sim oplus sim 0 Decide

3 119897d1 1198972

10038171003817100381710038171003817sim oplus sim 119897

2997891rarr 1198623 UnitPropagate

4 119897d1 1198972 not1198973

10038171003817100381710038171003817sim oplus sim

1198972

997891rarr 1198623

not1198973

997891rarr 1198621

UnitPropagate

now 1198622is falsified

5 not1198971

1003817100381710038171003817

1198650

1198626

= not1198971

oplus [1198750

Res1198621 1198622 1198623 = 119862

6

] not1198971

997891rarr 1198626

1198626

= not1198971is learned

T-Backjump

6 not1198971 not1198972

1003817100381710038171003817 sim oplus simnot1198971

997891rarr 1198626

not1198972

997891rarr 1198624

UnitPropagate

7 ⟨Fail⟩ sim oplus

[[[

[

1198750

Res1198621 1198622 1198623 = 119862

6

Res1198624 1198625 1198626 = ◻

]]]

]

0 Fail

Example 2 Consider the background theory TEUF and aclause set 119865 containing the following

(1198621) 119886 = 119887 or 119892 (119886) = 119892 (119887)

(1198622) ℎ (119886) = ℎ (119888) or 119901

(1198623) 119892 (119886) = 119892 (119887) or not119901

(1198624) ℎ (119886) = ℎ (119888) or 119886 = 119888

(1198625) 119891 (119886) = 119891 (119887)

(1198626) 119887 = 119888

(28)

Its boolean structure is

1198621

= 1198971

or not1198975

1198622

= not1198976

or 119901

1198623

= 1198975

or not119901

1198624

= 1198976

or 1198972

1198625

= not1198974

1198626

= 1198973

(29)

where

1198971 119886 = 119887

1198972 119886 = 119888

1198973 119887 = 119888

1198974 119891 (119886) = 119891 (119887)

1198975 119892 (119886) = 119892 (119887)

1198976 ℎ (119886) = ℎ (119888)

(30)

A possible CDPLL(T) procedure is shown in Table 3where the referred certificates are as follows

1198751

= [1198750

Lemma (not1198971

or 1198974) = 1198627

]

1198752

= [

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

]

]

1198753

=

[[[

[

1198750

Lemma (not1198971

or 1198974) = 1198627

Lemma (1198972

and 1198973

997888rarr 1198971) = 1198628

Res 1198628 1198627 1198624 1198626 1198622 1198623 1198621 = ◻

]]]

]

(31)

In step 4 119872 becomes T-inconsistent The generalizedimplication graph is shown in Figure 4 A theory lemma119862

7=

not1198971

or 1198974is learned (instantiated the monotonicity property of

the equality) firstly then the backjump rule is applicable Insteps 11 and 12 119872 becomes T-inconsistent again A theorylemma 119862

8= 1198972

and 1198973

rarr 1198971is then learned (transitivity

of equality) and then clause learning is performed whichlearns the empty clause The implication graph is shown inFigure 5

5 The Certificate Checking

Lemma 15 For any initial clause set 1198650 given a well-formed

certificate 119875 and 0 le 119894 le |119875| if 119875[119894] is a certificate item thatdefines a clause then 119865

0⊨T⟦119875[119894]⟧

Proof We prove that by induction The base case is easy toprove because for any initial item 119875[119894] that defines a clause119875[119894] isin 119865

0 Therefore it is trivial that 119865

0⊨ ⟦119875[119894]⟧ Thus

1198650⊨T⟦119875[119894]⟧

(i) For resolution certificate items assume that 119875[119894] =

Res1198621 1198622 119862

119899 By definition all 119862

119894are all

defined in lfloor119875rfloor119894minus1

Then by the inductive hypothesis

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

10 Journal of Applied Mathematics

Table 3 A possible CDPLL(T) procedure

No 119872 119865 oplus119875 Reason Explanation1 0 119865

0oplus1198750

0 Initial state2 not119897

4

1003817100381710038171003817 sim oplus sim not1198974

997891rarr 1198625 UnitPropagation

3 not1198974 1198973

1003817100381710038171003817 sim oplus simnot1198974

997891rarr 1198625

1198973

997891rarr 1198626

UnitPropagation

4 not1198974 1198973 119897d1

10038171003817100381710038171003817sim oplus sim sim

Decide

119872 becomes T-inconsistent5 sim 119865

0cup 1198627 oplus119875

1sim T-Learn(II) 119862

7is learned

6 not1198974 1198973 not1198971

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

T-Backjump

7 not1198974 1198973 not1198971 not1198975

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

UnitPropagation

8 not1198974 1198973 not1198971 not1198975 not119901

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

UnitPropagation

9 not1198974 1198973 not1198971 not1198975 not119901 not119897

6

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

UnitPropagation

10 not1198974 1198973 not1198971 not1198975 not119901 not119897

6 1198972

1003817100381710038171003817 sim oplus sim

not1198974

997891rarr 1198625

1198973

997891rarr 1198626

not1198971

997891rarr 1198627

not1198975

997891rarr 1198621

not119901 997891rarr 1198623

not1198976

997891rarr 1198622

1198972

997891rarr 1198624

UnitPropagation

119872 becomes T-inconsistent

11 sim 1198650

cup 1198627 1198628 oplus119875

2sim T-Learn(II) 119862

8is learned

12 sim 1198650

cup 1198627 1198628 oplus119875

3sim Fail ◻ derived

we know 1198650

⊨T⟦119862119894⟧ By the property of resolution it

is true that 1198650

⊨T⟦119875[119894]⟧(ii) For theory lemma it is true that ⊨T⟦119875[119894]⟧ So

1198650⊨T⟦119875[119894]⟧

Theorem 16 Given a clause set 1198650 if a certificate 119875 for 119865

0is

well formed and ◻ isin lfloor119875rfloor then 1198650is unsatisfiable

Proof By Lemma 15 we know that 1198650⊨T◻ in this case

By Theorem 16 certificate checking is actually checkingwhether the certificate is well formed and contains theempty clause Checking the well formedness of a certifi-cate can directly follow Definition 7 Only one traverse

from the first item to the last one is needed Moreoverthe following properties make the checking process eveneasier

(i) Only checking for theory lemma is performed intheory T Once a theory lemma is proved valid it canbe treated as a propositional clause

(ii) For each individual proof rule we have a dedicatedchecker to check if the theory lemma is an instanceof the rule In this way the certificate checker isextensible Given a new proof rule we need only toadd the corresponding lemma checker Notice thateach proof rule defines an atomic deduction step thechecking effort is much less than solving a constraint

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 11

Table 4 Experiment results

Case Input CDPLL (T) Certificate CheckingLit Clause Learnt Time Lemmas Res Forget Mem Time

1 2 3 0 001 s 2 2 0 4 001 s2 10 15 1 001 s 4 8 3 10 000 s3 17 25 3 001 s 8 14 4 17 000 s4 66 95 1069 008 s 1024 550 490 66 005 s5 87 125 9537 071 s 8192 4146 4043 87 018 s6 1157 2611 5197 139 s 22383 3711 2635 1051 023 s7 1582 2173 16735 2130 s 859886 8529 7438 1148 116 s8 811 1037 5164 580 s 248159 4286 3580 555 041 s9 4687 9433 759 100 s 1021 751 430 617 003 s10 167 308 60 004 s 934 77 48 79 002 s11 447 883 1464 053 s 15894 965 995 399 009 s12 3930 6584 3046 222 s 8555 2468 1617 1639 014 s13 3362 5154 2310 168 s 6204 1916 1251 1467 013 s14 4615 7036 3244 371 s 9521 2308 1633 1776 013 s15 2422 3726 828 043 s 587 616 509 428 003 s16 5226 8614 9781 785 s 6305 3457 1723 2199 012 s17 3073 4821 128729 8765 s 74289 85711 84269 5899 516 s18 2415 3710 2511 129 s 2165 1781 1377 1367 027 s19 4151 6492 32991 2126 s 26011 21015 20792 4497 124 s

1198626

1198625

1198971

1198974

not1198974

119886 = 119887 rarr 119891(119886) = 119891(119887)

1198973

Figure 4 The implication graph in step 4

in the theory T For instance only pattern matchingis needed to check if a theory lemma is an instance ofthe monotonicity property of equality

6 Experiments

Two criteria are adopted to assess our certificate approachthe overhead for generating certificates and the cost forcertificate checking We implemented an SMT solver aCiNObased on the CDPLL(T) algorithm It uses the Nelson-Oppenframework to solve combined theory Currently TEUF and

1198626

1198625

1198627

not1198974

1198973

not1198971

1198621

not1198975 not119897611986231198622

1198624

not119875 1198972

1198971

Figure 5 The implication graph in steps 11 and 12

TLRA are considered The experiments are carried out on amachinewith a E7200 dual core CPU (253GHz per core) and20GB RAM

All test cases are taken from SMT-LIB [22] Our approachis tested on 19 unsatisfiable instances from different folders(in order to test it on different kinds of problem instances)Experiment results are shown in Table 4

In Table 4 the first column-group describes the scale ofthe input problem The input is transformed to its equivalentconjunctive normal form and the number of literals andclauses are counted In the CDPLL(T) procedure once aclosed branch is encountered a new clause will be learnedThe number of learned clauses is listed in the second column-group This number is equal to that in DPLL(T) sincecertificate generation will not affect the search procedure ofthe SMT problem As in the DPLL(T) algorithm the number

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

12 Journal of Applied Mathematics

Table 5 Overhead of our approach

Test Case With cert (sec) Without cert (sec) Overhead ()NEQNEQ004 size4smt2 063 060 462NEQNEQ004 size5smt2 1987 1827 874PEQPEQ002 size5smt2 529 485 907PEQPEQ020 size5smt2 129 120 747QG-classificationloops6dead dnd001smt2 134 120 1194QG-classificationloops6gensys brn004smt2 106 095 1181QG-classificationloops6gensys icl001smt2 259 222 1662QG-classificationloops6iso brn005smt2 018 014 2733QG-classificationloops6iso icl030smt2 332 305 868QG-classificationqg5dead dnd007smt2 3562 3455 309QG-classificationqg5gensys brn007smt2 074 071 452QG-classificationqg5gensys icl100smt2 1429 1360 507SEQSEQ004 size5smt2 068 054 2693SEQSEQ050 size3smt2 019 018 703

of clause learning can be quite large (eg the 17th case) Thetime used in CDPLL(T) is also presented The third column-group summarises the generated certificates It is obvious thatthe number of theory lemma items is usually very large Thecolumn of ldquoForgetrdquo is the number of forget items that forgetsan initial or resolution certificate item All theory lemmas areeventually forgotten these items are not counted here ForSMT problem this is not surprising since the major work ofSMT solvers is in reasoning in the background theory Theresources and time required to check the certificates are listedin the Checking column-group All certificates are checked tobe well formedThe ldquoMemrdquo column is not the size of memoryconsumed but the maximal number of clause items that arestored in the memory during checking Also the time forcertificate checking is listed besides

Regarding the certificate checking the number of initialclauses andmemory consumption is compared in Figure 6 Itis rather interesting that although the certificate itself couldbe exponentially large with respect to the input problemthe memory consumption will not grow in the same wayThe reason is that we have explicitly forgotten many itemsIn particular many theory lemma are referred locally Theyare only available in a short period of time after whichthey are forgotten With the technique of forgetting itemsthe certificate checking becomes more efficient This is alsosupported by data from the ldquoTimerdquo column in Table 4

Towards the overhead of our approach the experimentis shown in Table 5 Tested cases are also from SMT-LIB Inorder to suppress the inaccurate measurement of time weintentionally selected time consuming cases (eg longer than001 sec) The maximal overhead is about 2733 and theaverage overhead is about 105 It ismuch smaller comparedto other approaches [17 21]

7 Conclusion

In this paper a unified certificate framework for quantifier-free SMT instances was presented The certificate format issimple clean and extensible to other background theories

10000

1000

100

10

1

Mem

Num

ber o

f ini

tial

Number of initialMem

100001000100101

Number of initialMem

Figure 6 Comparison between the number of initial clauses andmemory usage

The certificate generation procedure can be easily integratedto any DPLL(T)-based SMT solver Soundness and com-pleteness of the extension of DPLL(T) with the certificategeneration procedure were established Experimental resultsshow that our certificate framework outperforms others inboth certificate generation and certificate checking

Acknowledgments

This work was supported by the Chinese National 973 Planunder Grant no 2010CB328003 the NSF of China underGrants nos 61272001 60903030 and 91218302 the Chinese

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Journal of Applied Mathematics 13

National Key Technology RampD Program under Grant noSQ2012BAJY4052 the Tsinghua University Initiative Scien-tific Research Program

References

[1] C Barrett M Deters L de Moura A Oliveras and A Stumpldquo6 Years of SMT-COMPrdquo Journal of Automated Reasoning vol50 no 3 pp 243ndash277 2013

[2] C Barrett and C Tinelli ldquoCVC3rdquo in Proceedings of the 19thInternational Conference on Computer Aided Verification pp298ndash302 Springer July 2007

[3] L Moura and N Bjrner ldquoZ3 an efficient SMT solverrdquo in Toolsand Algorithms for the Construction and Analysis of Systems CRamakrishnan and J Rehof Eds vol 4963 of Lecture Notesin Computer Science pp 337ndash340 Springer Berlin Germany2008

[4] P Beame H Kautz and A Sabharwal ldquoUnderstanding thepower of clause learningrdquo in Proceedings of the InternationalJoint Conference onArtificial Intelligence pp 1194ndash1201 CiteseerAcapulco Mexico August 2003

[5] J Silva ldquoAn overview of backtrack search satisfiability algo-rithmsrdquo in Proceedings of the 5th International Symposium onArtificial Intelligence and Mathematics Citeseer January 1998

[6] P Beame and T Pitassi ldquoPropositional proof complexity pastpresent and futurerdquo in Bulletin of the European Association forTheoretical Computer Science The Computational ComplexityColumn pp 66ndash89 1998

[7] S Cook ldquoThe complexity of theorem-proving proceduresrdquo inProceedings of the 3rd Annual ACM Symposium on Theory ofComputing pp 151ndash158 ACM ShakerHeights Ohio USA 1971

[8] M Davis G Logemann and D Loveland ldquoAmachine programfor theorem-provingrdquo Communications of the ACM vol 5 pp394ndash397 1962

[9] R Nieuwenhuis A Oliveras and C Tinelli ldquoSolving SAT andSAT modulo theories from an abstract davismdashputnammdashloge-mannmdashloveland procedure to DPLL(T)rdquo Journal of the ACMvol 53 no 6 Article ID 1217859 pp 937ndash977 2006

[10] M W Moskewicz C F Madigan Y Zhao L Zhang and SMalik ldquoChaff engineering an efficient SAT solverrdquo in Proceed-ings of the 38th Design Automation Conference pp 530ndash535June 2001

[11] N Een and N Sorensson ldquoAn extensible SAT-solverrdquo inTheoryand Applications of Satisfiability Testing pp 333ndash336 SpringerBerlin Germany 2004

[12] A Biere ldquoPicoSAT essentialsrdquo Journal on Satisfiability BooleanModeling and Computation vol 4 article 45 2008

[13] M Boespug Q Carbonneaux and O Hermant ldquoThe 120582120587-calculusmodulo as a universal proof languagerdquo inProceedings ofthe 2nd International Workshop on Proof Exchange for TheoremProving (PxTP rsquo12) June 2012

[14] A Stump D Oe A Reynolds L Hadarean and C TinellildquoSMT proof checking using a logical frameworkrdquo Formal Meth-ods in System Design vol 42 no 1 pp 91ndash118

[15] D Deharbe P Fontaine B Paleo et al ldquoQuantifier inferencerules for SMT proofsrdquo 1st International Workshop on ProofeXchange for Theorem Proving (PxTP rsquo11) 2011

[16] P Fontaine J YMarion SMerz L Nieto andA Tiu ldquoExpress-iveness + automation + soundness towards combining SMTsolvers and interactive proof assistantsrdquo in Tools and Algorithms

for the Construction and Analysis of Systems H Hermanns andJ Palsberg Eds vol 3920 of Lecture Notes in Computer Sciencepp 167ndash181 Springer Berlin Germany 2006

[17] D Oe A Reynolds and A Stump ldquoFast and flexible proofchecking for SMTrdquo in Proceedings of the 7th InternationalWork-shop on Satifiability ModuloTheories (SMT rsquo09) pp 6ndash13 ACMAugust 2009

[18] A Cimatti A Griggio and R Sebastiani ldquoA simple and exibleway of computing small unsatisfiable cores in SAT modulotheoriesrdquo inTheory andApplications of Satisfiability Testing SATJ Marques-Silva and K Sakallah Eds vol 4501 of Lecture Notesin Computer Science pp 334ndash339 Springer Berlin Germany2007

[19] Y Ge and C Barrett ldquoProof translation and SMT-LIB bench-mark certification a preliminary reportrdquo in Proceedings ofInternational Workshop on Satisfiability Modulo Theories (SMTrsquo08) August 2008

[20] S Bohme ldquoProof reconstruction for Z3 in IsabelleHOLrdquo inProceedings of the 7th International Workshop on SatisfiabilityModulo Theories (SMT rsquo9) August 2009

[21] M Moskal ldquoRocket-fast proof checking for SMT solversrdquo inTools and Algorithms For the Construction and Analysis of Sys-tems C Ramakrishnan and J Rehof Eds vol 4963 of LectureNotes in Computer Science pp 486ndash500 Springer BerlinGermany 2008

[22] C Barrett A Stump and C Tinelli ldquoThe SMT-LIB standardversion 20rdquo in Proceedings of the 8th InternationalWorkshop onSatisfiability Modulo Theories Edinburgh UK 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of