ANL-90/9 ANL--90/9 - UNT Digital Library/67531/metadc283004/m2/1/high_re… · ANL-90/9 ANL--90/9...

34
Distribution Category: Mathematics and Computer Science (UC-405) ANL-90/9 ANL--90/9 DE9O 007978 ARGONNE NATIONAL LABORATORY 9700 South Cass Avenue Argonne, Illinois 60439-4801 OTTER 2.0 Users Guide by William W. McCune I Mathematics and Computer Science Division March 1990 0~ b. $ 0.0 ~V w~ 0 G 1 a ' , V g : E 9 'U ace w o y o u O" o ' . p U~e This work was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract W-31-109-Eng-38. MASTER ?

Transcript of ANL-90/9 ANL--90/9 - UNT Digital Library/67531/metadc283004/m2/1/high_re… · ANL-90/9 ANL--90/9...

Distribution Category:

Mathematics and Computer

Science (UC-405)

ANL-90/9 ANL--90/9

DE9O 007978

ARGONNE NATIONAL LABORATORY9700 South Cass Avenue

Argonne, Illinois 60439-4801

OTTER 2.0 Users Guide

by

William W. McCune IMathematics and Computer Science Division

March 1990

0~

b. $ 0.0 ~V w~

0 G

1 a ' , V g

: E 9 'Uace w o y o

u O" o ' .p U~e

This work was supported by the Applied Mathematical Sciences subprogram of the Officeof Energy Research, U.S. Department of Energy, under Contract W-31-109-Eng-38.

MASTER ?

Contents

1 Introduction

1.1 Major Changes Since Version 1.0.. ...................

2 Outline of OTTER'S Inference Process

3 Using OTTER

3.1 Syntax..............................

3.1.1 Names . . . . . . . . . . . . . . . . . . . . . . . . .

3.1.2 Terms and Atoms . . . . . . . . . . . . . . . . . .

3.1.3 Literals and Clauses . . . . . . . . . . . . . . . . .

3.1.4 Formulas . . . . . . . . . . . . . . . . . . . . . . .

3.2 Commands and the Input File . . . . . . . . . . . . . . . .

3.2.1 Input of Options . . . . . . . . . . . . . . . . . . .

3.2.2 Input of Lists of Clauses . . . . . . . . . . . . . . .

3.2.3 Input of Lists of Formulas . . . . . . . . . . . . . .

3.2.4 Input of Lists of Weight Templates . . . . . . . . .

3.2.5 The Commands skolem, lex, lrpo.lr-status, and

4 Options

4.1 Flags . . . . . . . ......-. .. . .............

4.1.1 Main Loop Flags . . . . . . . . . . . . . . . . . . .

4.1.2 Inference Rules . . . . . . . . . . . . . . . . . . . .

4.1.3 Paramodulation Flags . . . . . . . . . . . . . . . .

4.1.4 Flags for Handling Generated Clauses . . . . . . .

4.1.5 Demodulation and Equality Flags . . . . . . . . .

4.1.6 Indexing Flags . . . ..-.... . . . . . . . . . . ..

4.1.7 Input and Output Flags . . . . . . . . . . . . . . .

4.1.8 Miscellaneous Flags . . . . ... . . . . . .. . . .

4.2 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . .

4.2.1 Monitoring Progress . . . . . . . . . . . . . . . . .

4.2.2 Placing Limits on the Search . . . . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

lrpo..l .. st atus

iii

1

2

3

4

4

4

4

5

5

5

6

6

7

7

8

8

. 8

. 8

. 9

9

10

. 11

12

13

13

14

14

14

4.2.3 Limits on the Size of Generated Clauses . . . . . . . . . . . . . . . . 14

4.2.4 Indexing Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4.2.5 Miscellaneous Parameters . . . . . . . . . . . . . . . . . . . . . . . . 15

5 Ordering and Dynamic Demodulation 15

5.1 lexsrpo Is Clear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

5.1.1 Lexical Order (lexspo is clear) . . . . . . . . . . . . . . . . . . . . 16

5.1.2 Lex-dependent Demodulation (lexsrpo is clear). . . . . . . . . . . . 16

5.1.3 Orienting Equalities (lex..rpo is clear) . . . . . . . . . . . . . . . . . 17

5.1.4 Determining Dynamic Demodulators (lexsrpo is clear) . . . . . . . 17

5.2 lex.rpo Is Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

5.2.1 Lexical Order (lexspo is set) . . . . . . . . . . . . . . . . . . . . . . 18

5.2.2 Lex-dependent Demodulation (lex.rpo is set) .. . . . . . . . . . . . 18

5.2.3 Orienting Equalities (lexspo is set) . . . . . . . . . . . . . . . . . . 18

5.2.4 Determining Dynamic Demodulators (lexsrpo is set) . . . . . . . . 19

5.3 Completion and Termination . . . . . . . . .. . . . . . . . . . . . . . . . . 19

6 Evaluable Functions and Predicates ($SUM, $LT, ... ) 19

7 Weighting 22

7.1 Weighing Clauses and Literals . . . . . . . . . . . . . . . . . . . . . . . . . 23

7.2 Weighing Atoms and Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

8 Answer Literals 24

9 Meta-experimental Features 24

9.1 Linked UR-Resolution .. . . . . . . .. . .. . . . . . . . . . . . . . . . . . 24

9.2 Atom (Literal) Demodulation . . . . . . . . . . . . . . . . . . . . . . . . . . 24

9.3 Conditional Demodulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

9.4 Another Demodulation Trick . . . . . . . . . . . . . . . . . . . . . . . . . . 25

9.5 Introducing New Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

9.6 Ancestor Subsumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

9.7 Reducing max-veight on the Fly . . . . . . . . . . . . . . . . . . . . . . . . 26

iv

10 Limits, Abnormal Ends (AHENDs), and Fixes 27

11 Summary of the Options and Their Defaults 27

References 29

OTTER 2.0 Users Guide

by

William W. McCune

Abstract

OTTER (Organized Techniques for Theorem-proving and Effective Research)is a resolution-style theorem-proving program for first-order logic with equal-

ity. OTTER includes the inference rules binary resolution, hyperresolution, UR-

resolution, and binary paramodulation. Some of its other abilities are conversionfrom first-order formulas to clauses, forward and back subsumption, factoring,weighting, answer literals, term ordering, forward and back demodulation, evalu-

able functions and predicates, and Knuth-Bendix completion. OTTER is coded

in C, it is free, and it is portable to many different kinds of computer.

1 Introduction

OTTER (Organized Techniques for Theorem-proving and Effective Research) is a resolution-style theorem prover, similar in scope and purpose to the AURA [1.2] and LMA/ITP [9] theoremprovers, which are also associated with Argonne. The primary design considerations have

been performance, portability, and compactness and simplicity of the code. The program-

ming language C is used.

OTTER features the inference rules binary resolution, hyperresolution, UR-resolution,and binary paramodulation. These inference rules take a small set of clauses and infer a

clause; if the inferred clause is new, interesting, and useful, it is stored and may becomeavailable for subsequent inferences.

Other features of OTTER are the following:

" Statements of the problem may be input either with first-order formulas or with

clauses (a clause is a disjunction with implicit universal quantifiers and no existential

quantifiers). If first-order formulas are input, OTTER translates them to clauses.

1

" Forward demodulation rewrites and simplifies newly inferred clauses with a set of

equalities, and back demodulation uses a newly inferred equality (which has been

added to the set of demodulators) to rewrite all existing clauses.

" Forward subsumption deletes an inferred clause if it is subsumed by any existingclause, and back subsumption deletes all clauses that are subsumed by an inferred

clause.

" A variant of the Knuth-Bendix method can search for a complete set of reductions.

* Weight functions and lexical ordering decide the "goodness" of clauses and terms.

* Answer literals give information about the proofs that are found.

* Evaluable functions and predicates build in integer arithmetic, Boolean operations,and lexical comparisons, and enable users to "program" aspects of deduction processes.

OTTER is not automatic. Even after the user has encoded a problem into first-order logicor into clauses, the user must choose inference rules, set options to control the processingof inferred clauses, and decide which input formulas or clauses are to be in the initial setof support and which (if any) equalities are to be demodulators. If OTTER fails to find aproof, the user may wish to try again with different initial conditions.

There have been two previous releases of OTTER-version 0.9 was distributed at CADE-9in May 1988, and version 1.0 was released in January 1989. Summaries of other theorem.proving systems can be found in the proceedings of recent CADE meetings [11, 8].

It is assumed that the reader knows the terminology of first-order logic and automatedtheorem proving, including term (variable, constant, complex term), atom, literal, clause,propositional variable, function symbol, predicate symbol, Skolem constant, Skolem function,formula, and conjunctive normal form (CNF). See [15], [1], [7], or [14] for an introductionto automated theorem proving, and see [13] for an overview of the field.

1.1 Major Changes Since Version 1.0

1. OTTER can now use a termination ordering (recursive path ordering with status) toensure that demodulation terminates. See Section 5.2.

2. Users now have the option of using Prolog-style (upper-case) variables. See Section4.1.7.

3. New options can be used to restrict paramodulation from and/or into unit clauses andto prevent paramodulation into subterms of Skolem expressions. See Section 4.1.3.

4. Other paramodulation options and defaults have changed. See Section 4.1.3.

5. New options control selection of the given clause. See Section 4.1.1.

6. Atoms as well as terms can be rewritten by demodulation. See Section 9.2.

2

2 Outline of OTTER's Inference Process

Like AURA and LMA/ITP, OTTER uses the given-clause algorithm, which can be viewed asa simple implementation of the set of support strategy. OTTER maintains three lists of

clauses: axioms, sos (set of support), and demodulators. (AURA and LMA/ITP have a list

called have-been-given; OTTER appends clauses that have been given to axioms rather thankeeping them in a separate list. The name axioms is a bit misleading, because inferred

clauses become members of axioms-the name has been retained by evolution.)

The main loop for inferring and processing clauses and searching for a refutation is

While (sos is not empty and no refutation has been found)1. Let given-clause be the 'lightest' clause in sos;2. Move given.clause from sos to axioms;

3. Infer and process new clauses using the inference rules ineffect; each new clause must have the given.clause asone of its parents and members of axioms as its otherparents; new clauses that pass the retention testsare appended to sos;

End of while loop.

The procedure for processing a newly inferred clause newvcl is

1. (optional) Output newcl.2. Demodulate newcl (including $ evaluation).3. (optional) Orient equalities.

4. Merge identical literals (leftmost copy is kept).5. (optional) Sort literals.6. (optional) Discard nevcl and exit if new.cl has too many literals.7. Discard new.cl and exit if newcl is a tautology.8. (optional) Discard new.cl and exit if newcl is too 'heavy'.9. (optional) Discard newcl and exit if new.cl is subsumed by any clause

in axioms or sos (forward subsumption).10. (optional) Apply unit deletion.11. Integrate new.cl and append it to sos.12. (optional) Output kept clause.13. If newcl has 0 literals, a refutation has been found.14. If newcl has I literal, then search axioms and sos for

unit conflict (refutation) with new.cl.15. (optional) Print the proof if a refutation has been found.16. (optional) Try to make newcl into a demodulator.

17. (optional) Back demodulate if Step 16 made new.cl into a demodulator.18. (optional) Discard each clause in axioms and each clause in sos that

is subsumed by new.cl (back subsumption).19. (optional) Factor new.cl and process factors.

Steps 17-19 are delayed until steps 1-16 have been applied to all clauses inferred fromthe current given clause.

3

3 Using OTTER

OTTER is not interactive. On UNIX and on UNIX-like systems it reads from the standard

input and writes to the standard output:

otter < input.file > output.file

3.1 Syntax

Comments can be placed in the input file by using the symbol 1. All characters from the

first X on a line to the end of the line are ignored. Comments can occur within terms.

Comments are not echoed to the output file.

3.1.1 Names

Names are alphanumeric strings that may contain some other characters such as $ andA name may contain up to 50 characters. Names are used as constant symbols, functionsymbols, predicate symbols, propositional variables, and regular variables. In general, thetype (predicate symbol, function symbol, constant, variable) of a name is determined by itscontext. Since the variables in clauses are not explicitly bound by universal quantifiers, aconvention must be used to distinguish constants from variables. The rule is that in clauses,variables start with (lower-case) u, v, v, x, y, or z. (The option prolog..style-variablessays that variables start with upper-case letters. See Section 4.1.7.) In formulas, any namecan be used as a variable, because variables are explicitly quantified.

A name usually cannot be used for two different purposes. For example, an input errorwill be flagged if a symbol has different occurrences with different numbers of arguments.(This protective feature can be overridden by the command clear(check.arity). (SeeSection 4.1.7.)

Some names are special. Any binary predicate symbol that starts with EQ, Eq, or eq isunderstood by demodulation and paramodulation as an equality predicate. The symbol-=can be used to write infix equality atoms. All symbols that start with $ are reserved forspecial purposes. Any predicate symbol that starts with $ANS, $Ans, or $ans is understoodas an answer predicate (answer literal, Section 8). Other symbols that start with $ areevaluable functions or predicates (Section 6).

3.1.2 Terms and Atoms

Determining whether a simple term is a constant or a variable depends on the context ofthe term. If it occurs in a clause, then the name determines the type (see above). If itoccurs in a formula, it is a variable if it is bound by a quantifier. Most complex terms arewritten in prefix form, for example, f (a,b,c).

Prolog-style list notation can be used to write terms that represent lists: the symbol Cis an abbreviation for $nil, [t 1It2] an abbreviation for $cons (t1 ,t2 ), and 1t4,t2 ,t3,t4 ] an

4

abbreviation for $cons (ti , $cons(t2 , $cons (t, $cons(t4 , $nil)))). The notation [t1 ,t2 It 3 ]is not allowed-such an expression must be written [t1 I [t2 It3 ]].

White space (spaces, tabs, newlines) can occur in complex terms anywhere except withinnames and between a function or predicate symbol and the opening parenthesis.

Atoms are similar to complex terms, except that a name is also an atom (a propositionalvariable), and equalities and negated equalities can be written in infix form as (ti = t2 )and (t1 ! z=t 2 ). White space is required around = and !., and parentheses are required.

3.1.3 Literals and Clauses

If a is an atom, then a and -a are literals. There should be no white space between thenegation sign and the atom. A clause is a sequence of literals separated with I. Whiteapace is optional before and after literals. A clause is always terminated with a period (butthe period is not considered to be part of the clause).

3.1.4 Formulas

1. Atoms are formulas.

2. If F and G are formulas, then (F <-> G) and (F -> G) are formulas.

3. If F1 ,. .. , F, are formulas, then (F1 I --.- I F) and (F1 t --.- & F,) are formulas.

4. The symbols all and exists are quantifiers. If Qi ... Q, are quantifiers, x 1 . ..2n arenames, and F is a formula, then (Q1 xi -"-. Q xz F) is a formula.

5. If F is a nonnegated formula, then -F is a formula.

The symbols have their expected meanings: - means "not", <-> means "if and only if", ->means "implies", I means "or", and & means "and".

All parentheses are required, and white space is required around <->, ->, I, and &, andafter quantifiers and their associated variable occurrences.

Note that the following are not formulas: -- p(a) (double negation), (p & q -> r) (notenough parentheses), (all x p(x) & q(x)) (not enough parentheses), (p&q) (not enoughwhite space).

Clauses are different from formulas, both in syntax and in treatment by OTTER. Thestring "p I q I r" is a clause, and "(p I q I r)" is a formula. Formulas are translatedinto clauses (negation normal form, Skolemization, then CNF) when input.

3.2 Commands and the Input File

Input to OTTER consists of a small set of commands, some of which indicate that a list ofobjects (clauses, formulas, or weight templates) follows the command. All lists of objectsare terminated with end-of.list. The commands are given in Table 1.

5

set (flag.name) .

clear(flag-name).ass ign(parameter-name, integer).list (axioms)list (sos).list (demodulators).formula-list(axioms).formula.list(sos).

weight..list (weight-listLname) .

skolem(symboLlist).lex(symboLlist).1rpo..lr.status(symboLlist).

lrpo..rl..status(symboLlist).

.

..

.

y.

Si

'/.

'!

x

x

x

set a f lagclear a flagassign an integer to a parameter

read axioms in clause formread set of support in clause form

read demodulators in clause formread axioms in formula formread set of support in formula formread weight templatesidentify skolem functionsassign an ordering on symbols

specify RPO statusspecify RPO status

Table 1: Commands

There are a few constraints on the order of commands.

" Options that affect input and output, such as check-arity, prolog..style-variables,

simply.fol, and bird-print, should occur before any clause lists or weight lists.

" If the command set (lex..rpo) is present, it should occur before any demodulators.

3.2.1 Input of Options

OTTER recognizes two kinds of options: flags and parameters. Flags are Boolean-valuedoptions; they are changed with the set and the clear commands, which take the nameof the flag as the argument. Parameters are integer-valued options; they are changed withthe assign command, which takes the name of the parameter as the first argument and aninteger as the second. Examples are

set(binary.res).clear(back.sub).assign(max.seconds, 300).

% switch on binary resolution% do not use back subsumption

X stop after about 300 CPU seconds

The options are described and their default values are given in Section 4.

3.2.2 Input of Lists of Clauses

A list of clauses is specified with one of the following and is terminated with end..of..list.Each clause is terminated with a period.

list (axioms).list(sos).list (demodulators).

Example:

6

list (axioms)(x = x).(f(e,x) = x).

(f(g(x),x) = ).(f(f(x,y),z) = f(x,f(y,z))) .(f (z,x) ! f(z,y)) I (x = y).(f(x,z) != f(y,z)) I (x = y).

end.of.list.

% reflexivity% left identity% left inverse

% associativity

% left cancellation

% right cancellation

3.2.3 Input of Lists of Formulas

A list of formulas is specified with one of the following and is terminated with end-of-list.

Each formula is terminated with a period.

formula.list (axioms).formula.list(sos).

Example (equivalent to above):

formula.list(axioms).(all a (a = a)).(all a (f(e,a) = a)).(all a (f(g(a),a) = e)).(all a all b all c (f(f(ab),c) = f(a,f(b,c)))).(all a all b all c ((f(c,a) = f(cb)) -> (a = b))).(all a all b all c ((f (a, c) = f (b, c)) -> (a * b))).

end-of.list .

% reflexivity% left identity% left inverse% associativity% left cancellation% right cancellation

3.2.4 Input of Lists of Weight Templates

A list of weight templates is specified with one of the following and is terminated withend-of-list. Each weight template is terminated with a period.

weight.list(pick.given).weight.list(purge..gen).weight .list (pick.and-purge).weight..list (terms).

Example:

weight.list (pick.and.purge).weight(a, 0).

weight(g(2), -50).weight(P(1,1), 100).weight(x, 5).weight(f(g(-3), 4), -300).

end.of.list .

xxxx

V.xI

V..1K

for picking given clausesfor discarding generated clausesfor both picking and purgingfor ordering terms

weight of constant a is 0

twice weight of argument - 50

sum of weights of arguments + 100all variables have weight 5see Section 7

7

See Section 7 for the syntax and use of weight templates.

3.2.5 The Commands skolem, lex, lrpo..lrstatus, and lrpo..rl-status

Each of the commands skolem, lex, lrpo.lr.status, and lrpo.xLstatus takes a list ofterms as an argument. The lex command specifies an ordering on symbols, and the othersgive properties to symbols. An example is

le( [a, b, f(x,x), d, g(x), c] ).

The arguments of f and g serve as place-holders only; they identify f and g as function or

predicate symbols and specify the arity.

skolem([...)). The skolem command identifies constant and function symbols as Skolemsymbols. (If the user inputs quantified formulas and OTTER Skolemizes, this commandis not necessary.) The Skolem property is used by the options para..skip..skolem(Section 4.1.3) and deleteidentical..nested..skolem (Section 4.1.4).

lox ([...)). The lex command specifies an ordering on function and constant symbols.Lexical ordering on terms is used in three contexts: orienting equality literals (Sections5.1.3 and 5.2.3), deciding whether to appi, a lex-dependent demodulator (Sections5.1.2 and 5.2.2), and evaluating functions/predicates that perform lexical comparisons(Section 6).

lrpo.lr-st atus([...]) and lrpo..rl-status(E...]). These commands specify status withrespect to the recursive path ordering. See Section 5.2.

4 Options

Flags are Boolean-valued options, and parameters are integer-valued options. When theuser changes an option, OTTER sometimes automatically changes other options-the useris informed when such a change occurs.

4.1 Flags

4.1.1 Main Loop Flags

A given clause is taken from sos at the beginning of each iteration of the main loop. Thedefault is to take the lightest clause with respect to either weight.list (pick..given) orweight-list (pick.and..purge). If neither weight list is present, the weight of a clause isits number of symbols. Both lists cannot be present. See Section 7.

input..sos.irst - default clear. If this flag is set, the input clauses in sos are chosen (inorder) as the first given clauses; then the lightest clauses are chosen.

sos.queue - default clear. If this flag is set, the first clause in sos becomes the givenclause (the set of support list operates as a queue).

8

sos.stack - default clear. If this flag is set, the last clause in sos becomes the given clause

(the set of support list operates as a stack).

print-.given - default set. If this flag is set, clauses are output when they become given

clauses.

4.1.2 Inference Rules

binary-res -- default clear. If this flag is set, use the inference rule binary resolution

(along with any other inference rules that are set) to generate new clauses.

hyper-res - default clear. if this flag is set, use the inference rule (positive) hyperresolution(along with any other inference rules that are set) to generate new clauses.

ursres - default clear. If this flag is set, use the inference rule UR-resolution (unit-resultingresolution) (along with any other inference rules that are set) to generate new clauses.

para..into - default clear. If this flag is set, use the inference rule "paramodulation intothe given clause" (along with any other inference rules that are set) to generate new clauses.

para.Irom - default clear. If this flag is set, use the inference rule "paramodulation fromthe given clause" (along with any other inference rules that are set) to generate new clauses.

demod-inf - default clear. If this flag is set, apply demodulation, as if it were an inferencerule, to the given clause. This is useful for debugging sets of demodulators. When this flagis set, the given clause is copied, then processed just like any newly generated clause.

4.1.3 Paramodulation Flags

para.f rom..left - default set. If this flag is set, allow paramodulation from the left sidesof equality literals. (Applies to both para.into and para.f rom inference rules.)

parasfromright - default set. J' this flag is set, allow paramodulation from the rightsides of equality literals. (Applies to both para.into and parafrom inference rules.)

para.into-left - default set. If this flag is set, then allow paramodulation into leftarguments of positive and negative equalities. (Applies to both para.into and para..frominference rules.)

para-intosright - default set. If this flag is set, allow paramodulation into right argu-ments of positive and negative equalities. This flag is one of the options to be cleared whensearching for a complete set of reductions. (Applies to both para~into and para-frominference rules.)

para.from.vars - default clear. If this flag is set, allow paramodulation from variables.(Applies to both para-into and para.from inference rules.)

paraintovars - default clear. If this flag is set, allow paramodulation into variables.(Applies to both parainto and para.f rom inference rules.)

parafromiunits.only - default clear. If this flag is set, paramodulate only if the fromclause is a unit (equality). (Applies to both parainto and para.flrom inference rules.)

9

para.into..units.only - default clear. If this flag is set, paramodulate only if the into

clause is a unit. (Applies to both parainto and para.rom inference rules.)

para..skip..skolem - default clear. If this flag is set, do not paramodulate into subterms

of Skolem expressions [10]. (Applies to both parainto and para.from inference rules.)

para..onessrule - default clear. If this flag is set, pararnodulation obeys the 1's rule. (The

l's rule is a special-purpose strategy for problems in combinatory logic-its usefulness has

not been demonstrated elsewhere.) (Applies to both para..into and para.from inference

rules.)

para-all - default clear. If this flag is set, then replace all occurrences of the into term

with the replacement term. (Applies to both para..into and para..rom inference rules.)

4.1.4 Flags for Handling Generated Clauses

(Section 4.1.5 gives additional, equality-related flags for handling generated clauses.)

very-verbose - default clear. If this flag is set, much information about the processing of

generated clauses is output.

order-eq - default clear. If this flag is set, flip equalities if the right side is heavier thanthe left. See Sections 5.1.3 and 5.2.3 for the meaning of "heavier".

sort-literals - default clear. If this flag is set, literals of newly generated clauses are

sorted: negative literals, then positive literals, then answer literals. The main purpose of

this flag is to make clauses more readable. In some cases, this flag can speed up subsumption

on non-unit clauses.

delete.identical-nestedskolem - default clear. If this flag is set, delete a clause ifa Skolem expression (properly) contains an occurrence of its leading Skolem symbol. Forexample, if f is a Skolem function, delete a clause if it has a term f (f (x)) or a term

f(g(f(x))).

for.sub - default set. If this flag is set, apply forward subsumption during the processingof newly generated clauses. (Delete the new clause if it is subsumed by any clause in axiomsor sos.)

unit-deletion default clear. If this flag is set, apply unit deletion to newly generatedclauses. Unit deletion removes a literal from a newly generated clause if the literal is the

negation of an instance of a unit clause that occurs in axioms or sos. For example, the

second literal of p(a,x) I q(a,x) is removed by the unit -q(u,v); but it is not removedby the unit -q(u,b), because that unification causes the instantiation of x. All such literals

are removed from the newly generated clause, even if the result is the empty clause. (Unit

deletion is not useful if units only are being generated.)

print..kept - default set. If this flag is set, output new clauses if they pass all retentiontests.

print..proofs - default set. If this flag is set, print all proofs that are found to the output

file. If this flag is clear, no proofs are printed to the output file.

back-sub - default set. If this flag is set, apply back subsumption during the processing

10

of newly kept clauses. (Delete all clauses in axioms or sos that are subsumed by the newly

kept clause.)

printbacksub - default set. If this flag is set, output clauses when they are back sub-

sumed.

f actor - default clear. If this flag is set, factor newly kept clauses. Note that unlike otherinference rules, factoring is not applied to the given clause-it is applied to a new clause assoon as it is kept. All factors are generated in an iterative manner. Factoring is attemptedon answer literals. If factoring is enabled, a clause with n literals will never subsume aclause with fewer than n literals.

4.1.5 Demodulation and Equality Flags

demod-istory - default set. If this flag is set, then when a clause is demodulated, includethe numbers of the demodulators in the derivation history of the clause.

demod-.linear - default clear. If this flag is set, disable demodulation indexing and usea linear search of demodulators when rewriting a term. With indexing disabled, if morethan one demodulator can be applied to rewrite a term, then the one that occurs first inthe input file is applied; this flag is useful when demodulation is used to do "procedural"things. With indexing enabled (the default), demodulation is much faster, but the order inwhich demodulators is applied is not under the control of the user.

demod..out.in - default clear. If this flag is set, demodulate terms outside-in, left-to-right. In other words, the program attempts to rewrite a term before rewriting (left-to-right) its subterms. The algorithm is "repeat {rewrite the left-most outer-most rewritableterm} until no more rewriting can be done or the limit is reached". (The effect is likea standard reduction in lambda-calculus or in combinatory logic.) If this flag is clear,terms are demodulated inside-out (all subterms are fully demodulated before attemptingto rewrite a term). The one exception when inside-out demodulation is in effect is theevaluable conditional term $IF(condition, then-value, else-value) (Section 6).

dynamic demod - default clear. If this flag is set, attempt to make some newly keptequalities into demodulators (Sections 5.1.4 and 5.2.4). Setting this flag automatically setsthe flag order-eq.

dynamic-demod.all - default clear. If this flag is set, attempt to make all newly keptequalities into demodulators (Section 5.1.4). Setting this flag automatically sets the flagdynamic4.dmod.

print.new.demod - default set. If this flag is set, print demodulators that are adjoinedduring the search (dynamic.demod).

back-demod - default clear. If this flag is set, back demodulate demodulators, axioms,and sos whenever a new demodulator is added. Back demodulation is delayed until theinference rules are finished generating clauses from the current given clause (delayed untilpost.process). Setting the back.demod flag automatically sets the flags order-eq anddynamic-demod. ( Warning: the order in which clauses are back demodulated is in effectnondeterministic-it may change from run to run.)

11

print.back-demod - default set. If this flag is set, print clauses before they are back

demodulated.

symbol-elim - default set. If this flag is set, then orient new demodulators, if possible, sothat function symbols (excluding constants) are eliminated. A demodulator can eliminate alloccurrences of a function symbol if the arguments on the left side are all different variables,and the function symbol of the left side does not occur in the right side. For example,the demodulators g(x) = f (x,x) and h(x,y) = f (x,f (y,f (g(x),g(y)))) eliminate alloccurrences of g and h, respectively.

knuth..bendix - default clear. If this flag is set, then OTTER will approximate a ver-sion of the Knuth-Bendix completion procedure. Setting the knuthbendix flag automat-ically causes the following flags to be altered, if necessary, as follows: set (parafrom),set(parainto), set(paralrom.left), clear (para.fromsright), set(parainto.left),clear(para.intoright), set (dynamicdemod..all), set (back.demod). See Section 5.3for an example. The user may wish to also set (lexrpo) (see next flag).

lex..po - default clear. If this flag is set, then use the lexicographic recursive path ordering(also called RPO with status) to compare terms. If this flag is clear, weight templates andlexicographic order are used. See Section 5.2.

dynamic-demod..lex-dep - default clear. If this flag is set, dynamic demodulators may belex-dependent or LRPO-dependent. See Sections 5.1.4 and 5.2.4.

lex..order..vars - default clear. This flag affects lex-dependent demodulation and theevaluable functions and predicates that perform lexical comparisons. If this flag is set, thenlexical ordering is a total order on terms; variables are lowest in the term order, with x -< y-< z -< u -< v -< v -< v6 -< v7 -< v8 -< .. If this flag is clear, then a variable is comparableonly to another occurrence of the same variable; it is not comparable to other variables or tononvariables. For example, $LLT(f (x) ,f(y)) evaluates to $T if and only if lexorder-varsis set. If lex-rpo is set, lex-order.vars has no effect on demodulation. See Section 5.1.1for more detail.

4.1.6 Indexing Flags

for..sib..fpa - default clear. If this flag is set, use FPA indexing for forward subsumption.If this flag is clear, use discrimination tree indexing for forward subsumption. This flag canbe set to decrease the amount of memory required by OTTER. Discrimination tree indexingcan require a lot of memory, but it is much faster than FPA indexing.

no.fapl - default clear. If this flag is set, do not index positive literals for unit conflictor back subsumption. This should be used only when no negative units will be generated(as with hyperresolution), back subsumption is disabled, and discrimination tree indexingis being used for forward subsumption. This option can save a little time and memory.

no.Ianl - default clear. If this flag is set, do not index negative literals for unit conflictor back subsumption. This should be used only when no positive units will be generated,back subsumption is disabled, and discrimination tree indexing is being used for forwardsubsumption. This option can save a little time and memory.

12

4.1.7 Input and Output Flags

check..arity - default set. If this flag is set, symbols must not have variable arities

(different numbers of arguments in different places in the input). For example, the term

p(a, a(b)) would not be allowed. (Constants have arity 0.) If this flag is clear, then variable

arities are permitted; in the preceding term, the two occurrences of a would be treated as

different symbols.

prolog-style..variables - default clear. If this flag is set, a name with no arguments in

a clause is a variable if and only if it starts with A through Z (upper case) or _.

process..input - default clear. If this flag is set, input axioms and sos clauses (includingclauses from formula input) are processed as if they had been generated by an inference

rule. The processing includes subsumption, demodulation, and back demodulation. (SeeSection 2, "procedure for processing newly inferred clause".)

simplify.fol - default clear. If this flag is set, then attempt some simplification whenconverting input first-order formulas into clauses. The simplification occurs after Skolemiza-

tion, during the CNF translation. (Future releases may attempt simplification of quantifiedformulas.)

bird..print - default clear. If this flag is set, output terms constructed with the bi-

nary function a in combinatory logic notation (without the function symbol a and leftassociated unless otherwise indicated). For example, the clause (a(aa((S,x) ,y) ,z) =

a(a(x,z),a(y,z))) is output as (S x y z = x z (y z)). At present, terms cannot be

input in combinatory logic notation.

4.1.8 Miscellaneous Flags

free-all..mem - default clear. If this flag is set, then at the end of the run, return allmemory to the memory managers. (This is used to ensure that no memory is being lost.)When this flag is set, the numbers in the "in use" column of the memory statistics should

all be close to 0. This flag is used primarily for system debugging.

atomwt..max.args - default clear. If this flag is set, the default weight of an atom (theweight if no template matches the atom) is 1 + the maximum of the weights of the argu-

ments. If this flag is clear, the default weight of an atom is 1 + the sum of the weights ofthe arguments.

termvt..nax-args - default clear. If this flag is set, the default weight of a term (the weightif no template matches the atom) is 1 + the maximum of the weights of the arguments. Ifthis flag is clear, the default weight of a term is 1 + the sum of the weights of the arguments.

print.lists-at..end - default clear. If this flag is set, then axioms, sos, and demodulatorsare printed at the end of the search.

really..delete..clauses - default clear. If this flag is clear, clauses that are deleted byback subsumption or back demodulation are not really removed from memory; they are

retained in a special place so that they can be printed if they occur in a proof. If thejob involves much back subsuniption or back demodulation and if memory conservation isimportant, these deleted clauses can be removed from memory by setting this flag (and any

13

proof containing such a clause will not be printed in full).

4.2 Parameters

Parameters are integer-valued options. In the descriptions that follow, n is the value of the

parameter, and MAXJNT s a large integer, usually the size of the largest normal integer on

the user's computer.

4.2.1 Monitoring Progress

report - default 0, range [0..MAXJNT]. If n is not 0, then output statistics approximatelyevery n ciIJ seconds. The time is not exact, because statistics will be output only after thecurrent given clause is finished. n should not be too small; n = 30 is a good start. Thisfeature can be used in conjunction with U NIX programs such as grep and awk to convenientlymonitor OTTER jobs.

4.2.2 Placing Limits on the Search

max..seconds - default 0, range [0..MAXJNT]. If n is not 0, then terminate the search afterabout n CPU seconds. The time is not exact, because OTTER will wait until the currentgiven clause is finished before stopping.

max.gen - default 0, range [0..MAXJNT]. If n is not 0, then terminate the search afterabout n clauses have been generated. The number is not exact, because OTTER will waituntil it is finished with the current given clause before stopping.

max..kept - default 0, range [0..MAXJNT]. If n is not 0, then terminate the search afterabout n clauses have been kept. The number is not exact, because OTTER will wait until itis finished with the current given clause before stopping.

max.given - default 0, range [0..MAXJNT]. If n is not 0, then terminate the search aftern given clauses have been used.

max..mem - default 0, range [0..MAXJNT]. If n is not 0, then OTTER will terminate the

search before more than n Kbytes have been dynamically allocated (malloc).

4.2.3 Limits on the Size of Generated Clauses

max-literals - default 0, range [0..MAX.JNT]. If n is not 0, then new clauses are discardedif they contain more than n literals.

max-veight - default 0, range [0..MAXJNT). If n is not 0, then new clauses are discarded

if their weight is more than n. The weight list purge..gen or the weight list pick-and.purgeis used to weigh clauses (both lists cannot be present; see Section 7).

14

4.2.4 Indexing Parameters

fpa.literals - default 3, range [0..8]. n is the FPA indexing depth for literals. (FPA literal

indexing is used for resolution inference rules, back subsumption, and unit conflict. It is

also used for forward subsumption if the flag for..sub.fpa is set.) If n = 0, indexing is bypredicate symbol only; if n = 1, indexing looks at the predicate symbol and the symbols

that are arguments of the literal, and so on. Greater indexing depth requires more memory,but it can be faster. Changing this parameter should never change the clauses that are

generated or kept.

fpa..terms - default 3, range [0..8]. n is the FPA indexing depth for terms. (FPA term

indexing is used for paramodulation inference rules and back demodulation.) If a = 0,indexing is by function symbol only; if n = 1, indexing looks at the function symbol and thesymbols that are arguments of the literal, and so on. Greater indexing depth requires more

memory, but it can be faster. Changing this parameter should never change the clauses

that are generated or kept.

4.2.5 Miscellaneous Parameters

demodlimit - default 100, range [0..MAXJNT]. If n is not 0, then n is the maximumnumber of rewrites that will be applied when demodulating a clause. The count includes$ symbol evaluation. If n is 0, there is no limit. A warning message is printed if OTTERattempts to exceed the limit.

max.proofs - default 1, range [0..MAXJNT]. If n = 1, OTTER will stop if it finds a proof.If n > 1, then OTTER will not stop when it has found the first proof; instead, it will try tokeep searching until it has found n proofs. (Some of the "different" proofs may in fact be

identical.) (Because forward subsumption occurs before unit conflict, a clause representinga truly different proof may be discarded by forward subsumption before unit conflict detectsthe proof.) If n = 0, OTTER will find as many proofs as it can.

neg.weight - default 0, range [-MAXJNT..MAXJNT]. n is the additional weight (positiveor negative) that is given to negated literals. Weight templates cannot be used to dothis, because the negation sign cannot occur in weight templates. (Atoms, not literals, areweighed with weight templates, Section 7.)

stats-level - default 2, range [0..3]. This is the level of detail of statistics printed inreports and at the end of the search. If n = 0, no statistics are output; if n = 1, a fewimportant statistics are output; if n = 2, most relevant statistics are output; and if n = 3,most relevant statistics and subsumption counts are output. This parameter does not affectthe speed of OTTER, because all statistics are always kept.

5 Ordering and Dynamic Demodulation

This section contains a more complete explanation of the options lex.order..vars, order-eq,symbol elim, dynamic-demod, dynamic-demod.all, lex..rpo, and dynamic..demod..lex..dep,and it gives all the rules-built in and optional-for orienting equality literals and deter-

15

mining dynamic demodulators. In this section, a and /3 always refer to the left and right

arguments, respectively, of the equality literal under consideration.

Through the years, we have accumulated a collection of ad hoc ordering techniques,which are presented in Section 5.1. However, simpler and more predictable behavior can

occur if the Lexicographic Recursive Path Ordering (LRPO, flag lexsrpo) is used. Section

5.1 applies if lexsrpo is clear, and Section 5.2 applies if lexspo is set.

The lex command, which applies to both sections, can be used to assign an orderingon symbols. The command

lex( [a, b, c, d, or(x,x)] ).

specifies a - b -< c -< d -< or (or is a binary function symbol). Behavior is undefined if

relevant symbols are omitted from the lex command.

5.1 lexsrpo Is Clear

5.1.1 Lexical Order (lex-rpo is clear)

The flag lex..order..vars controls lexical ordering of terms containing variables.

lex.order-vars is set: Variables are the lowest in the symbol ordering, with x -< y -< z -< u-< v -< v < v6 - v7 - v8 .< -.. Since the order on symbols is total (any two symbols

are comparable), the lexical order on terms is total (any two terms are comparable).

Note that applying a substitution to a pair of terms may change their relative order.

lex.order-vars is clear (the default): A variable is comparable only to itself; it is not com-parable to different variables or to nonvariable terms. If a -< b, then f (a,xy) -<f (b,y,x), but f (x,a,y) and f (y,b,x) are not comparable. The order on terms is

partial. Note that if t1 -< t2 , and if a is any substitution, then to -< t2 o.

Lexical ordering on terms is used in three contexts: deciding whether to apply a lex-

dependent demodulator (Section 5.1.2), evaluating functions/predicates that perform lexi-cal comparisons (Section 6), and orienting equality literals (Section 5.1.3). When orienting

equality literals, partial lexical ordering is used, even if the flag lex.order.vars is set.

5.1.2 Lex-dependent Demodulation (lex.xpo is clear)

Two terms are identical-except-variables if they are identical after replacing all occurrences of

variables with x. An input demodulator is lex-dependent if and only if a and /3 are identical-

except-variables. A dynamic demodulator is lex-dependent only if a and /3 are identical-

except-variables. (See Section 5.1.4 for determining lex-dependent dynamic demodulators.)

A lex-dependent demodulator applies to a term only if its application produces a lexically

smaller term. When checking "lexically smaller", the flag lex.order.vars is consulted.

In the presence of the lex command and the (lex-dependent) demodulators

16

lex( [a, b, c, d, or(x,x)] ).

list(demodulators).

(or(x,y) * or(y,x)).

(or(x,or(y,z)) = or(yor(x,z))).end-of.list.

the term or(or(d,b),or(a,c)) will be demodulated to or(a,or(b,or(c,d))) (in several

steps).

5.1.3 Orienting Equalities (lex..po is clear)

Orienting equality literals (positive and negative) except positive unit equalities.The arguments a and 0 are weighed (Section 7) by using weight...list (terms). If wt(a) <

wt(/3), the literal is flipped. If wt(a) = wt(#), then a and p are compared in the partial

lexical order (Section 5.1.1); if a -< 3, the literal is flipped.

Orienting positive unit equalities. More care is taken in orienting positive unit equal-

ities, because they may become dynamic demodulators. The procedure is the following:

1. If the symbol.elim flag is set and if the equality is a symbol-eliminating type (Section

4.1.5), then orient the equality in the ap, apriate direction and exit.

2. If one argument is a proper subterm of the other argument, then orient the equalityso that the subterm is the right argument and exit.

3. Proceed as in the preceding paragraph "Orienting equality literals ...". If the lexi-cal comparison shows that the two arguments arc incomparable, then if vars(a) 2vars(/) and vars(a) ; vars(/), the literal is flipped.

5.1.4 Determining Dynamic Demodulators (lexrpo is clear)

A dynamic demodulator is a demodulator that is inferred rather than input. If either ofthe flags dynamic.demod or dynamic~demod-all is set, OTTER will attempt to make some

or all inferred positive equality units into demodulators.

If either of the flags dynamic~demod or dynamic.demod..all is set, then the flag order..eqis automatically set. (Dynamic demodulators are decided when equalities are oriented,

before forward subsumption. An equality actually becomes a dynamic demodulator after

forward subsumption.) The procedure assumes that equalities have already been oriented.

1. If the flag symbol.elim is set and if it applies, the equality becomes a demodulator.

2. If / is a proper subterm of a, the equality becomes a demodulator.

3. If a and /3 are comparable, in particular, if wt(a) > wt(/) or (wt(a) = wt(3) anda p ),

17

(a) if dynamic.demodall is set, the equality becomes a demodulator;

(b) if dynamicdemodall is clear and if wt(#3) < 1, the equality becomes a demod-

ulator.

4. If dynamic-demodlex..dep is set, if a and 0 are incomparable, if they are identical-

except-variables (Section 5.1.2), and if vars(a) 2 vars(3), then the equality becomes

a lex-dependent demodulator.

5.2 lexrpo Is Set

5.2.1 Lexical Order (lcx.rpo is set)

To use lexicographic recursive path ordering (LRPO, or RPO with status) [2, 3. 5] the user

must assign an ordering on constant and function symbols. This is accomplished with the

lex command. (OTTER uses a total ordering. Other implementations of LRPO use partialorderings or dynamically changing orderings.) If a lex command is not present, OTTERassigns an ordering, which is usually ineffective.

The total ordering on symbols extends to a total ordering on ground terms. The

lexical aspect of LRPO is that certain specified function symbols can have their argu-ments compared left-to-right or right-to-left. This is accomplished with the commandslrposr..status and lrpo.rl..status. For example, if associativity is a demodulator,f (f (x.y),z) = f (x,f (y,z)), and if expressions in f are to be right associated, one canuse the command

lrpo.lr.status ( [f (x,x)] ).

LRPO comparison is used when orienting equality literals, deciding whether an equalityshould be a demodulator or an LRPO-dependent demodulator, and deciding whether to

apply an LRPO-dependent demodulator. LRPO comparison is never used when evaluatingthe functions/predicates that perform lexical comparison ($LLT, $LGT, etc.).

5.2.2 Lex-dependent Demodulation (lox..rpo is set)

The : otion of lex-dependent demodulator is replaced with that of LRPO-dependent de-modulator. An input demodulator becomes LRPO-dependent if neither argument is LRPO-less-than the other. An LRPO-dependent demodulator is allowed to rewrite a term if itsapplication results in an LRPO-less-than term.

5.2.3 Orienting Equalities (lex.rpo is set)

All equality literals (positive and negative) are oriented in the same way. If alpha is LRPO-less-than beta, the literal is flipped.

18

5.2.4 Determining Dynamic Demodulators (lex.rpo is set)

If the flag dynamicdemod is set, OTTER attempts to make all equalities into demodulators(dynamic.demod.all is ignored). If beta is LRPO-less-than alpha, the derived equality be-comes a demodulator. (Alpha is not LRPO-less-than alpha, because orienting has alreadyoccurred.) If dynamic-demod.lex-dep is set, neither argument is LRPO-less-than the other,and every variable that occurs in beta also occurs in alpha, the derived equality becomesan LRPO-dependent demodulator.

5.3 Completion and Termination

LRPO enables an implementation of the Knuth-Bendix completion procedure [6]. Here isan input file that causes OTTER to search for and quickly find a complete set of reductionsfor free groups. (As with any application of the Knuth-Bendix completion procedure, thecritical issue is the choice of ordering scheme and/or the specific ordering on symbols.)

set (knuth.bendix).set (lex.rpo).

set (process.input).

set(print.lists.at.end).

lex([e,f(x,x).g(x)]).lrpo.lr.status([f (x,x)]) .

list(sos).

(x = x).

(i(ex) = x). '. left identity(f(g(x),x) * .). X left inverse(f(f(x,y),z) = f(x,f(y,z))). associativityend-of-list.

6 Evaluable Functions and Predicates ($SUM, $LT, ... )

OTTER, like AURA and ITP, recognizes some special function and predicate symbols as evalu-able symbols. Integer arithmetic, lexical comparison, Boolean evaluation, and conditionalexpressions can be employed when a user wishes to "program" some aspect of a theorem-proving task. (The speed of $ evaluation is not outstanding-it may be improved in futurereleases.)

Evaluation occurs during demodulation and during hyperresolution. If, for example,demodulation encounters a term $SUM(ii, i2 ), where i1 and i2 are integers, the term isrewritten to i3 , the sum of i1 and i2 , as if the demodulator ($SUM(ii, i2 ) = i3 ) werepresent. If, for example, hypeiresolution encounters the negative literal -$LT(ti, t2 ), thenti and t2 are demodulated; if the results are (respectively) integers ii and i2 , with i1 < i2 ,then the literal is removed as if the unit clause $LT(ti, t2 ) were present.

19

The symbols that evaluate to type Boolean can occur as either function symbols (de-

modulation) or predicate symbols (demodulation and hyperresolution). If they are used as

function symbols, the Boolean constants are $T (true) and $F (false).

int x int -+ int $SUM, $PROD, $DIFF, $DIV, $MODint x int -+ boo! $EQ, $NE, $LT, $LE, $GT, $GE

term x term -+ boo! $ID, $LNE, $LLT, $LLE, $LGT, $LGEboo! x boo! -+ boo! $AND, $ORboo! -+ boo! $TRUE, $NOT

-+ boo! $T, $F

term -+ boo! $ATOMIC, $NUMBER, $VAR-+ int $NEXT.CL-NUM

boo! X term X term -+ term $IF

Table 2: Evaluable Functions and Predicates

Table 2 contains all of the evaluable functions and predicates. Their behavior is the

following:

1. int x int -+ int. The term evaluates if both arguments demodulate to integers. $DIV

is integer division, and $MOD is remainder.

2. int x int -+ boo!. The term evaluates if both arguments demodulate to integers.

3. term x term -+ boo!. The term always evaluates. These operations are similar to thefive operations in int x int - boo!, except that the comparisons are lexical instead of

arithmetic. The lexical comparison is the same as in lex-dependent demodulation; in

particular, the flag lex..order-vars (Sections 4.1.5 and 5.1.1) has effect.

4. boo! x boo! - boo!. The term evaluates if both arguments demodulate to Booleans.(This is more restrictive than need be; for example, $AND($F ,bird) does not evaluate.)

5. boo! - boo!. The term evaluates if its argument demodulates to Boolean.

6. -+ boo!. If hyperresolution encounters a literal -$T or a literal $F, the literal is removed.If hyperresolution encounters a literal -$F or a literal $T, the entire hyperresolvent isdiscarded (because it is a tautology).

7. term - boo!. A term is $ATOMIC if it is a constant (including integer), a term is a$NUMBER if it is an integer, and a term is a $VAR if it is a (unbound) variable.

8. -+ int. The term $NEXT.CLJNUM (no arguments) evaluates to the next integer that willbe assigned as a clause identifier (this is useful for placing the ID of a clause withinthe clause).

9. boo! x term x term - term. The $IF function is the if-then-else operator. It is

described in the following paragraph.

When inside-out (the default) demodulation encounters a term $IF(condition, t1 , t2 ),demodulation deviates from its inside-out behavior. The term condition is demodulated

(evaluated); if the result is $T, the value of the $IF term is the result of demodulating t1 ;

20

if the result is $F, the value of the $IF term is the result of demodulating t2; if the resultis neither $T nor $F, demodulation returns to its normal behavior. Note that if conditionevaluates to a Boolean value, demodulation strays from its inside-out behavior, because justone of ti and t2 is demodulated. If the outside-in demodulation option has been set, thereis no need to treat $IF terms differently from the norm, because outside-in demodulationcauses the $IF term to be evaluated before either t1 or t 2 .

The evaluable functions and predicates enable the use of equalities with demodulationas a general-purpose equational programming language. Here are some example functions.

(gcd(x,y) = % greatest common divisor$IF($EQ(x,0).

y,$IF(*EQ(y,0),

X,

$IF($LT(xy),gcd(x,$DIFF(yx)),gcd(y,$DIFF(x,y)))))).

(meaber(Z, O) = $F). % some list functions (prolog.style.variables)(member (,[HIT]) = *IF(*ID(XH),

$T,meaber(X,T))).

(reverse(L) rev2(L, O)).(rev2(O, L) = L).(rev2([HIT],L) = rev2(T,[HILJ)).

A Boolean function defined with demodulators, such as member in the preceding set ofdefinitions, can be used as an antecedent (negated literal) in a hyperresolution nucleus inthe following way:

-L 1 I . .. I -$TRUE(member(element, list)) I -" - -I -L~ I M.

Evaluable functions and predicates are useful when using hyperresolution to performstate-space searches. An example is the Missionaries and Cannibals puzzle:

There are 3 missionaries, 3 cannibals, and a boat on the west bank of ariver. All wish to cross, but the boat holds at most 2 people. If the cannibalsever outnumber the missionaries on either bank of the river or in the boat, theoutnumbered missionaries will be eaten. Can they all safely cross the river? Ifso, how? (The boat cannot cross empty.)

[start of input file]

% State(I,Y,Z) means that I missionaries, Y cannibals,V. and the boat are on the Z side of the river.

set(hyper-res).

21

list(axioms).

-State(xmbs,xcbs,xbp)

-pick(xm)-pick(xc)-$LE(xm, xmbs)-$LE(xc, xcbs)-$GT($S'JN(xm, xc), 0)-$LE($SUM(xm, xc), 2)-$OR($GE(xm, xc),$EQ(xm,0))

It we have a provable state,

. missionaries to cross% cannibals to cross

. it number in boat > 0,% it number in boat <= 2,y it no feast in the boat,

. if no feast after the boat leaves current side,

I -$0R($GE($DIFF(xmbs, xm), $DIFF(xcbs, xc)),$EQ($DIFT'(xmbs, xm),O))

. if no feast when the boat arrives at the other ,ide,

I -$OR($GE($SUM($DIFF(3, xmbs), xi), $SUH($DIFF(3, xcbs), xc)),$EQ($SUX($DIFF(3, xmbs), xm),0))

% then a crossing can occur

I State($SUM($DIFF(3, xmbs), xa),$SUM($DIFF(3, xcbs), xc),Otherside(xbp)).

pick(0).pick(1).pick(2).

-State(3,3,East). % goal state

endot.list.

list(sos).State(3,3,est).end.ot.list.

% initial state

list (demodulators).(Otherside(West) = East).(Otherside(East) = West).end-ot -list.

(end of input file]

7 Weighting

OTTER maintains four lists of weight templates.

weight~list(pick.given).weight.list(purge.gen).

X Choose given clauses from the set of support.

V. Is used in conjunction with the max.weight

22

|

|

|

|

|

|

|

% parameter to discard undesirable generated% clauses.

veight.list(pick.and.purge). . Plays the roles of both pick.given and

% purge.gen (if present, neither pick.given

% nor purge.gen can be present).

veight.list(terms). % Used to orient equality literals and to. decide dynamic demodulators (lex.rpo clear).

See Section 3.2.4 for input of lists of weight templates.

7.1 Weighing Clauses and Literals

The weight of a clause is always the sum of the weights of its literals (excluding any answer

literals). The weight of a positive literal is the weight of its atom. The weight of a negativeliteral is the weight of its atom plus the value of the neg..veight parameter (Section 4.2.5).

7.2 Weighing Atoms and Terms

Atoms and terms are weighed top-down. To weigh a given term, the appropriate weightlist is searched (in the order input) for the first matching template. If a match is found,

then the subterms of the given term that match the integers in the template are weighed.The weight of the given term is the sum of the products of each integer and the weight of

its corresponding subterm, plus the second argument of the weight template. For example,the template

weight(f (g(2) ,-3) , -50)

matches the given term

f (g(h(a)) ,f (b,x)) .

The weight of the given term is (2 * (the weight of h(a))) + (-3 * (the weight of f (b,x))) +(-50). If a matching weight template is not found, then the weight of the given term is 1 -sum of the weights of the subterms. (See the flags atom..vt.max.args and term..tmax..args,Section 4.1.8 for overrides.) Note that this weighting scheme implies that if no weight

templates are present, the default weight of a term or atom is the number of variable,

constant, function, and predicate symbols (symbol count).

Variables in weight templates are generic. A variable in a weight template will matchany variable (and only variables) in the given term. As a consequence, it is never necessaryto use different variable names in a weight template. For example, weight(f (x, x) , -7)

matches the term f (uv), and veight(x,32) match-s all variables.

Warning: The two occurrences of symbol f in the term f(f.x) are treated by OTTERas different symbols because they have different arities. The weight template weight(f ,0) applies to the second occurrence but not to the first. (This warning applies only if thecommand clear(check..arity) has been issued.)

23

The default weight of an answer literal is 0, but templates can be used to assign weightsto answer literals. The parameter negweight never applies to answer literals.

Because of an inadequacy of the input parser, weight templates cannot contain in-

fix equality atoms-the template weight ((a = b), 0) will not be accepted. Instead,one must use the internal system name and write the atom in prefix form-the templateweight($eq..infix(a,b), 0) matches the atom (a = b).

If one wishes to have a weight template containing a Skolem function or constant that isgenerated by OTTER, one must first make a short trial run to find out how the formulas are

Skolemized, then return to the input file and insert the weight list after the formula lists.

8 Answer Literals

The main use of answer literals is to record, during a search for a refutation, instantiationsof variables in input clauses. For example, if the theorem under consideration states thatan object exists, then the denial of the theorem contains a variable, and an answer literalcontaining the variable can be appended to the denial. If a refutation is found, then theempty clause has an answer literal that contains the object whose existence has just been

proved.

Any literal whose predicate symbol starts with $ans, $Ans, or $ANS is an answer literal.Most routines-including the ones that count literals and decide whether a clause is positiveor negative-ignore any answer literals. The inference rules insert, into the children, theappropriate instances of any answer literals in the parents. If factoring is enabled, OTTERdoes attempt to factor answer literals.

9 Meta-experimental Features

This section describes some features that are new, not well tested, and/or not well docu-mented.

9.1 Linked UR-Resolution

OTTER 2.0 has an inference rule, linked..ursres, which is an application of the linkedinference principle [16] to UR-resolution. As this manual is written, there is not yet anydocumentation. The inference rule is still evolving and is highly experimental. For cur-rent information on the status of linked UR-resolution, send e-mail to [email protected],veroff~unmvax.cs.unm.edu, and [email protected].

9.2 Atom (Literal) Demodulation

Atoms, as well as terms, are demodulated.

" The negation symbol cannot occur in a demodulator.

24

" Because of an inadequacy of the input parser, demodulators to rewrite infix equal-ities must use the internal (prefix) system name $eqinf ix. For example, the de-

modulator ((f (x,y) = x) = (y = 0)) must be written ($eqinf ix(f (x,y),x)$eq.infix(y,0))

" Things work as expected if the right side of a literal demodulator is one of the Booleanconstants $T or $F.

9.3 Conditional Demodulation

A conditional demodulator has the form

$CONDITIONAL (condition, alpha, beta).

It is read "if condition then alpha = beta". A conditional demodulator is applied onlyif the instantiated condition evaluates to $T. In other respects, it behaves as a regulardemodulator. Examples are (member and gcd are defined in Section 6)

$CONDITIONAL($ATOMIC(x), conjunctive.normalform(x), x).$CONDITIONAL(member(gcd(4,x),y), f(xy), g(y)).$CONDITIONAL($GT($NEXT.CL.NUM,1000), e(x,x), junk).

9.4 Another Demodulation Trick

A new feature, activated by the set.special unary command, allows OTTER to avoid one ofthe problems caused by the lack of associative-commutative matching during demodulation.The feature is useful when an associative-commutative function and an inverse are present,as in rings. Without this feature, the following lex command and demodulators

lex([O,a,b,c,d,e,g(x),f(x,x)]).

list(demodulators).(f(x,y) * f(y,x)).(f(xf(y,z)) * f(y,f(x,z))).(f(x,g(x)) * 0).(f(x,f(g(x),y)) * f(O,y)).(f(0,x) = x).end.of.list.

will cause the expression

f(f(f(g(b),a),c),f(b,g(c)))

to be sorted into

f (a,f (b,f (c,f (g(b),g(c))))).

25

One would like b and g(b) to be next to each other so that they could be canceled by oneof the inverse demodulators. The new feature accomplishes just that. The new command

setspecial.unary([g(x)])

causes g to be ignored during term comparisons, and the expression would be demodulated

to a. The set-special-unary command has no effect if the flag lex-rpo is set. This is an

experimental feature. Its consequences have not been well analyzed.

9.5 Introducing New Functions

When searching for a complete set of reductions, one sometimes encounters an equality thatcannot be oriented, but that can be handled by introducing a new function symbol. See [4]for more detail.

new-functions - default clear. If this flag is set, then positive equality units of a par-ticular type cause the introduction of new function symbols and equalities. If an equality

has the property that each side has at least one variable that does not occur in the otherside, then a new function symbol and two new equalities are introduced. For example,

(d(x,d(y,d(d(xx),x))) = d(d(z,z),y)) causes the introduction of(d(xd(y,d(d(x,x),x))) a ki(y)) and (d(d(z,z),y) * ki(y)). The new function sym-bol is ki; its argument list is the intersection of the variable sets of the two sides. The lexicalvalue of the new symbol is low.

9.6 Ancestor Subsumption

OTTER does not necessarily prefer short or simple proofs-it simply reports the proofs thatit finds. A new option ancestor-subsume extends the concept of subsumption to includethe derivation history, so that if two clause occurrences are logically identical, the one withfewer ancestors is preferred. The motivation is to find short proofs.

ancestor-subsume - default clear. If this flag is set, the notion of subsumption (forwardand backward) is replaced with ancestor-subsumption. Clause C ancestor-subsumes clauseD if C properly subsumes D or if C and D are variants and size(ancestorset(C)) ;size(ancestorset(D)).

9.7 Reducing max-weight on the Fly

In many searches, the number of kept clauses grows much faster than the number of givenclauses. In other words, the list sos is very large, and most of those clauses never participate

in the search. To save memory, one can use the max-eight parameter to discard many ofthe clauses that will (probably) never become given clauses.

A few searches and proofs show a phenomenon we call the complexity hump. To get a

search started, one must use complex clauses; then one can continue the search using simpler

clauses. Analogously, the first few steps in a proof are complex, and the remaining steps

are simpler. If one needs to carefully conserve memory when a complexity hump is present,

26

one can use the parameter reduce _weight-limit to change the value of max-weight after

a specified number of given clauses.

reduce-weight.limit - default 0, range [0..MAXINT]. If n (the value) is not 0, thisparameter has effect. Two integers are packed in the parameter. The last two digits specifythe new limit, and the others specify the given clause at which to make the change. Forexample, n = 3975 says to reduce max-weignt to 75 after given clause number 39.

10 Limits, Abnormal Ends (ABENDS), and Fixes

OTTER has a several compile-time limits. If a limit is exceeded, a message containing thename of the limit will appear in the output file and/or at the terminal. To raise the limit,find the appropriate definition (:def ine) in a .h or . c file, increase the limit, and recompileOTTER. (One must have his or her own copy of the source code to do this.) Some of thelimits are

MAX.NAME - Maximum number of characters in a variable, constant, function, or predicatesymbol.

MAX.BUF - Maximum number of characters in an input string (clause, formula, command,weight template, etc.).

MAX-VARS - Maximum number of distinct variables in a clause.

MAX..FS.TERM.DEPTH - Maximum depth of terms in forward subsumption discriminationtree.

MAX..ALTERM..DEPTH - Maximum depth of left arguments of equalities in demodulationdiscrimination tree.

If OTTER is using too much memory, one can decrease (down to 0) the value of thefpa..literals parameter, set the for..sub.Ipa flag to switch forward subsumption indexingfrom discrimination tree to FPA indexing, and use weighting to discard (more) generatedclauses.

11 Summary of the Options and Their Defaults

clear(inputsosjfirst). clear(knuth.bendix).clear(sos.queue). clear(lex.rpo).

clear(sos.stack). clear(dynamic.demod.lex.dep).

set (print.given). clear(lex.order.vars).

clear(binary.res). clear(for.sub~fpa).clear(hyper.res). clear(no.fapl).clear(ur.res). clear(nojfanl).clear(para.into).

clear(parajfrom). set(check.arity).

clear(demod.inf). clear(prolog.style.variables).

27

set(para.from.left).set (para.from.right).set(para.into.left).set (para. into.right).clear(para.from.vars).clear(para.into.vars).clear(para.from.units.only).clear(para.into.units.only).

clear(para.skip.skolem).clear(para.ones.rule).clear(para.all).

clear(very.verbose).

clear (order.eq).clear(sort.literals).clear(delete.identicalnested.skolem)

set(for.sub).clear(unit.deletion).set(print.kept).

set(print..proof s).set(back. sub).set (print.back.sub).clear(f actor).

set(demod.history).clear(demod.linear).clear(demod.out.in).

clear(dollareval).

clear(dynamic.demod).clear(dynamicdemod.all).set (print.nev.demod).clear(back.demod).

set(print.back.demod).set(symbol..elim).

clear(process~input).

clear(simplifyfol).clear(bird-print).

clear(free.allmem).clear(atomwtmax.args).

clear(term.wt.max.args).

clear(print~listsatend).

clear(really.delete.clauses).

clear(prog.synthesis).

clear(ancestorsubsume).

clear(new..functions).clear(linkedurres).

clear(linkedurtrace).

assign(report,0).

assign(max.seconds,0).assign(max.gen,0).

assign (max.kept, 0).assign(max.given,0).assign(max.mem,0).

assign (maxJliterals, 0).assign(max.Weight ,0).assign(fpa~literals,3).assign(fpaterms,3).assign(demod~limit ,100).assign (max.proof s .1).assign(negweight ,0).

assign(stats.level,2).ass ign(reduce.weight.limit,0).

assign(max.ur.depth,5).assign(maxur.deductionsize,20).

Acknowledgements

Ross Overbeek, being the chief designer of OTTER's predecessors AURA and LMA/ITP, wasan important influence during the design of OTTER. I am also grateful to the users ofprevious versions of OTTER who reported bugs and made many other valuable comments.

28

References

[1] C.-L. Chang and R. C.-T. Lee. Symbolic Logic and Mechanical Theorem Proving.Academic Press, New York, 1973.

[2] N. Dershowitz. Termination of rewriting. Journal of Symbolic Computation, 3:69-116,1987.

[3] J.-P. Jouannaud, editor. Rewriting Techniques and Applications, Springer- Verlag Lec-

ture Notes in Computer Science, Vol. 202, New York, 1985. Springer-Verlag.

[4] D. Kapur and H. Zhang. Proving equivalence of different axiomatizations of free groups.Journal of Automated Reasoning, 4(3):331-352, 1988.

[5] D. Kapur and H. Zhang. RRL: Rewrite rule laboratory user's manual. Technical Report89-03, Department of Computer Science, University of Iowa, 1989.

[6] D. Knuth and P. Bendix. Simple word problems in universal algebras. In J. Leech,editor, Computational Problems in Abstract Algebras. Pergamon Press, 1970.

[7] Donald Loveland. Automated Theorem Proving: A Logical Basis. North-Holland,Amsterdam, 1978.

[8] E. Lusk and R. Overbeek, editors. Proceedings of the 9th International Conference onAutomated Deduction, Springer-Verlag Lecture Notes in Computer Science, Vol. 310,New York, 1988. Springer-Verlag.

[9] Ewing Lusk and Ross Overbeek. The automated reasoning system ITP. Tech. ReportANL-84/27, Argonne National Laboratory, Argonne, 1l., April 1984.

[10] William McCune. Skolem functions and equality in automated deduction. PreprintMCS-P136-0290, Argonne National Laboratory, Argonne, l., February 1990.

[11] J. Siekmann, editor. Proceedings of the 8th International Conference on AutomatedDeduction, Springer- Verlag Lecture Notes in Computer Science, Vol. 230, New York,1986. Springer-Verlag.

[12] Brian Smith. Reference manual for the environmental theorem prover: An incarnationof AURA. Tech. Report ANL-88-2, Argonne National Laboratory, Argonne, 1l., March

1988.

[13] L. Wos, F. Pereira, R. Boyer, J Moore, W. Bledsoe, L. Henschen, B. Buchanan,G. Wrightson, and C. Green. An overview of automated reasoning and related fields.Journal of Automated Reasoning, 1(1):5-48, 1985.

[14] Larry Wos. Automated Reasoning: 33 Basic Research Problems. Prentice-Hall, Engle-wood Cliffs, N.J., 1988.

[15] Larry Wos, Ross Overbeek, Ewing Lusk, and James Boyle. Automated Reasoning:Introduction and Applications. Prentice-Hall, Englewood Cliffs, N.J., 1984.

29

[16] Larry Wos, Robert Veroff, Brian Smith, and William McCune. The linked inferenceprinciple II: The user's view. In R. Shostak, editor, Proceedings of the 7th Conference

on Automated Deduction, Springer-Verlag Lecture Notes in Computer Science, Vol.

170, pages 316-332, New York, 1984. Springer-Verlag.

30