Page 1: Automatic Generation of First Order Theorems Simon Colton Universities of Edinburgh and York Funded by EPSRC grant GR/M98012 and the Calculemus Network.

Automatic Generation of First Order Theorems

Simon Colton, Universities of Edinburgh and York

Funded by EPSRC grant GR/M98012 and the Calculemus Network

Page 2:

Overview of Talk

- Automated Theory Formation
  - Principles
  - Implementation in the HR system
  - Applications
- Application to Theorem Generation
  - HR adds to the TPTP library
  - HR becomes a MathWeb service
- Future Directions

Page 3:

Scientific Theories

- Scientific theories about a domain contain:
  - Concepts, examples, definitions, hypotheses, explanations, etc.
- e.g. chemistry: acids
  - Concepts: Acid, Base, Salt
  - Hypothesis: Acid + Base → Salt + Water
  - Experiments for plausibility/evidence
  - Reaction pathways for explanation

Page 4:

Theories in Pure Mathematics

- Concepts have examples and definitions
- Hypotheses are “conjectures”
- Explanations are proofs
  - Conjectures become “theorems”
- e.g. pure maths: group theory
  - Concepts: cyclic groups, Abelian groups
  - Conjecture: cyclic groups are Abelian
  - Examples provide empirical evidence
  - Proof for explanation

Page 5:

HR: Theory Formation Cycle

- Start with background knowledge
  - user-supplied axioms + concepts

1. Invent a new concept (machine learning)
2. Look for conjectures empirically (data mining)
3. Prove the conjectures (theorem proving)
4. Disprove the conjectures (model generation)
5. Assess all concepts w.r.t. new concept

1. Invent a new concept (the cycle repeats)
   - Build it from the most interesting old concepts

Page 6:

Inventing New Concepts

- Ten General Production Rules (PR)
  - Work in all domains (math + non math)
  - Build new concept from one (or two) old ones
- Example: Abelian groups
  - Given: [G,a,b,c] : a*b=c
  - Compose PR: [G,a,b,c] : a*b=c & b*a=c
  - Exists PR: [G,a,b] : ∃c (a*b=c & b*a=c)
  - Forall PR: [G] : ∀a ∀b (∃c (a*b=c & b*a=c))
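The Abelian example can be traced on small finite groups. The following is a minimal sketch, not HR's implementation: it assumes a group is given as a multiplication table and a concept as a set of tuples, and the function names (`multiplication`, `compose_rule`, `exists_rule`, `forall_rule`) are made up for illustration.

```python
from itertools import permutations

# Multiplication tables: table[a][b] is the product a*b, elements are 0..n-1.
Z3 = [[(a + b) % 3 for b in range(3)] for a in range(3)]  # cyclic group of order 3

# S3, the symmetric group on three points, built from permutation composition.
PERMS = list(permutations(range(3)))
S3 = [[PERMS.index(tuple(p[q[i]] for i in range(3))) for q in PERMS] for p in PERMS]


def multiplication(table):
    """Given concept [G,a,b,c] : a*b=c, as a set of (a, b, c) triples."""
    n = len(table)
    return {(a, b, table[a][b]) for a in range(n) for b in range(n)}


def compose_rule(table, triples):
    """Compose PR (illustrative): keep the triples that also satisfy b*a=c."""
    return {(a, b, c) for (a, b, c) in triples if table[b][a] == c}


def exists_rule(triples):
    """Exists PR (illustrative): pairs (a, b) for which some c completes a triple."""
    return {(a, b) for (a, b, _c) in triples}


def forall_rule(table, pairs):
    """Forall PR (illustrative): true when every pair (a, b) of G is in the concept."""
    n = len(table)
    return all((a, b) in pairs for a in range(n) for b in range(n))


def is_abelian(table):
    """Chain the three rules in the order shown on the slide."""
    triples = compose_rule(table, multiplication(table))
    return forall_rule(table, exists_rule(triples))


print(is_abelian(Z3), is_abelian(S3))
```

Run as written, this prints `True False`: the cyclic group of order 3 satisfies the invented concept and S3 does not.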

Page 7:

Making Conjectures

- Theory formation step
  - Attempt to invent a new concept
- Concept has same examples as previous one
  - HR makes an equivalence conjecture
- Concept has no examples
  - HR makes a non-existence conjecture
- HR can also make implication conjectures
  - Examples of one concept are all examples of another concept
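The three kinds of conjecture come from comparing example sets. A minimal sketch of that comparison follows, assuming each concept is represented only by the set of its examples; `make_conjectures` and the tuple format it returns are illustrative choices, not part of HR.

```python
def make_conjectures(new_name, new_examples, known):
    """Compare a new concept's examples with those of known concepts.

    known maps concept names to sets of examples. Returns a list of
    (kind, left, right) tuples; this output format is an assumption.
    """
    new_examples = frozenset(new_examples)
    if not new_examples:
        # No examples at all: conjecture that the concept is empty.
        return [("non-existence", new_name, None)]
    for name, examples in known.items():
        if frozenset(examples) == new_examples:
            # Same examples as an existing concept.
            return [("equivalence", new_name, name)]
    conjectures = []
    for name, examples in known.items():
        examples = frozenset(examples)
        if new_examples < examples:
            conjectures.append(("implication", new_name, name))
        elif examples and examples < new_examples:
            conjectures.append(("implication", name, new_name))
    return conjectures


# The groups of order at most 6, up to isomorphism; S3 is the only non-Abelian one.
CYCLIC = {"Z1", "Z2", "Z3", "Z4", "Z5", "Z6"}
ABELIAN = CYCLIC | {"Z2xZ2"}
known = {"abelian": ABELIAN}

print(make_conjectures("cyclic", CYCLIC, known))
print(make_conjectures("commutative", ABELIAN, known))
print(make_conjectures("empty_concept", set(), known))
```

The first call yields the implication "cyclic groups are Abelian" used as the running example earlier in the talk. Such conjectures are only empirically supported until a prover confirms them.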

Page 8:

Proving Theorems

- HR relies on third party theorem provers
- Equivalence conjectures:
  - Sets of implication conjectures
  - From which prime implicates are extracted
  - E.g. ∀a (a*a=a ↔ a=id) gives a*a=a → a=id and a=id → a*a=a
- HR uses the Otter theorem prover
  - Written by William McCune
  - Only uses this for finite algebras
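One way to read "prime implicate" is an implication whose premise set cannot be made smaller without losing the conclusion. The sketch below searches for such minimal premise sets under that reading. It is an assumption-laden illustration: the check `holds` tests implications empirically on a list of examples, standing in for a call to a prover such as Otter, and all names are hypothetical.

```python
from itertools import combinations


def holds(premises, goal, examples):
    """Empirical stand-in for a prover: the goal holds wherever all premises hold."""
    return all(goal(x) for x in examples if all(p(x) for p in premises))


def prime_implicates(premises, goal, examples):
    """Return the minimal sets of premise names that imply the goal.

    premises maps names to predicates. Subsets are tried smallest first, and
    any superset of an already successful subset is skipped.
    """
    found = []
    names = sorted(premises)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if any(set(f) <= set(subset) for f in found):
                continue
            if holds([premises[n] for n in subset], goal, examples):
                found.append(subset)
    return found


# Elements of Z4 under addition mod 4; the identity is 0.
elements = range(4)
premises = {
    "idempotent": lambda a: (a + a) % 4 == a,    # a*a = a
    "self_inverse": lambda a: (a + a) % 4 == 0,  # a*a = id
}
is_identity = lambda a: a == 0

print(prime_implicates(premises, is_identity, elements))
```

Here only `idempotent` is needed to conclude `a = id`, so the search returns `[('idempotent',)]`, matching the implication a*a=a → a=id from the slide.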

Page 9:

Disproving Non-Theorems

- Any conjectures which Otter can’t prove
  - HR looks for a counterexample
  - Using the MACE model generator
  - Also written by William McCune
  - Other possibilities: CAS, CSP
- Counterexamples are added to the theory
  - Fewer similar non-theorems are made later
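The idea of disproving a conjecture by finding a small model can be shown with a brute-force search. This is only a sketch of the principle: it enumerates every multiplication table up to a given size, which is far cruder than what MACE does, and the false conjecture used ("every associative operation with an identity is commutative") is chosen for illustration.

```python
from itertools import product


def tables(n):
    """Yield every binary operation on {0..n-1} as an n-by-n table."""
    for cells in product(range(n), repeat=n * n):
        yield [list(cells[i * n:(i + 1) * n]) for i in range(n)]


def is_associative(t):
    r = range(len(t))
    return all(t[t[a][b]][c] == t[a][t[b][c]] for a in r for b in r for c in r)


def has_identity(t):
    r = range(len(t))
    return any(all(t[e][a] == a and t[a][e] == a for a in r) for e in r)


def is_commutative(t):
    r = range(len(t))
    return all(t[a][b] == t[b][a] for a in r for b in r)


def find_counterexample(axioms, conjecture, max_size):
    """Return the first table satisfying the axioms but not the conjecture, or None."""
    for n in range(1, max_size + 1):
        for t in tables(n):
            if all(ax(t) for ax in axioms) and not conjecture(t):
                return t
    return None


counterexample = find_counterexample([is_associative, has_identity], is_commutative, 3)
print(counterexample)
```

No counterexample exists with one or two elements, so the search only succeeds at size 3. A table found this way could then be stored as a new example so that later conjectures are tested against it.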

Page 10:

Assessing Interestingness

- New concepts from interesting old ones
- Concepts measured in terms of:
  - Intrinsic values, e.g. complexity of definition
  - Relational values, e.g. novelty of categorisation
- Concepts also assessed by conjectures
  - Quality, quantity of conjectures involving the concept
- Conjectures also assessed
  - Difficulty of proof (proof length from Otter)
  - Surprisingness (of lhs and rhs definitions)
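To make the idea of ranking concepts concrete, here is a small sketch that combines one intrinsic and one relational measure into a single score. The formulas and the equal weights are assumptions made for illustration only; the talk does not give HR's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    complexity: int        # assumed: number of production rule steps used
    categorisation: tuple  # assumed: label given to each example by the concept


def novelty(concept, all_concepts):
    """Assumed measure: fraction of concepts with a different categorisation."""
    same = sum(1 for c in all_concepts if c.categorisation == concept.categorisation)
    return 1 - same / len(all_concepts)


def score(concept, all_concepts, w_simple=0.5, w_novel=0.5):
    """Weighted sum favouring simple definitions and unusual categorisations."""
    return w_simple * (1 / concept.complexity) + w_novel * novelty(concept, all_concepts)


concepts = [
    Concept("A", 1, (0, 0, 1)),
    Concept("B", 2, (0, 0, 1)),
    Concept("C", 2, (0, 1, 1)),
]
ranked = sorted(concepts, key=lambda c: score(c, concepts), reverse=True)
print([c.name for c in ranked])
```

With these placeholder values the ranking is A, C, B: A wins on simplicity, and C beats B because no other concept shares its categorisation.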

Page 11:

Bootstrapping ATF Cycle

Page 12:

Applications of ATF

- Machine Learning
  - Learn concept definitions: e.g. seq. ext.
  - Theory for prediction tasks
  - Theory for puzzle generation
- Constraint Satisfaction Problems
  - Conjectures: induced constraints
  - Concepts: implied constraints
- Mathematical Discovery
  - Exploration of new domains
  - Invention of Integer Sequences (NWN)

Page 13:

Application to ATP

- Big project: using ATF to improve ATP
- Sub-project: using ATF to assess ATP programs
- Compare first order ATP programs
  - Using a large set of HR’s conjectures
- Facilitate comparison:
  - Using MathWeb (Zimmer, Franke, …)
  - Using SystemOnTPTP (Sutcliffe)

Page 14:

First Attempt

- Aim: add to the TPTP library
  - 5882 test problems for first order provers
  - Otter, SPASS, E, Vampire, etc.
  - New provers are tested using TPTP
- HR produced 46,000 group conjectures
  - In ten minutes
- Around 200 of these were worthy of TPTP
  - All provable by SPASS in 120 seconds
  - 153 provable only by SPASS and E
  - 42 provable only by SPASS

Page 15:

Example Theorem

- Otter and E could not prove this:

∀x ∀y ((∃z (inv(z)=x & z*y=x) & ∃u (x*u=y & ∃v (v*x=u & inv(v)=x))) ↔ (∃a (inv(a)=x & a*y=x) & ∃b (b*y=x & inv(b)=y)))

[about pairs of identity elements]

Page 16:

Interface of HR into MathWeb

- MathWeb project in Saarbrücken
  - Has access to many first order ATP progs.
  - E, Otter, SPASS, Vampire, Bliksem, …
- Idea: HR passes conjectures to MathWeb
  - MathWeb translates conjectures using tptp2x
  - MathWeb calls the provers
- Interface
  - Via sockets at the moment
  - Later by XMLRPC for better standardization

Page 17:

Additional Implementation

- By Zimmer, Colton and Franke
- Changes to HR
  - Improvements in quantity of theorems
  - Ability to write conjectures in TPTP format
- Changes to MathWeb
  - Calling one prover after another (1000s of times in a row)
  - Quicker interaction with tptp2x
  - Integration of the E system
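As an illustration of what writing a conjecture in TPTP format involves, the sketch below emits a group theory problem as text. It uses the present-day TPTP `fof(name, role, formula).` syntax, which may differ from the format HR wrote at the time of the talk. The axiom names, the symbols `multiply`, `inverse` and `identity`, and the helper functions are all assumptions.

```python
# Group axioms in TPTP first-order form. Variables are upper case in TPTP.
GROUP_AXIOMS = [
    ("associativity",
     "! [X,Y,Z] : multiply(multiply(X,Y),Z) = multiply(X,multiply(Y,Z))"),
    ("left_identity", "! [X] : multiply(identity,X) = X"),
    ("right_identity", "! [X] : multiply(X,identity) = X"),
    ("left_inverse", "! [X] : multiply(inverse(X),X) = identity"),
    ("right_inverse", "! [X] : multiply(X,inverse(X)) = identity"),
]


def fof(name, role, formula):
    """Format one annotated formula: fof(name,role,( formula ))."""
    return f"fof({name},{role},( {formula} ))."


def tptp_problem(conjecture_name, conjecture):
    """Return the axioms followed by a single conjecture, one formula per line."""
    lines = [fof(name, "axiom", formula) for name, formula in GROUP_AXIOMS]
    lines.append(fof(conjecture_name, "conjecture", conjecture))
    return "\n".join(lines) + "\n"


# The equivalence conjecture a*a=a <-> a=id, written with TPTP connectives.
text = tptp_problem(
    "idempotent_is_identity",
    "! [A] : ( multiply(A,A) = A <=> A = identity )",
)
print(text)
```

A file produced this way is plain text, so it can be handed to any prover or translation tool that accepts TPTP input.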

Page 18:

Experiments

- Possible experiments:
  - Which one proves most of HR’s theorems 1st
  - Compare the average times
  - How many timeouts for each prover
- Watch this space for results…
  - Saturday: 9000 group theory theorems proved by SPASS, E & Otter, before a crash!
- Preliminary (unsurprising) result
  - Average times: SPASS < E < Otter
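The three measurements listed under possible experiments can be computed from a table of per-theorem timings. The sketch below assumes a hypothetical record format, with `None` marking a timeout. The timings are placeholder numbers chosen for illustration and are not results from the talk.

```python
def summarise(results):
    """Summarise prover timings.

    results maps a theorem name to {prover: seconds or None}, where None
    means the prover timed out. Returns per-prover counts and mean times.
    """
    provers = sorted({p for times in results.values() for p in times})
    summary = {p: {"proved": 0, "first": 0, "timeouts": 0, "total": 0.0} for p in provers}
    for times in results.values():
        solved = {p: t for p, t in times.items() if t is not None}
        for prover, seconds in times.items():
            if seconds is None:
                summary[prover]["timeouts"] += 1
            else:
                summary[prover]["proved"] += 1
                summary[prover]["total"] += seconds
        if solved:
            # The prover with the smallest time proved this theorem first.
            summary[min(solved, key=solved.get)]["first"] += 1
    for stats in summary.values():
        stats["mean"] = stats["total"] / stats["proved"] if stats["proved"] else None
    return summary


# Placeholder data only; not measurements.
results = {
    "thm1": {"SPASS": 0.1, "E": 0.2, "Otter": 1.0},
    "thm2": {"SPASS": 0.2, "E": 0.4, "Otter": None},
    "thm3": {"SPASS": 0.3, "E": None, "Otter": None},
}
for prover, stats in summarise(results).items():
    print(prover, stats["proved"], stats["first"], stats["timeouts"], stats["mean"])
```

Mean times here are taken over proved theorems only, which is one possible convention. Counting timeouts at the time limit would be another and would change the averages.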

Page 19:

Future Work: MathWeb #1

- Try HR on more provers in MathWeb
  - Vampire, Bliksem
- Offer HR as a new MathWeb service
  - User says: “Give me 1,000 theorems which SPASS and E take over 10 secs. to prove”
- Interface HR and model generators in MW
  - Use MACE, etc. to disprove theorems
- Interface HR and CSP, CAS in MW
  - Infinite Group theory with Bundy and Sorge

Page 20:

Future Work: MathWeb #2

- Aim: Beat SPASS…
  - SPASS is too good for HR in group theory
  - 46,000 theorems and SPASS proved them all!
- Part two of my Calculemus project:
  - With Jacques Calmet & Clemens Ballarin in Karlsruhe
  - HR invents new domains
  - Adds and constrains new operators for finite algebras
  - “Grow” difficult theorems from prime implicates

Page 21:

Future Work: HR Project

- Colton: Express HR as a ML program
  - Try domains other than maths
- Walsh: Integrate HR
  - With every maths program ever written
- Bundy: Build an automated mathematician

Page 22:

Web Pages

- Mathweb: www.mathweb.org
- HR: www.dai.ed.ac.uk/~simonco/research/hr
- NumbersWithNames program: www.machine-creativity.com/programs/nwn
- Demonstration: Tomorrow @ 2pm? Room 208.