Automated discovery in math

description

Machine learning techniques (GP, ILP, etc.) have been successfully applied in science. How about mathematics? Can they be used to discover interesting relationships in mathematical “data”? This is an exploration of using GP for that purpose.

Transcript of Automated discovery in math

Page 1: Automated discovery in math

Automated discovery in math

• Machine learning techniques (GP, ILP, etc.) have been successfully applied in science

• How about mathematics? Can they be used to discover interesting relationships in mathematical “data”?

• This is an exploration of using GP for that purpose

• Specifically, using GP to automatically discover Euler’s identity (V – E + F = 2) from a fairly limited amount of data

Page 2: Automated discovery in math

Cubes

V = 8

E = 12

F = 6

V – E + F = 8 – 12 + 6 = 2

Page 3: Automated discovery in math

Tetrahedra

V = 4

E = 6

F = 4

V – E + F = 4 – 6 + 4 = 2

Page 4: Automated discovery in math

Octahedra

V = 6

E = 12

F = 8

V – E + F = 6 – 12 + 8 = 2

Page 5: Automated discovery in math

Data for Euler’s identity

#  Polyhedron          V   E   F
1  Cube                8   12  6
2  Triangular prism    6   9   5
3  Pentagonal prism    10  15  7
4  Square pyramid      5   8   5
5  Triangular pyramid  4   6   4
6  Pentagonal pyramid  6   10  6
7  Octahedron          6   12  8
8  Tower               9   16  9
9  Truncated cube      10  15  7
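As a quick sanity check, the nine rows can be transcribed into Python and the identity verified row by row. This is only an illustrative sketch; the names `POLYHEDRA` and `euler_characteristic` are introduced here and do not come from the slides.

```python
# Rows transcribed from the slide's data table: (name, V, E, F).
POLYHEDRA = [
    ("Cube", 8, 12, 6),
    ("Triangular prism", 6, 9, 5),
    ("Pentagonal prism", 10, 15, 7),
    ("Square pyramid", 5, 8, 5),
    ("Triangular pyramid", 4, 6, 4),
    ("Pentagonal pyramid", 6, 10, 6),
    ("Octahedron", 6, 12, 8),
    ("Tower", 9, 16, 9),
    ("Truncated cube", 10, 15, 7),
]


def euler_characteristic(v, e, f):
    """Left-hand side of Euler's identity, V - E + F."""
    return v - e + f


for name, v, e, f in POLYHEDRA:
    print(f"{name:20s} V - E + F = {euler_characteristic(v, e, f)}")
```

Every row evaluates to 2, which is the relationship the GP run is expected to rediscover.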

Page 6: Automated discovery in math

At a glance

• 50 generations

• Population: 4000 ASTs

• Individuals created by crossover each generation (G): 3600 (90% of population)

• Maximum AST depth: 13

• Ramped half-and-half initialization

• 3 non-terminals: +, -, *

• 12 terminals: V, E, F, 1, 2, …, 9

• Crossover, no mutation

Page 7: Automated discovery in math

Genetic algorithms (GA)

• Search a space of solution attempts (“individuals”)

• Use natural selection to guide the search

• Must have a fitness function that can evaluate any given individual

• Individuals procreate by exchanging (recombining) “genetic material”

Page 8: Automated discovery in math

Example: SAT solving

• Problem: Given a CNF formula P over n variables x_1, …, x_n, find a satisfying assignment

• Search space: all n-bit strings

• Fitness measure for a given individual b_1 … b_n: # of satisfied clauses in P

• Genetic operations: crossover and mutation
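The fitness measure on this slide can be sketched in a few lines of Python. The clause encoding (signed integers, as in the DIMACS convention) and the names `sat_fitness` and `formula` are assumptions made for illustration only.

```python
def sat_fitness(clauses, bits):
    """Count the clauses satisfied by the assignment `bits`.

    clauses: list of clauses; each clause is a list of nonzero ints, where
             literal i stands for x_i and -i for (not x_i)  [assumed encoding].
    bits:    list of 0/1 values; bits[i - 1] is the value of x_i.
    """
    def literal_is_true(literal):
        value = bits[abs(literal) - 1] == 1
        return value if literal > 0 else not value

    return sum(1 for clause in clauses if any(literal_is_true(l) for l in clause))


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(sat_fitness(formula, [1, 1, 0]))  # 3: all clauses satisfied
print(sat_fitness(formula, [0, 0, 0]))  # 2: the middle clause fails
```

An individual solves the problem exactly when its fitness equals the number of clauses.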

Page 9: Automated discovery in math

Crossover:

a_1 … a_{j-1} | a_j … a_n  +  b_1 … b_{j-1} | b_j … b_n

→  a_1 … a_{j-1} | b_j … b_n  and  b_1 … b_{j-1} | a_j … a_n

Mutation:

0 1 1 0 0 1 1 0 1 0 0 1 0 0 1  →  0 1 1 0 0 1 1 0 0 0 0 1 0 0 1
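Both operations are easy to state on Python lists of bits. The sketch below is illustrative; `one_point_crossover` and `flip_bit` are hypothetical names, and `cut` counts how many leading bits each child keeps from its first parent (so it corresponds to j – 1 in the slide's 1-based notation).

```python
def one_point_crossover(a, b, cut):
    """Swap the tails of two equal-length bit lists after the first `cut` bits."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def flip_bit(bits, index):
    """Return a copy of `bits` with the bit at `index` (0-based) flipped."""
    return bits[:index] + [1 - bits[index]] + bits[index + 1:]


child_a, child_b = one_point_crossover([1, 1, 1, 1], [0, 0, 0, 0], cut=1)
print(child_a, child_b)  # [1, 0, 0, 0] [0, 1, 1, 1]

# The mutation shown on the slide: the ninth bit changes from 1 to 0.
before = [0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1]
print(flip_bit(before, 8))
```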

Page 10: Automated discovery in math

Generic GA algorithm

1. Construct a random initial population
2. Set i := 1
3. If i > N then halt
4. Compute the fitness of each individual; if the fittest solves the problem, halt.
5. Create a new population:
   1. Pick P – G individuals and copy them
   2. Create G new individuals by repeated applications of genetic operations
6. Set i := i + 1 and go to step 3

Parameterized over: N, P, G
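The numbered steps map fairly directly onto a loop. The following is a minimal sketch, not the presenter's implementation: all callable parameters of `run_ga` are hypothetical plug-ins, returning the best individual when the generation limit is reached is an added assumption, and the bit-counting toy problem exists only to make the example runnable.

```python
import random


def run_ga(N, P, G, new_individual, fitness, is_solution, select, crossover, rng):
    """Generic GA loop parameterized over N, P and G (step numbers refer to the slide)."""
    population = [new_individual(rng) for _ in range(P)]            # step 1
    for _ in range(N):                                              # steps 2, 3 and 6
        best = max(population, key=fitness)                         # step 4
        if is_solution(best):
            return best
        copied = [select(population, rng) for _ in range(P - G)]    # step 5.1
        created = []
        while len(created) < G:                                     # step 5.2
            created.extend(crossover(select(population, rng),
                                     select(population, rng), rng))
        population = copied + created[:G]
    return max(population, key=fitness)  # assumption: report the best on halting


# Toy problem (not from the slides): maximise the number of 1-bits in 8 bits.
BITS = 8


def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(BITS)]


def pick(population, rng):
    # Fittest of three randomly drawn individuals.
    return max(rng.sample(population, 3), key=sum)


def swap_tails(a, b, rng):
    cut = rng.randrange(1, BITS)
    return [a[:cut] + b[cut:], b[:cut] + a[cut:]]


result = run_ga(N=30, P=20, G=18, new_individual=random_bits, fitness=sum,
                is_solution=lambda ind: sum(ind) == BITS,
                select=pick, crossover=swap_tails, rng=random.Random(0))
print(result, sum(result))
```

Passing the random generator explicitly keeps a run reproducible for a given seed.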

Page 11: Automated discovery in math

Selection

• How is an individual “picked” for reproduction or copying?

• Main idea: the probability that an individual is selected should be proportional to the individual’s fitness

• Many ways to ensure that. One method is tournament selection:
– Pick 0 < k <= P individuals randomly
– Select the fittest of the k

• When k = 1: No selection pressure

• When k = P: Too much selection pressure
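Tournament selection is short enough to write out in full. This sketch assumes the k contestants are drawn without replacement; the name `tournament_select` and its signature are illustrative choices.

```python
import random


def tournament_select(population, fitness, k, rng):
    """Draw k distinct individuals at random and return the fittest of them."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


rng = random.Random(0)
population = list(range(10))      # toy individuals whose fitness is their own value


def identity_fitness(x):
    return x


print(tournament_select(population, identity_fitness, 1, rng))   # k = 1: a uniformly random pick
print(tournament_select(population, identity_fitness, 10, rng))  # k = P: always the fittest, here 9
```

The two extremes mirror the slide: with k = 1 fitness plays no role, and with k = P the same individual wins every tournament.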

Page 12: Automated discovery in math

Genetic Programming (GP)

• An instance of the generic GA scheme

• Individuals are now programs, i.e., syntactic objects

• Search space is kept finite by bounding program size

• Programs are represented as ASTs (abstract syntax trees)

Page 13: Automated discovery in math

Programs as ASTs

Parsing turns the program text

if x > 0 then y := x * x else y := z + 1

into the AST (written in prefix form)

if( >(x, 0), :=(y, *(x, x)), :=(y, +(z, 1)) )

Page 14: Automated discovery in math

Program structure in GP

• Programs are usually simple Herbrand terms, i.e., functional expressions

• AST leaves are called terminals

• Internal nodes are non-terminals

• Non-terminals are function symbols (e.g. +)

• Terminals are constants and variables

• Terminals + non-terminals must be sufficient for expressing solutions

Page 15: Automated discovery in math

Viewing a functional AST as a “program”

AST (in prefix form): +( *(x, 2), y ), i.e., (x * 2) + y

The program has two “inputs”, x and y. Given specific values for these, it produces a unique result as output
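To make the "AST as program" view concrete, here is a small recursive evaluator. The nested-tuple representation `(operator, left, right)` and the names `OPS` and `evaluate` are assumptions chosen for brevity.

```python
import operator

# The three non-terminals used in this presentation.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a functional AST given variable bindings in `env`."""
    if isinstance(tree, tuple):          # internal node: (op, left, right)
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):            # variable terminal
        return env[tree]
    return tree                          # constant terminal


program = ("+", ("*", "x", 2), "y")      # the slide's tree: (x * 2) + y
print(evaluate(program, {"x": 3, "y": 4}))  # 10
```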

Page 16: Automated discovery in math

AST Crossover

Parents: +( *(T_1, T_2), T_3 ) and –( T_4, +(T_5, T_6) )

Crossover pt 1: the subtree *(T_1, T_2). Crossover pt 2: the subtree +(T_5, T_6).

Children: +( +(T_5, T_6), T_3 ) and –( T_4, *(T_1, T_2) )
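Subtree crossover can be sketched on nested tuples by addressing each crossover point with a path of child indices. The trees below are one reading of the diagram's node labels; in particular, the operand order under the second parent's "–" root is an assumption, as are all function names.

```python
def get_subtree(tree, path):
    """Follow a path of child indices (1 = left, 2 = right) down the tree."""
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def crossover_at(parent1, path1, parent2, path2):
    """Swap the subtrees rooted at the two crossover points."""
    sub1 = get_subtree(parent1, path1)
    sub2 = get_subtree(parent2, path2)
    return (replace_subtree(parent1, path1, sub2),
            replace_subtree(parent2, path2, sub1))


parent1 = ("+", ("*", "T1", "T2"), "T3")
parent2 = ("-", "T4", ("+", "T5", "T6"))
child1, child2 = crossover_at(parent1, (1,), parent2, (2,))
print(child1)  # ('+', ('+', 'T5', 'T6'), 'T3')
print(child2)  # ('-', 'T4', ('*', 'T1', 'T2'))
```

Because tuples are immutable, the parents are left untouched and can still be copied into the next generation.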

Page 17: Automated discovery in math

Initial population

• Built randomly

• Two methods for building a random AST:
– Full method: All branches are equally long
– Grow method: Different subtrees can have different sizes (but less than the maximum)

• More usual: ramped half-and-half initialization: half of the trees are built with one method, the other half with the other method
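The two construction methods and their combination might look as follows. This is a sketch under stated assumptions: the early-stop probability `p_leaf` in `grow` and the way depths are "ramped" from 1 up to the maximum are illustrative choices, not details given on the slide.

```python
import random

FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["V", "E", "F", 1, 2, 3, 4, 5, 6, 7, 8, 9]


def full(depth, rng):
    """Full method: every branch reaches exactly `depth`."""
    if depth == 0:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), full(depth - 1, rng), full(depth - 1, rng))


def grow(depth, rng, p_leaf=0.3):
    """Grow method: a branch may stop early, so subtrees differ in size."""
    if depth == 0 or rng.random() < p_leaf:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            grow(depth - 1, rng, p_leaf), grow(depth - 1, rng, p_leaf))


def tree_depth(tree):
    if isinstance(tree, tuple):
        return 1 + max(tree_depth(tree[1]), tree_depth(tree[2]))
    return 0


def ramped_half_and_half(size, max_depth, rng):
    """Alternate full and grow while cycling the depth limit over 1..max_depth."""
    population = []
    for i in range(size):
        depth = 1 + (i // 2) % max_depth       # assumed ramping scheme
        build = full if i % 2 == 0 else grow   # half full, half grow
        population.append(build(depth, rng))
    return population


rng = random.Random(42)
for tree in ramped_half_and_half(6, 3, rng):
    print(tree_depth(tree), tree)
```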

Page 18: Automated discovery in math

Problem formulation

• Can cast it as a standard symbolic regression problem

• View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth)

• Error function: difference between actual # of faces and the result produced by the program

• Optimization: minimize the error

• Quick convergence
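In the symbolic regression view, a candidate program is scored by how far its output is from the true face count. The sketch below uses plain Python functions as stand-ins for evolved programs and sums absolute differences over the data; the aggregation and the names `DATA` and `total_error` are assumptions.

```python
# (V, E, F) triples from the data slide.
DATA = [(8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
        (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7)]


def total_error(candidate, data):
    """Sum of |actual F - predicted F| over all triples (assumed aggregation)."""
    return sum(abs(f - candidate(v, e)) for v, e, f in data)


print(total_error(lambda v, e: e - v + 2, DATA))  # 0: F = E - V + 2 fits every row
print(total_error(lambda v, e: e - v, DATA))      # 18: off by 2 on each of the 9 rows
```

A zero-error candidate such as F = E – V + 2 is just Euler's identity solved for F.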

Page 19: Automated discovery in math

Another approach

• Search space of all identities

• Generated as follows:

I → T_1 = T_2
T → L | T_1 + T_2 | T_1 – T_2 | T_1 * T_2
L → V | E | F | 1 | 2 | … | 9

• Any other integer can be built from 1, …, 9 and the given non-terminals

• Identity is not a non-terminal; it can only appear at the root of an AST

Page 20: Automated discovery in math

Details

• Generate P identities randomly (using ramped half-and-half initialization)

• Crossover on two identities S_1 = S_2 and T_1 = T_2:
– Mate two random subterms S_i and T_j from each identity, producing two new subterms S_i’ and T_j’
– If either new term is deeper than the max depth, then use one of the original parents
– Replace S_i and T_j in the identities by S_i’ and T_j’

• No mutation
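One possible reading of this crossover is sketched below. An identity is stored as a pair `(lhs, rhs)` of term trees; one side of each identity is chosen at random, the two sides exchange random subtrees, and a child that exceeds the depth limit is replaced by its own parent. That fallback rule, the per-term depth limit, and all names are assumptions.

```python
import random


def depth(term):
    """Depth of a term; terminals have depth 0."""
    if isinstance(term, tuple):
        return 1 + max(depth(term[1]), depth(term[2]))
    return 0


def paths(term, prefix=()):
    """Yield the path (child indices) to every subterm, including the root."""
    yield prefix
    if isinstance(term, tuple):
        yield from paths(term[1], prefix + (1,))
        yield from paths(term[2], prefix + (2,))


def get(term, path):
    for index in path:
        term = term[index]
    return term


def put(term, path, new):
    if not path:
        return new
    i = path[0]
    return term[:i] + (put(term[i], path[1:], new),) + term[i + 1:]


def mate(s, t, max_depth, rng):
    """Swap random subtrees of s and t; keep the parent if a child is too deep."""
    ps = rng.choice(list(paths(s)))
    pt = rng.choice(list(paths(t)))
    s_new = put(s, ps, get(t, pt))
    t_new = put(t, pt, get(s, ps))
    if depth(s_new) > max_depth:
        s_new = s
    if depth(t_new) > max_depth:
        t_new = t
    return s_new, t_new


def crossover_identities(id1, id2, max_depth, rng):
    """Mate one randomly chosen side of each identity and put the results back."""
    i, j = rng.randrange(2), rng.randrange(2)
    s_new, t_new = mate(id1[i], id2[j], max_depth, rng)
    child1 = (s_new, id1[1]) if i == 0 else (id1[0], s_new)
    child2 = (t_new, id2[1]) if j == 0 else (id2[0], t_new)
    return child1, child2


rng = random.Random(0)
id1 = (("+", ("-", "V", "E"), "F"), 2)        # V - E + F = 2
id2 = (("*", "V", 3), ("+", "E", 1))          # V * 3 = E + 1
print(crossover_identities(id1, id2, 13, rng))
```

Since "=" never appears inside a term, it stays at the root of every child, as the grammar requires.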

Page 21: Automated discovery in math

Fitness

• An identity is evaluated on a given triple of values for V, E, and F

• Computing the fitness of an identity S = T:
– For each of the k data triples p: if S = T holds for p, then give the identity a point

• Higher score, greater fitness

• Maximum fitness: 9, minimum: 0

Page 22: Automated discovery in math

Problem

• Trivially true identities can get perfect scores, e.g.:
V = V
1 + 2 = 5 – 3
E – E + E = E

• Solution: negative triples, e.g.: V = 0, E = 0, F = 1

• Trivial identities will hold for such negative triples, but plausible identities will not

Page 23: Automated discovery in math

Fitness computation

• To evaluate an identity S = T:
– For each of the k data triples p:
  – Allocate a point if S = T holds for p
  – Allocate a second point if S = T does not hold for the negative triple

• Maximum score: 18, minimum: 0

• Also impose a penalty of ⌊n/20⌋ points for an identity of length n (to discourage excessively long expressions)
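The scoring rule can be sketched as follows. Several details are assumptions made for illustration: identities are `(lhs, rhs)` pairs of nested tuples, the second point is awarded independently of the first, and the length n is taken to be the number of AST nodes including the "=" root.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# (V, E, F) triples from the data slide, plus the slide's example negative triple.
DATA = [(8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
        (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7)]
NEGATIVE = (0, 0, 1)


def evaluate(term, triple):
    if isinstance(term, tuple):
        op, left, right = term
        return OPS[op](evaluate(left, triple), evaluate(right, triple))
    if isinstance(term, str):
        return triple["VEF".index(term)]   # "V" -> 0, "E" -> 1, "F" -> 2
    return term


def holds(identity, triple):
    lhs, rhs = identity
    return evaluate(lhs, triple) == evaluate(rhs, triple)


def size(term):
    if isinstance(term, tuple):
        return 1 + size(term[1]) + size(term[2])
    return 1


def fitness(identity, data=DATA, negative=NEGATIVE):
    fails_negative = not holds(identity, negative)
    score = 0
    for triple in data:
        if holds(identity, triple):
            score += 1                      # first point: true on the data triple
        if fails_negative:
            score += 1                      # second point: false on the negative triple
    length = size(identity[0]) + size(identity[1]) + 1   # assumed definition of n
    return score - length // 20             # penalty of floor(n / 20)


euler = (("+", ("-", "V", "E"), "F"), 2)    # V - E + F = 2
trivial = ("V", "V")                        # V = V
print(fitness(euler))    # 18
print(fitness(trivial))  # 9
```

With the negative triple in place, the trivially true identity V = V earns only half of the maximum score, while Euler's identity earns all 18 points.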