INDUCTIVE LOGIC PROGRAMMING Jiangbo Dang Vincent Ellerby Bingyu Zhu.

Machine Learning: What can you use it for?

Pattern recognition: faces, digits, speech

Bioinformatics: gene finding

Internet: spam filtering, search engines

Prediction: stock market

Machine Learning

Things learn when they change their behavior in a way that makes them perform better in the future

A system capable of the autonomous acquisition and integration of knowledge

Machine Learning

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Learn = E(T, P) if T(P) improves with E

ML cont’d: Hypothesis Space

This box represents the set of all instances X.

Each point is an instance.

The circles are hypotheses.

The points inside a circle correspond to the instances that the hypothesis classifies as positive.

[Figure: the instance space X drawn as a box with axes X1 and X2; hypotheses drawn as circles.]

Inductive Logic Programming

An approach to machine learning in which definitions of relations are induced from positive and negative examples.

Logic is used as the hypothesis language.

The result is a Prolog program.

A hypothesis is made and then refined until the target hypothesis is found.

ILP cont’d

Given:

A set of positive examples E+ and a set of negative examples E-

Background knowledge BK, such that E+ cannot be derived from BK alone

Find a hypothesis H such that:

All examples in E+ can be derived from BK and H

No examples in E- can be derived from BK and H
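The problem statement can be written compactly in logical notation. This is only a restatement; the entailment symbol is notation introduced here for illustration and does not appear on the slides.

```latex
\begin{align*}
&\text{Given: } E^{+},\ E^{-},\ BK \quad \text{with } BK \not\models E^{+} \\
&\text{Find } H \text{ such that:} \\
&\qquad \forall e \in E^{+}:\; BK \wedge H \models e \\
&\qquad \forall e \in E^{-}:\; BK \wedge H \not\models e
\end{align*}
```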

ILP cont’d

Complete hypothesis: covers all positive examples

Consistent hypothesis: does not cover any negative examples

H must be complete and consistent

ILP Example

Task: find a definition of the predicate has_daughter( X), known as the target predicate

In terms of: the predicates parent, male, and female

Such that:

It is true for all given positive examples

It is false for all given negative examples

ILP Example cont’d

Background knowledge is provided by the programmer

Ex., parent( X, Y), male( X), female( X) defining a family relation, each declared in Prolog as a background literal, e.g. backliteral( male( X), [X])

Positive example: ex( has_daughter( tom))

Negative example: nex( has_daughter( pam))
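For the has_daughter task, completeness and consistency can be checked directly in Prolog. The following is a minimal sketch, not part of MINIHYPER: the facts are the family relation and examples used elsewhere in these slides, while the predicate names pos/1, neg/1, covers/2 and the hypothesis labels h1 and h2 are assumptions made for illustration.

```prolog
% Background knowledge: the family relation used in the slides.
parent(pam, bob).
parent(tom, bob).
parent(tom, liz).
parent(bob, ann).
parent(bob, pat).
parent(pat, jim).
parent(pat, eve).
female(pam). female(liz). female(ann). female(pat). female(eve).

% Examples: E+ as pos/1 and E- as neg/1 (hypothetical names for this sketch).
pos(tom). pos(bob). pos(pat).
neg(pam). neg(jim).

% Two candidate hypotheses for has_daughter/1, labelled h1 and h2.
covers(h1, X) :- parent(X, _).                % too general
covers(h2, X) :- parent(X, Y), female(Y).     % the target definition

% complete(H): H covers every positive example.
complete(H) :- forall(pos(X), covers(H, X)).

% consistent(H): H covers no negative example.
consistent(H) :- \+ ( neg(X), covers(H, X) ).
```

Under these facts h1 is complete but inconsistent, because pam is a parent and is therefore covered. h2 is both complete and consistent.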

Refinement

Start with an overly general hypothesis that is complete but inconsistent

Refinement takes a hypothesis H1 and produces a more specific hypothesis H2

H2 must cover a subset of the cases covered by H1

Refinement Graph

Root: has_daughter( X).

Step 1 (add a literal):
has_daughter( X) :- male( Y).
has_daughter( X) :- female( Y).
has_daughter( X) :- parent( Y, Z).

Step 2 (one refinement of each clause above, in the same order):
has_daughter( X) :- male( X).
has_daughter( X) :- female( Y), parent( S, T).
has_daughter( X) :- parent( X, Z).

…

has_daughter( X) :- parent( X, Z), female( U).

has_daughter( X) :- parent( X, Z), female( Z).

Figure 19.2: Prolog Programming for Artificial Intelligence by Ivan Bratko

Two types of refinements in graph

Matching two variables in the clause

Example: has_daughter( X) :- parent( Y, Z)

refined to has_daughter( X) :- parent( X, Z)

This is done by matching X = Y

Two types of refinements in graph

Adding a background literal to the body of the clause

Example: has_daughter( X).

refined to has_daughter( X) :- parent( Y, Z).

MINIHYPER PROGRAM

MINIHYPER

Hypothesis = [Clause1, Clause2, …]

Clause = [Head, BodyLiteral1, BodyLiteral2, …] / [Var1, Var2, …]

Example:

pred(X,Y) :- parent(X,Y).
pred(X,Z) :- parent(X,Y), pred(Y,Z).

is represented as

[ [pred(X1,Y1),parent(X1,Y1)]/[X1,Y1], [pred(X2,Z2),parent(X2,Y2),pred(Y2,Z2)]/[X2,Y2,Z2] ]
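To read a hypothesis in this list form, it helps to convert it back into ordinary Prolog clauses. The sketch below does that; the predicate names list_to_body/2, clause_term/2 and hyp_terms/2 are hypothetical helpers introduced here and are not part of MINIHYPER.

```prolog
% list_to_body(+Literals, -Body): [a,b,c] becomes (a,b,c); [] becomes true.
list_to_body([], true).
list_to_body([L], L) :- !.
list_to_body([L | Ls], (L, Rest)) :- list_to_body(Ls, Rest).

% clause_term(+Clause/Vars, -Term): turn [Head | Body]/Vars into Head :- Body.
clause_term([Head | Body]/_Vars, (Head :- BodyTerm)) :-
    list_to_body(Body, BodyTerm).

% hyp_terms(+Hypothesis, -Terms): convert every clause of a hypothesis.
hyp_terms([], []).
hyp_terms([C | Cs], [T | Ts]) :-
    clause_term(C, T),
    hyp_terms(Cs, Ts).
```

In SWI-Prolog, a query such as `?- hyp_terms(H, Ts), maplist(portray_clause, Ts).` would then print the induced hypothesis H as readable clauses.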

BACKGROUND KNOWLEDGE

Background Knowledge

backliteral( parent(X,Y), [X,Y]).   % A background literal with variables [X,Y]
backliteral( male(X), [X]).
backliteral( female(X), [X]).

prolog_predicate( parent(_,_)).     % Goal executed directly by Prolog
prolog_predicate( male(_)).
prolog_predicate( female(_)).

Background Knowledge (cont.)

parent( pam, bob).
parent( tom, bob).
parent( tom, liz).
parent( bob, ann).
parent( bob, pat).
parent( pat, jim).
parent( pat, eve).

male( tom).
male( bob).
male( jim).

female( pam).
female( liz).
female( ann).
female( pat).
female( eve).

TRAINING EXAMPLES

% Positive examples
ex( has_daughter(tom)).      % Tom has a daughter
ex( has_daughter(bob)).
ex( has_daughter(pat)).

% Negative examples
nex( has_daughter(pam)).     % Pam doesn't have a daughter
nex( has_daughter(jim)).

start_hyp( [ [has_daughter(X)] / [X] ] ).   % Starting hypothesis

An interpreter for hypotheses

prove( Goal, Hypo, Answer):

% Answer = yes, if Goal is derivable from Hypo in at most D steps
% Answer = no, if Goal is not derivable
% Answer = maybe, if the search terminated after D steps inconclusively

What “Answer = maybe” means

Answer = maybe, if the search terminated after D steps inconclusively. One of the following holds:

A. The interpreter would get into an infinite loop if there were no proof-length limit

B. The interpreter would find a proof with proof length greater than D

C. The interpreter would fail at a proof length greater than D and then backtrack

A loop-avoiding interpreter

prove( Goal, Hypo, Answer) :-
  max_proof_length( D),
  prove( Goal, Hypo, D, RestD),
  ( RestD >= 0, Answer = yes          % Proved
    ;
    RestD < 0, !, Answer = maybe      % Maybe, but it looks like inf. loop
  ).

prove( Goal, _, no).                  % Otherwise Goal definitely cannot be proved

% prove( Goal, Hyp, MaxD, RestD):
%   MaxD allowed proof length, RestD 'remaining length' after proof;
%   Count only proof steps using Hyp

prove( G, H, D, D) :-
  D < 0, !.                           % Proof length overstepped

prove( [], _, D, D) :- !.

A loop-avoiding interpreter (cont.)

prove( [G1 | Gs], Hypo, D0, D) :- !,
  prove( G1, Hypo, D0, D1),
  prove( Gs, Hypo, D1, D).

prove( G, _, D, D) :-
  prolog_predicate( G),               % Background predicate in Prolog?
  call( G).                           % Call of background predicate

prove( G, Hyp, D0, D) :-
  D0 =< 0, !, D is D0-1               % Proof too long
  ;
  D1 is D0-1,                         % Remaining proof length
  member( Clause/Vars, Hyp),          % A clause in Hyp
  copy_term( Clause, [Head | Body] ), % Rename variables in clause
  G = Head,                           % Match clause's head with goal
  prove( Body, Hyp, D1, D).           % Prove G using Clause
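The three possible answers can be seen on a small test case. The sketch below repeats the interpreter so that the file loads on its own, with singleton variables renamed and the final disjunction parenthesised. The target pred/2, the two hypotheses good_hyp/1 and loop_hyp/1, the helper answer/3 and the two parent facts are assumptions chosen for illustration only.

```prolog
% Settings and a small piece of background knowledge.
max_proof_length(6).
prolog_predicate(parent(_, _)).
parent(pam, bob).
parent(bob, ann).

% Interpreter from the slides, repeated so this file loads on its own.
prove(Goal, Hypo, Answer) :-
    max_proof_length(D),
    prove(Goal, Hypo, D, RestD),
    (   RestD >= 0, Answer = yes          % Proved
    ;   RestD < 0, !, Answer = maybe      % Proof length limit was overstepped
    ).
prove(_, _, no).                          % Otherwise not provable

prove(_, _, D, D) :- D < 0, !.            % Proof length overstepped
prove([], _, D, D) :- !.
prove([G1 | Gs], Hypo, D0, D) :- !,
    prove(G1, Hypo, D0, D1),
    prove(Gs, Hypo, D1, D).
prove(G, _, D, D) :-
    prolog_predicate(G),                  % Background predicate: call it directly
    call(G).
prove(G, Hyp, D0, D) :-
    (   D0 =< 0, !, D is D0 - 1           % Proof too long
    ;   D1 is D0 - 1,
        member(Clause/_Vars, Hyp),
        copy_term(Clause, [Head | Body]), % Rename variables in clause
        G = Head,
        prove(Body, Hyp, D1, D)
    ).

% Two illustrative hypotheses for a hypothetical target pred/2.
good_hyp([ [pred(X, Y), parent(X, Y)] / [X, Y] ]).
loop_hyp([ [pred(X, Y), pred(X, Y)] / [X, Y] ]).   % pred(X,Y) :- pred(X,Y).

% answer(+Goal, +Hyp, -Answer): first answer only, as complete/1 and
% consistent/1 obtain it through once/1. Call it with Answer unbound.
answer(Goal, Hyp, Answer) :- once(prove(Goal, Hyp, Answer)).
```

With good_hyp, the goal pred(pam, bob) is proved (yes) and pred(pam, ann) fails finitely (no). With loop_hyp, the goal pred(pam, ann) exhausts the proof length limit, so the answer is maybe.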

MINIHYPER

[Flowchart of the MINIHYPER search loop. Boxes: StartHypothesis, Refine_hyp, NewHypothesis, OutputHypothesis. Decisions (Yes/No): Complete?, Consistent?, Reach MaxD?. Action: MaxD = MaxD + 1.]

MINIHYPER

% Depthreshold is the largest MaxD that is tried, as in the call induce( H, 8)

induce( Hyp, Depthreshold) :-
  iter_deep( Hyp, 0, Depthreshold).   % Iterative deepening starting with max. depth 0

iter_deep( Hyp, MaxD, Depthreshold) :-
  write( 'MaxD = '), write( MaxD), nl,
  start_hyp( Hyp0),
  complete( Hyp0),                    % Hyp0 covers all positive examples
  depth_first( Hyp0, Hyp, MaxD)       % Depth-limited depth-first search
  ;
  NewMaxD is MaxD + 1,
  ( NewMaxD > Depthreshold, !, fail   % Give up beyond the depth threshold
    ;
    iter_deep( Hyp, NewMaxD, Depthreshold) ).

MINIHYPER

% depth_first( Hyp0, Hyp, MaxD):
%   refine Hyp0 into consistent and complete Hyp in at most MaxD steps

depth_first( Hyp, Hyp, _) :-
  consistent( Hyp).

depth_first( Hyp0, Hyp, MaxD0) :-
  MaxD0 > 0,
  MaxD1 is MaxD0 - 1,
  refine_hyp( Hyp0, Hyp1),
  complete( Hyp1),                    % Hyp1 covers all positive examples
  depth_first( Hyp1, Hyp, MaxD1).

MINIHYPER

complete( Hyp) :-                        % Hyp covers all positive examples
  \+ ( ex( E),                           % A positive example
       once( prove( E, Hyp, Answer)),    % Prove it with Hyp
       Answer \== yes).                  % Possibly not proved

consistent( Hyp) :-                      % Hyp does not possibly cover any negative example
  \+ ( nex( E),                          % A negative example
       once( prove( E, Hyp, Answer)),    % Prove it with Hyp
       Answer \== no).                   % Possibly provable

MINIHYPER

% refine_hyp( Hyp0, Hyp):
%   refine hypothesis Hyp0 into Hyp

refine_hyp( Hyp0, Hyp) :-
  conc( Clauses1, [Clause0/Vars0 | Clauses2], Hyp0),   % Choose Clause0 from Hyp0
  conc( Clauses1, [Clause/Vars | Clauses2], Hyp),      % New hypothesis
  refine( Clause0, Vars0, Clause, Vars).               % Refine the Clause

MINIHYPER

% refine( Clause, Args, NewClause, NewArgs):
%   refine Clause with arguments Args giving NewClause with NewArgs

% 1. Refine by unifying arguments

refine( Clause, Args, Clause, NewArgs) :-
  conc( Args1, [A | Args2], Args),     % Select a variable A
  member( A, Args2),                   % Match it with another one
  conc( Args1, Args2, NewArgs).

% 2. Refine by adding a literal

refine( Clause, Args, NewClause, NewArgs) :-
  length( Clause, L),
  max_clause_length( MaxL),
  L < MaxL,
  backliteral( Lit, Vars),             % Background knowledge literal
  conc( Clause, [Lit], NewClause),     % Add literal to body of clause
  conc( Args, Vars, NewArgs).          % Add literal's variables
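The effect of the two refinement rules is easiest to see by enumerating the one-step refinements of a clause. The sketch below repeats refine/4 so that the file loads on its own. conc/3 is not defined on these slides, so it is assumed here to be plain list concatenation; refinements/2 is a hypothetical helper added for illustration.

```prolog
% conc/3 is not defined on the slides; assumed here to be list concatenation.
conc([], L, L).
conc([X | L1], L2, [X | L3]) :- conc(L1, L2, L3).

% Settings and background literals, as given on the slides.
max_clause_length(3).
backliteral(parent(X, Y), [X, Y]).
backliteral(male(X), [X]).
backliteral(female(X), [X]).

% refine/4 as on the slides, repeated so this file loads on its own.
% 1. Refine by unifying two of the clause's variables.
refine(Clause, Args, Clause, NewArgs) :-
    conc(Args1, [A | Args2], Args),
    member(A, Args2),
    conc(Args1, Args2, NewArgs).
% 2. Refine by adding a background literal to the body.
refine(Clause, Args, NewClause, NewArgs) :-
    length(Clause, L),
    max_clause_length(MaxL),
    L < MaxL,
    backliteral(Lit, Vars),
    conc(Clause, [Lit], NewClause),
    conc(Args, Vars, NewArgs).

% refinements(+Clause/Args, -List): all one-step refinements of a clause
% (hypothetical helper, not part of MINIHYPER).
refinements(Clause/Args, List) :-
    findall(C/A, refine(Clause, Args, C, A), List).
```

For the start clause [has_daughter(X)]/[X] only rule 2 applies, giving three refinements, one per background literal. For [has_daughter(X), parent(Y,Z)]/[X,Y,Z] there are six: three ways of matching two variables (X = Y, X = Z, Y = Z) and three added literals.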

MINIHYPER

% Default parameter settings

max_proof_length( 6).     % Max. proof length, counting calls to 'non-prolog' pred.

max_clause_length( 3).    % Max. number of literals in a clause

RESULTS

Experiments with MINIHYPER

has_daughter(A) :- parent(A, B), female(B).

predecessor(A, B) :- parent(A, C), predecessor(C, B).

predecessor(A, B) :- parent(A, B).

predecessor(A, B) with the ‘guard’ literal atom(X) in the background literals.

First example

Target predicate: has_daughter(A) :- parent(A, B), female(B).

BK (Background knowledge): parent(X, Y). male(X). female(X).

Positive and negative examples

ex( has_daughter(tom)).      % Tom has a daughter
ex( has_daughter(bob)).
ex( has_daughter(pat)).

nex( has_daughter(pam)).     % Pam doesn't have a daughter
nex( has_daughter(jim)).

has_daughter(A) :- parent(A, B), female(B).

?- induce(H).
MaxD = 0
MaxD = 1
MaxD = 2
MaxD = 3
MaxD = 4

H = [[has_daughter(_G258), parent(_G258, _G290), female(_G290)]/[_G258, _G290]]

Refinement graph

Without the negative example nex(has_daughter(pam)):

?- induce(H).
MaxD = 0
MaxD = 1
MaxD = 2
H = [[has_daughter(_G252), parent(_G252, _G284)]/[_G252, _G284]] ;

Without the other negative example nex(has_daughter(jim)):

?- induce(H).
MaxD = 0
MaxD = 1
MaxD = 2
MaxD = 3
MaxD = 4
H = [[has_daughter(_G261), parent(_G261, _G293), female(_G293)]/[_G261, _G293]] ;

Second example

predecessor(A, B) :- parent(A, B).

predecessor(A, B) :- parent(A, C), predecessor(C, B).

Results

?- induce(H,8).
MaxD = 0
MaxD = 1
MaxD = 2
MaxD = 3
MaxD = 4
MaxD = 5
MaxD = 6
MaxD = 7
MaxD = 8

Results (cont.)

H = [[predecessor(_G313, _G314), [parent(_G313, _G381)], [predecessor(_G381, _G314)]]/[_G313, _G381, _G314], [predecessor(_G331, _G332), [parent(_G331, _G332)]]/[_G331, _G332]]

Execution time: more than 3 hours

Last example: Add atom(X).

Execution time: about 20 mins

Definition of predecessor

H = [[predecessor(A, B), [atom(A), parent(A, C)], [atom(C), predecessor(C, B)]]/[A, C, B], [predecessor(D, E), [atom(D), parent(D, E)]]/[D, E]]

Results

?- induce(H,8).
MaxD = 0
MaxD = 1
MaxD = 2
MaxD = 3
MaxD = 4
MaxD = 5
MaxD = 6
MaxD = 7
MaxD = 8

Results (cont.)

H = [[predecessor(_G313, _G314), [atom(_G313), parent(_G313, _G381)], [atom(_G381), predecessor(_G381, _G314)]]/[_G313, _G381, _G314], [predecessor(_G331, _G332), [atom(_G331), parent(_G331, _G332)]]/[_G331, _G332]]

Refinement graph

Continue

Experiments with HYPER

predecessor(A, B):- [parent(A, B)].

predecessor(A, C):- [parent(A, B)], [predecessor(B, C)].

Results:

Hypotheses generated: 35404
Hypotheses refined: 5907
To be refined: 2710

Results (cont.)

H = [[predecessor(_G480383, _G480384), [parent(_G480383, _G480384)]]/[_G480383:item, _G480384:item], [predecessor(_G480416, _G480417), [parent(_G480416, _G480426)], [predecessor(_G480426, _G480417)]]/[_G480416:item, _G480426:item, _G480417:item]]

Comparison for the second relation

                        MINIHYPER            MINIHYPER + atom()    HYPER
Execution time          More than 3 hours    20 minutes            About 8 minutes
Hypotheses generated    2420900              289728                35404
Hypotheses refined      28 (one value only for the two MINIHYPER columns)    5907

Summary

QUESTIONS?
