
Modelling the Initialisation Stage of the ALKR Representation for Discrete Domains and GABIL Encoding

María A. Franco, Natalio Krasnogor, Jaume Bacardit

University of Nottingham, UK
ASAP Research Group, School of Computer Science
[email protected]

July 14, 2011

Franco et al. (University of Nottingham) Modelling Initialisation using ALKR+GABIL July 14, 2011 1 / 25

Problem definition

BioHEL [Bacardit et al., 2009a] is a Genetics-Based Machine Learning (GBML) system designed to cope with large-scale datasets [Bacardit et al., 2009b]. Its main features are:

- Iterative Rule Learning approach
- Attribute List Knowledge Representation (ALKR)
- ILAS windowing scheme
- Default rule
- Smart initialisation mechanisms (covering)
- GPU-based evaluation process

Problem: The system obtains good results [Stout et al., 2008], but we do not have a formal understanding of why, when and how this happens.


What is the aim of this work?

The aim of this work is to model the initialisation stage of the BioHEL system and to calculate the probability of having a good initial population. Two conditions should be met [Goldberg, 2002]:

- A good individual exists in the initial population (building blocks).
- The initial population covers the whole search space.

Background: These probabilities are also known as the schema bound and the covering bound. They have already been determined for XCS and the ternary representation {1,0,#} by [Butz, 2006].

Problem: The models need to be adapted to our ALKR+GABIL representation. Moreover, we want to model the impact of the BioHEL mechanisms that are relevant in initialisation: covering and the default rule.


Outline

1 Background
  - GABIL Representation
  - Attribute List Knowledge Representation (ALKR)
2 Probabilistic models
  - Initial considerations
  - Schema bound
  - How does overlapping affect the models?
  - Covering bound
3 Generalised model for x-ary attributes
  - Schema and covering bound
4 Conclusions and Further Work

How does GABIL work?

The GABIL representation [Jong and Spears, 1991] is used inside ALKR to represent nominal attributes.

Example: F1 = {A,B,C}, F2 = {O,P}, F3 = {W,Z,X,Y}

F1   F2   F3
100  01   1101

F1 is A ∧ F2 is P ∧ (F3 is W ∨ F3 is Z ∨ F3 is Y)

In GABIL, when initialising the attribute values, each bit is set to 1 with probability p and to 0 with probability 1 − p.
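The decoding and random initialisation described on this slide can be sketched in Python. This is only an illustration: the names `DOMAINS`, `decode` and `random_attribute` are hypothetical and do not come from BioHEL or GABIL.

```python
import random

# Attribute domains taken from the slide example.
DOMAINS = {"F1": ["A", "B", "C"], "F2": ["O", "P"], "F3": ["W", "Z", "X", "Y"]}


def decode(rule_bits):
    """Map each attribute's bit string to the list of values it accepts."""
    return {
        name: [value for value, bit in zip(DOMAINS[name], bits) if bit == "1"]
        for name, bits in rule_bits.items()
    }


def random_attribute(num_values, p, rng):
    """Initialise one attribute: each bit is 1 with probability p, else 0."""
    return "".join("1" if rng.random() < p else "0" for _ in range(num_values))


example = {"F1": "100", "F2": "01", "F3": "1101"}
print(decode(example))  # F1 accepts A, F2 accepts P, F3 accepts W, Z or Y
print(random_attribute(4, 0.5, random.Random(0)))
```

An attribute whose bits are all 0 accepts no value at all, which is why the schema bound has to account for that case.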


How does the Attribute List Knowledge Representation work?

[Figure: ALKR classifier example showing the fields numAtt, whichAtt, predicates, offsetPred and class]

How do we select the attributes in the list?

$$l_d = \begin{cases} 1 & \text{if } d \le ExpAtts \\ \dfrac{ExpAtts}{d} & \text{if } d > ExpAtts \end{cases}$$

Here d is the number of attributes of the problem and ExpAtts is the expected number of attributes kept in the list.
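A minimal Python sketch of the selection rule follows. It assumes that l_d is applied as an independent inclusion probability for each attribute, which the slide does not state explicitly; the function names are hypothetical.

```python
import random


def inclusion_probability(d, exp_atts):
    """l_d as defined on the slide: 1 if d <= ExpAtts, else ExpAtts / d."""
    return 1.0 if d <= exp_atts else exp_atts / d


def sample_attribute_list(d, exp_atts, rng):
    """Assumption: every attribute index is kept independently with probability l_d."""
    l_d = inclusion_probability(d, exp_atts)
    return [index for index in range(d) if rng.random() < l_d]


print(inclusion_probability(20, 5))
print(sample_attribute_list(20, 5, random.Random(1)))
```

Under this assumption the expected length of the list is d · l_d, which equals ExpAtts whenever d > ExpAtts.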

Initial considerations for the probabilistic models

Mechanisms involved in initialisation:

- Covering
- Default rule

⇒ We have to consider 4 initialisation scenarios.

Types of attributes:

- Fully mapped attributes
- Partially mapped attributes


Schema bound

Problem: We want to calculate the probability of having good classifiers, or representatives, in the initial population. Representatives are classifiers that do not make mistakes because they correctly represent all the specified bits of an original problem rule.

Example: Considering the rule #10#1 with 3 values specified (k=3), the following classifiers are representatives: 110*1, 11011, 010*1.


Schema bound

Question: What is the probability of obtaining a representative with at least k values specified?

To become a representative, the rule should:

1. Specify at least k attributes correctly.
2. Not have all 0's in any of the remaining attributes.

Without using any of the mechanisms:

$$P(rep) = \frac{2^{k_f} \left(l_d\, p\,(1-p)\right)^{k} \left(1 - l_d (1-p)^{2}\right)^{d-k}}{n}$$

Using the default rule:

$$P(rep) = \frac{2^{k_f} \left(l_d\, p\,(1-p)\right)^{k} \left(1 - l_d (1-p)^{2}\right)^{d-k}}{n-1}$$

where k_f is the number of fully mapped attributes.
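The two formulas can be evaluated with a short Python sketch. The function name `p_rep_base` is hypothetical, and n is assumed here to be the number of classes, which the slide does not define.

```python
def p_rep_base(k, d, p, l_d, k_f, n, default_rule=False):
    """Schema bound for binary attributes without covering.

    Assumption: n is the number of classes; the default rule removes one of them.
    """
    classes = n - 1 if default_rule else n
    # k attributes must be in the list with one bit set and the other cleared.
    specified = (l_d * p * (1 - p)) ** k
    # The remaining d - k attributes must not be listed with both bits at 0.
    not_all_zero = (1 - l_d * (1 - p) ** 2) ** (d - k)
    return 2 ** k_f * specified * not_all_zero / classes


for p in (0.25, 0.50, 0.75):
    print(p, p_rep_base(k=3, d=11, p=p, l_d=1.0, k_f=0, n=2))
```

With n = 2 the default rule doubles P(rep), since the denominator drops from 2 to 1.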

Schema bound

Question: What happens when we use covering?

1. We sample an instance with uniform probabilities for all classes.
2. We set the bits corresponding to the instance values to 1.
   - It is no longer possible to have all 0's.

Using covering:

$$P(rep) = \frac{m}{n} \left(l_d (1-p)\right)^{k}$$

Using covering and the default rule:

$$P(rep) = \frac{m}{n-1} \left(l_d (1-p)\right)^{k}$$

where m is the number of classes mapped by the problem rules.
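A small Python sketch of the covering formulas is given below. The name `p_rep_covering` is hypothetical and, as an assumption not stated on the slide, n is read as the number of classes.

```python
def p_rep_covering(k, p, l_d, m, n, default_rule=False):
    """Schema bound for binary attributes when covering is used.

    Assumption: n is the number of classes; m of them are mapped by the rules.
    """
    classes = n - 1 if default_rule else n
    # Covering already sets the correct bit, so only the other bit must stay 0.
    return (m / classes) * (l_d * (1 - p)) ** k


for p in (0.25, 0.50, 0.75):
    print(p, p_rep_covering(k=3, p=p, l_d=1.0, m=1, n=2))
```

In this form P(rep) no longer depends on d, because covering rules out all-zero attributes and the factor for the remaining d − k attributes disappears.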

Problems used for model validation

Binary and ternary multiplexer problems:

- k address bits
- 2^k string bits (3^k for the ternary case)

k-Disjunctive Normal Form (kDNF) functions [Butz and Pelikan, 2006, Franco et al., 2010]:

- r disjunctive terms
- d possible attributes
- k represented attributes in each term

Example kDNF: d = 10, k = 3, r = 3

(¬x1 ∧ x5 ∧ x7) ∨ (x1 ∧ ¬x2 ∧ x8) ∨ (x4 ∧ ¬x5 ∧ ¬x9)
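Both benchmark families are easy to express in Python. The sketch below encodes the kDNF example from the slide and a binary multiplexer; the helper names are hypothetical, and the multiplexer assumes that the address bits come first and are read most significant bit first.

```python
# Each term lists (1-based attribute index, required value) pairs.
EXAMPLE_TERMS = [
    [(1, False), (5, True), (7, True)],
    [(1, True), (2, False), (8, True)],
    [(4, True), (5, False), (9, False)],
]


def kdnf(x, terms):
    """Evaluate a kDNF function: true if any term has all its literals satisfied."""
    return any(all(x[index - 1] == value for index, value in term) for term in terms)


def multiplexer(bits, k):
    """Binary multiplexer: the first k bits address one of the 2**k string bits."""
    address = int("".join(str(bit) for bit in bits[:k]), 2)
    return bits[k + address]


instance = [False] * 10
instance[4] = instance[6] = True  # x5 and x7 are true, x1 is false
print(kdnf(instance, EXAMPLE_TERMS))
print(multiplexer([1, 0, 0, 0, 1, 0], k=2))
```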


Schema bound - Model validation

[Figure: four panels plotting P(rep) against k, the number of attributes specified, with theoretical and empirical curves for p = 0.75, 0.50 and 0.25: (a) MUX - No Covering, (b) MUX - Covering, (c) kDNF - No Covering, (d) kDNF - Covering. The kDNF panels also include "NoDef" curves for each value of p.]

What have we calculated so far?

So far, these models only hold for:

- Problems with no overlapping
- Problems that have just one rule

What happens here?

How does overlapping affect the probability of a representative?

With r rules, a first guess is to share the probability equally among the niches:

$$P(niche) = \frac{P(rep)}{r}$$

When the rules overlap, the divisor r is replaced by the ratio between the examples covered by all the rules (EC) and the examples that belong to one niche (EN):

$$EC = 2^{d}\left(1 - \left(1 - 2^{-k}\right)^{r}\right), \qquad EN = \frac{2^{d}}{2^{k}}$$

$$P(niche) = \frac{P(rep)}{2^{k}\left(1 - \left(1 - 2^{-k}\right)^{r}\right)}$$

The adjusted probability of a representative is then:

$$P'(rep) = 1 - \left(1 - P(niche)\right)^{r}$$
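The overlap adjustment can be checked numerically with a brief Python sketch. The function names are hypothetical; the formulas are the ones on this slide.

```python
def p_niche(p_rep, k, r):
    """Share of P(rep) for one niche, using EC / EN = 2^k (1 - (1 - 2^-k)^r)."""
    covered_over_niche = 2 ** k * (1 - (1 - 2 ** -k) ** r)
    return p_rep / covered_over_niche


def p_rep_overlap(p_rep, k, r):
    """Adjusted probability P'(rep) = 1 - (1 - P(niche))^r."""
    return 1 - (1 - p_niche(p_rep, k, r)) ** r


for r in (1, 5, 10, 20, 40):
    print(r, p_rep_overlap(0.2, k=3, r=r))
```

For r = 1 the ratio EC / EN equals 1, so the adjusted probability reduces to the original P(rep).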

Validation of models considering overlapping

[Figure: two plots of P(rep) against the attributes specified (k) and the number of rules, comparing the theoretical model with empirical results for r = 1, 5, 10, 20 and 40: (e) Base Case, (f) Covering and Default Class.]

Covering bound

Problem: How can we calculate the probability of covering the whole search space?

We need to calculate the probability of matching an instance.

Base case:

$$P(match) = \left(1 - l_d + l_d\, p\right)^{d}$$

Covering case:

$$P(match) = \left(1 - l_d + l_d \left(\frac{1+p}{2}\right)\right)^{d}$$
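As a rough illustration, both matching probabilities can be computed with one Python function. The name `p_match` and its `covering` flag are hypothetical.

```python
def p_match(d, p, l_d, covering=False):
    """Probability that a randomly initialised classifier matches an instance."""
    # An attribute outside the list always matches (probability 1 - l_d).
    # A listed attribute matches with probability p, or (1 + p) / 2 with covering.
    listed_match = (1 + p) / 2 if covering else p
    return (1 - l_d + l_d * listed_match) ** d


for p in (0.25, 0.50, 0.75):
    print(p, p_match(10, p, 0.5), p_match(10, p, 0.5, covering=True))
```

Since (1 + p) / 2 ≥ p for every p in [0, 1], the covering case never gives a lower matching probability than the base case.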


Covering bound - Model validation

[Figure: two panels plotting P(match) against k, the number of attributes, with empirical and model curves for p = 0.75, 0.50 and 0.25: (g) No covering, (h) Covering.]

What happens with x-ary attributes?

What happens when the problem is not binary but has more than 2 values per attribute?

Generalised models for x-ary attributes: t is the number of values per attribute and e is the number of active bits per attribute.

Example 1: 101|110|011:0 ⇒ t=3, e=2
Example 2: 001|100|010:1 ⇒ t=3, e=1
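The values of t and e in the two examples can be read off mechanically. The sketch below assumes the textual layout used on the slide (attributes separated by `|`, class after `:`); the function name `describe` is hypothetical.

```python
def describe(rule):
    """Return (t per attribute, e per attribute, class label) for a rule string."""
    condition, label = rule.split(":")
    attributes = condition.split("|")
    t_values = [len(bits) for bits in attributes]        # values per attribute
    e_values = [bits.count("1") for bits in attributes]  # active bits per attribute
    return t_values, e_values, label


print(describe("101|110|011:0"))
print(describe("001|100|010:1"))
```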


Generalised model for x-ary attributes

Schema bound

Base case:

$$P(rep) = \frac{t^{k_f} \left(l_d\, p^{e} (1-p)^{t-e}\right)^{k} \left(1 - l_d (1-p)^{t}\right)^{d-k}}{n}$$

Covering case:

$$P(rep) = \frac{m}{n} \left(l_d\, p^{e-1} (1-p)^{t-e-1}\right)^{k}$$

Schema bound with Default Rule

Base case:

$$P(rep) = \frac{t^{k_f} \left(l_d\, p^{e} (1-p)^{t-e}\right)^{k} \left(1 - l_d (1-p)^{t}\right)^{d-k}}{n-1}$$

Covering case:

$$P(rep) = \frac{m}{n-1} \left(l_d\, p^{e-1} (1-p)^{t-e-1}\right)^{k}$$

Covering bound

Base case:

$$P(match) = \left(1 - l_d + l_d\, p\right)^{d}$$

Covering case:

$$P(match) = \left(1 - l_d + l_d \left(\frac{1+(t-1)p}{t}\right)\right)^{d}$$
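A short Python sketch of the generalised base-case schema bound and of the covering bound follows. The function names are hypothetical, n is assumed to be the number of classes, and the covering case of the schema bound is left out of the sketch on purpose.

```python
def p_rep_xary(k, d, p, l_d, k_f, n, t, e, default_rule=False):
    """Base-case schema bound for attributes with t values and e active bits.

    Assumption: n is the number of classes; the default rule removes one of them.
    """
    classes = n - 1 if default_rule else n
    specified = (l_d * p ** e * (1 - p) ** (t - e)) ** k
    not_all_zero = (1 - l_d * (1 - p) ** t) ** (d - k)
    return t ** k_f * specified * not_all_zero / classes


def p_match_xary(d, p, l_d, t, covering=False):
    """Matching probability for attributes with t values."""
    listed_match = (1 + (t - 1) * p) / t if covering else p
    return (1 - l_d + l_d * listed_match) ** d


# With t = 2 and e = 1 both functions reduce to the binary models.
print(p_rep_xary(k=2, d=3, p=0.5, l_d=1.0, k_f=0, n=2, t=2, e=1))
print(p_match_xary(d=2, p=0.5, l_d=1.0, t=2, covering=True))
```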

Generalised model for x-ary attributes

Schema bound validation (with ternary multiplexer problems)

[Figure: two panels plotting P(rep) against k, the number of attributes, with empirical and model curves for p = 0.75, 0.50 and 0.25: (i) No covering, (j) Covering.]

There is ≈ 5 times more probability of generating a good individual when using covering.

Generalised model for x-ary attributes

Covering bound validation (with ternary multiplexer problems)

[Figure: two panels plotting P(match) against k, the number of attributes, with empirical and model curves for p = 0.75, 0.50 and 0.25: (k) No covering, (l) Covering.]

Conclusions

- The presented models explain the probability of having a good initial population in BioHEL, considering the ALKR representation and the other initialisation mechanisms.
- We also presented a generalisation of the model for x-ary attributes and adjusted the probability for problems with overlapping.
- These models explain the benefits of the BioHEL initialisation mechanisms, giving a further understanding of how the BioHEL system works.

Further Work

- Simplify the current models to make them less dependent on problem parameters not known beforehand.
- Model the reproductive opportunity and learning time of BioHEL.
- Derive bounds for the population size and other user-defined parameters in BioHEL.


Bacardit, J., Burke, E., and Krasnogor, N. (2009a). Improving the scalability of rule-based evolutionary learning. Memetic Computing, 1(1):55–67.

Bacardit, J., Stout, M., Hirst, J. D., Valencia, A., Smith, R., and Krasnogor, N. (2009b). Automated alphabet reduction for protein datasets. BMC Bioinformatics, 10(1):6.

Butz, M. V. (2006). Rule-Based Evolutionary Online Learning Systems: A Principled Approach to LCS Analysis and Design, volume 109 of Studies in Fuzziness and Soft Computing. Springer.

Butz, M. V. and Pelikan, M. (2006). Studying XCS/BOA learning in boolean functions: structure encoding and random boolean functions. In GECCO '06: Proceedings of the 8th annual conference on Genetic and evolutionary computation, pages 1449–1456, New York, NY, USA. ACM.

Franco, M. A., Krasnogor, N., and Bacardit, J. (2010). Analysing BioHEL using challenging boolean functions. In GECCO '10: Proceedings of the 12th annual conference companion on Genetic and evolutionary computation, pages 1855–1862, New York, NY, USA. ACM.

Goldberg, D. E. (2002). The Design of Innovation: Lessons from and for Competent Genetic Algorithms. Kluwer Academic Publishers, Norwell, MA, USA.

Jong, K. D. and Spears, W. M. (1991). Learning concept classification rules using genetic algorithms. In Proceedings of the 12th international joint conference on Artificial intelligence - Volume 2, pages 651–656, Sydney, New South Wales, Australia. Morgan Kaufmann Publishers Inc.

Stout, M., Bacardit, J., Hirst, J. D., and Krasnogor, N. (2008). Prediction of recursive convex hull class assignments for protein residues. Bioinformatics, 24(7):916–923.