Random Artificial Incorporation of Noise in a Learning Classifier System Environment
Ryan J. Urbanowicz, Nicholas A. Sinnott-Armstrong, and Jason H. Moore
Dartmouth Medical School
GECCO Dublin, Ireland - 2011
Genetic Epidemiology
• Association Study (Case/Control)
• Single Nucleotide Polymorphism (SNP)
• Allele & Genotype
Subject #1
-- AGGTCA ---- AGGTCA --
Subject #2
-- AGGTCA ---- AGCTCA --
Subject #3
-- AGCTCA ---- AGCTCA --
Two alleles (G and C)
Three genotypes (GG, GC, CC)
Encode genotypes (0, 1, 2)
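A tiny sketch (illustrative, not from the slides) of this encoding step; the function name and minor-allele choice are assumptions:

```python
# Minimal sketch: encode a SNP genotype as the count of the minor allele,
# yielding the three states 0, 1, 2 used throughout the talk.
def encode_genotype(genotype: str, minor_allele: str = "C") -> int:
    """Map a two-character genotype string, e.g. 'GC', to 0/1/2."""
    return sum(1 for allele in genotype if allele == minor_allele)

print(encode_genotype("GG"), encode_genotype("GC"), encode_genotype("CC"))  # 0 1 2
```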
“One SNP at a time approach”
“Complex systems approach”
Epistasis
[Diagram: the "one SNP at a time" approach tests SNP1, SNP2, and SNP3 against Disease independently; the "complex systems" approach maps the three SNPs jointly to Disease (epistasis).]
Main Effects
Penetrance table for a purely epistatic two-locus model (SNP 1 genotypes AA/Aa/aa, SNP 2 genotypes BB/Bb/bb), with marginal penetrances at the edges:

            AA(.25)  Aa(.5)  aa(.25) | Marginal
  BB(.25)      0        1       0    |   0.5
  Bb(.5)       1        0       1    |   0.5
  bb(.25)      0        1       0    |   0.5
  Marginal    0.5      0.5     0.5

All marginal penetrances equal 0.5, so neither SNP shows a main effect. By contrast, a single-SNP main effect looks like:

            AA(.25)  Aa(.5)  aa(.25)
  SNP X        0        0       1
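As a worked check (illustrative code, not from the slides), each marginal penetrance is the genotype-frequency-weighted average of the penetrance values over the other SNP:

```python
# Marginal penetrance of each SNP 1 genotype, weighted by SNP 2 genotype frequency.
freqs = {"BB": 0.25, "Bb": 0.50, "bb": 0.25}   # SNP 2 genotype frequencies
penetrance = {                                  # rows: SNP 2; columns: SNP 1 (AA, Aa, aa)
    "BB": [0, 1, 0],
    "Bb": [1, 0, 1],
    "bb": [0, 1, 0],
}
for col, name in enumerate(["AA", "Aa", "aa"]):
    marginal = sum(freqs[row] * penetrance[row][col] for row in penetrance)
    print(name, marginal)  # each prints 0.5: no main effect at SNP 1
```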
Genetic Heterogeneity
[Figure: a sample population partitioned into subgroups G1–G8, each of which may associate a different genetic model with the same disease.]
• Evidence of GH in…
– Autism
– Schizophrenia
– Breast Cancer
– Alzheimer disease
– Tuberous sclerosis
– Cystic Fibrosis
– Asthma
– And many, many others…
Learning Classifier Systems
Urbanowicz, R.J. & Moore, J.H. (2009). Learning Classifier Systems: A Complete Introduction, Review, and Roadmap. Journal of Artificial Evolution and Applications.
[Diagram: the canonical Michigan-style LCS learning cycle. Detectors encode the Environment's current instance; classifiers in the Population [P] (Classifier_n = Condition : Action :: Parameter(s)) that match it form the Match Set [M]; a Prediction Array drives Action Selection; Effectors perform the chosen Action and the Environment returns a Reward, which Credit Assignment distributes over the Action Set [A] (and, in multi-step problems, the previous [A]_{t-1}). Covering and the Genetic Algorithm make up the Discovery Component; matching and action selection make up the Performance Component; credit assignment makes up the Reinforcement Component, per the adopted Learning Strategy.]
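A minimal sketch of the matching and covering steps at the heart of this cycle (illustrative Python; the names, the supervised action argument, and the wildcard probability are assumptions, not the authors' implementation):

```python
import random

WILDCARD = "#"

def matches(condition, instance):
    """A condition matches when every non-# position equals the instance's value."""
    return all(c == WILDCARD or c == x for c, x in zip(condition, instance))

def cover(instance, action, p_wild=0.5):
    """Covering: build a rule from the instance, generalizing positions at random."""
    condition = [WILDCARD if random.random() < p_wild else x for x in instance]
    return {"condition": condition, "action": action, "accuracy": 1.0}

def form_match_set(population, instance, action):
    """Collect all matching classifiers; invoke covering if none match."""
    match_set = [r for r in population if matches(r["condition"], instance)]
    if not match_set:
        new_rule = cover(instance, action)
        population.append(new_rule)
        match_set.append(new_rule)
    return match_set

pop = []
m = form_match_set(pop, [0, 2, 1, 0, 1, 1, 0, 2, 2, 0], action=1)  # triggers covering
```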
Application domains:
• Autonomous Robotics
• Complex Adaptive Systems
• Function Approximation
• Classification
• Data Mining
Effective Generalization
• Effective generalization: maximizing rule generality while preserving accuracy (testing acc. = training acc.).
• Examples of LCS generalization mechanisms:
– Generalization Hypothesis (Wilson 1995)
– Action Set GA & Subsumption (Wilson 1998)
– Hierarchical Selection operator (Bacardit & Garrell 2002)
– Windowing (Bacardit et al. 2004)
– Minimum Description Length (Bacardit & Garrell 2007)
– Ensemble LCS (Gao et al. 2005)
• Noisy problem domains:
– Over-fitting becomes a particularly important problem.
– Classification noise: < 100% testing accuracy is possible.
– Attribute noise: attributes that contribute nothing to testing accuracy.
Example general rules (10 attributes, genotypes encoded 0/1/2, # = don't care):
1 0 # # # 0 0 # # # - 1
0 2 # # 1 # # # # # - 0
Hypothesis
• Given: in a noisy problem, LCSs with accuracy-based fitness will tend to over-fit (learn structure idiosyncratic to the training dataset).
• Consider: datasets with a small sample size are particularly susceptible to this (online learning repeatedly revisits the same samples).
• If: we probabilistically incorporate variable noise into the incoming training instances, then in every epoch of learning the Michigan LCS is exposed to a randomly permuted version of the original dataset, artificially inflating the effective sample size.
• Hypothesis: the incorporation of low levels of random classification noise will discourage over-fitting and promote effective generalization.
RAIN: Random Artificial Incorporation of Noise
[Diagram: a training instance from the Environment (0210110220, class 1) is randomly permuted (to 0210120220: one attribute's genotype flipped) before being presented to the Population [P] for Match Set [M] and Correct Set [C] formation.]
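A minimal sketch of the permutation step shown above (illustrative; the per-attribute noise model and state set are assumptions consistent with the example):

```python
import random

def rain_permute(instance, p_current, states=(0, 1, 2)):
    """With probability p_current per attribute, replace the genotype with a
    different random state before the instance is presented to the LCS."""
    noisy = list(instance)
    for i, value in enumerate(noisy):
        if random.random() < p_current:
            noisy[i] = random.choice([s for s in states if s != value])
    return noisy

# e.g. rain_permute([0,2,1,0,1,1,0,2,2,0], 0.01) usually returns the instance
# unchanged, occasionally flipping one attribute, as in the example above.
```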
Temporal Models
• Pm = maximum permutation probability
• Pc = current permutation probability
• Im = maximum iteration
• Ic = current iteration
Four schedules for Pc as a function of Ic (plausible forms are sketched below):
• Uniform
• Linear
• Inverse Linear
• Gaussian
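The slides name these four schedules without giving formulas; the forms below are plausible assumptions for how Pc could vary with Ic:

```python
import math

def p_current(model, p_max, i_cur, i_max, sigma=0.2):
    """Assumed schedule forms for Pc; only the schedule names come from the slides."""
    t = i_cur / i_max                      # training progress in [0, 1]
    if model == "uniform":
        return p_max                       # constant noise level
    if model == "linear":
        return p_max * t                   # noise ramps up over training
    if model == "inverse_linear":
        return p_max * (1.0 - t)           # noise ramps down over training
    if model == "gaussian":
        return p_max * math.exp(-((t - 0.5) ** 2) / (2 * sigma ** 2))  # peaks mid-run
    raise ValueError(model)
```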
Power Estimation
Example rule population (10 attributes; # = don't care):
1 0 # # # 0 0 # # # - 1
0 2 # # 1 # # # # # - 0
# # # 1 0 2 1 # # # - 0
# # 1 0 # # # # # # - 0
0 1 # # # # # # 2 # - 1
# # 0 0 # # # 0 0 1 - 1
1 2 0 0 1 # # # # 2 - 1
1 1 1 # # # # # # # - 0
2 # 2 # # 1 # 0 # # - 0
# 1 # # # # # # # # - 1
# # 1 1 # # # 2 # # - 0
Per-attribute count of '#' across the population: 5 5 5 6 8 8 9 8 9 9. Attributes specified most often (the lowest counts) are the candidate predictive attributes.
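The counts line can be reproduced directly from the population above (illustrative code; rules transcribed from the slide):

```python
rules = [
    "10###00###", "02##1#####", "###1021###", "##10######",
    "01######2#", "##00###001", "12001####2", "111#######",
    "2#2##1#0##", "#1########", "##11###2##",
]
# Tally how often each attribute position is generalized ('#') across the rules;
# frequently specified attributes (low counts) are the candidate predictive ones.
counts = [sum(r[i] == "#" for r in rules) for i in range(10)]
print(counts)  # [5, 5, 5, 6, 8, 8, 9, 8, 9, 9]
```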
Model 1 (penetrance of SNP 1 × SNP 2):

            SNP 2
            2  1  0
  SNP 1  2  0  1  0
         1  1  0  1
         0  0  1  0

Model 2 (penetrance of SNP 3 × SNP 4):

            SNP 4
            2  1  0
  SNP 3  2  0  0  0
         1  0  0  0
         0  0  0  1

Each simulated dataset embeds Model 1 in attributes 1–2 and Model 2 in attributes 3–4; the remaining attributes are noise.
Power at the CV level: the underlying model must be detected in > 50% of the 10 CV runs.
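A small sketch of that tally (illustrative; the data layout is an assumption):

```python
def cv_level_success(cv_detections):
    """True when more than half of the 10 CV runs detect the model's attributes."""
    return sum(cv_detections) > len(cv_detections) / 2

def power(replicates):
    """Fraction of replicate datasets whose CV runs clear the > 50% threshold."""
    return sum(cv_level_success(r) for r in replicates) / len(replicates)

print(power([[True] * 7 + [False] * 3, [True] * 4 + [False] * 6]))  # 0.5
```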
Targeted RAIN
• Idea: strategically and automatically avoid destructively adding noise to attributes that are more likely to be important to classification.
• Probabilistically targets attributes that are more frequently generalized (rather than specified).
• Pc = Pm (no temporal schedule).
• Two implementations (weight lists generated differently):
– Targeted Generality (TG)
– Targeted Fitness-Weighted Generality (TFWG)
• Noise generation (same for both implementations; see the sketch below):
– First epoch: no noise.
– Weight list recalculated at the end of each epoch.
– Subtract the minimum weight in the list from all values in the list.
– Determine the number of attributes to be permuted (random < Pm).
– Choose each attribute by roulette wheel selection.
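A sketch of the TG weight list and roulette-wheel attribute choice, following the steps above (illustrative; the exact weight definition and per-slot draw are assumptions):

```python
import random

def tg_weights(population, n_attrs):
    """Targeted Generality: weight each attribute by how often it is
    generalized ('#') across the rule population, minus the list minimum."""
    weights = [sum(rule["condition"][i] == "#" for rule in population)
               for i in range(n_attrs)]
    low = min(weights)
    return [w - low for w in weights]

def choose_attributes(weights, p_max):
    """For each attribute slot, permute with probability p_max; pick which
    attribute to permute by roulette-wheel selection over the weights."""
    chosen, total = [], sum(weights)
    for _ in range(len(weights)):
        if random.random() < p_max and total > 0:
            spin, acc = random.uniform(0, total), 0.0
            for i, w in enumerate(weights):
                acc += w
                if spin <= acc:
                    chosen.append(i)
                    break
    return chosen
```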
Experimental Evaluation
• UCS
– Iterations = [50000, 100000, 200000, 500000]
– Micro pop. size = 1600
– Other parameters at their defaults
– Tracked: training acc., testing acc., generality, macro pop. size, run time
– Power to find both or a single underlying model
– Pm = 0.001, 0.01, 0.05, 0.1
• Each Dataset
– Main-effect free
– 2 × two-locus epistatic interactions
– 20 attributes
– Balanced
– Minor allele frequencies = 0.2
– Heritability = 0.2
– Mix ratio = 50:50
– Sample sizes = [200, 400, 800, 1600]
– 20 replicates
• 80 simulated datasets × 10-fold CV = 800 runs of UCS
Conclusions & Future Work
• Incorporation of RAIN with equal attribute probability is ineffective.
• Targeted RAIN reduced over-fitting (a significant decrease in training accuracy without reducing testing accuracy).
• Improvements in power (not statistically significant) suggest that RAIN may improve UCS's ability to identify predictive attributes.
• Future work:
– Try RAIN on datasets with much larger numbers of attributes.
– Combine targeted RAIN with temporal models.
– Explore a larger range of Pm values.
– Implement RAIN with an adaptive Pm.
Acknowledgements
Jason Moore & Nicholas A. Sinnott-Armstrong
Funding Support
NIH: AI59694, LM009012, LM010098
William H. Neukom 1964 Institute for Computational Science at Dartmouth College
Quaternary Rule Representation
[#, 0, 1, 2]
Each attribute in a rule condition takes one of four states: a specific genotype value (0, 1, or 2) or the wildcard #.