BOA (Bayesian Optimization Algorithm) for Dummies
Hsuan Lee @ NTUEE
References
Pelikan, M. (2005). Hierarchical Bayesian Optimization Algorithm. StudFuzz 170, 31–48. (BOA)
Pelikan, M. and Goldberg, D. E. (2006). Hierarchical Bayesian Optimization Algorithm. Studies in Computational Intelligence (SCI) 33, 63–90. (hBOA)
Cooper, G. F. and Herskovits, E. H. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9:309–347.
Heckerman, D., Geiger, D., and Chickering, D. M. (1994). Learning Bayesian networks: The combination of knowledge and statistical data. Technical Report MSR-TR-94-09, Microsoft Research, Redmond, WA.
Friedman, N. and Goldszmidt, M. (1999). Learning Bayesian networks with local structure. In Jordan, M. I. (Ed.), Graphical models, pp. 421–459. MIT, Cambridge, MA.
2010.10.07
Generating Offspring
Mutation (Asexual Reproduction): Use ONE fit chromosome. Change it slightly to form an offspring. E.g. ES
Crossover (Sexual Reproduction): Use a PAIR of fit chromosomes. Take parts of each to form an offspring. E.g. sGA, DSMGA
EDA (Group Reproduction): Use a GROUP of fit chromosomes to build a model. Sample the model to generate an offspring. E.g. DSMGA(?) for SpinGlass, BOA
Bayesian Optimization Algorithm Pseudo Code
Bayesian Optimization Algorithm (BOA)
t ← 0;
generate initial population P(0);
while (not done) {
    SELECT population of promising solutions S(t);
    BUILD Bayesian network (BN) B(t) from S(t);
    SAMPLE B(t) to generate offspring O(t);
    incorporate O(t) into P(t);  // REPLACEMENT
    t ← t + 1;
}
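The loop above can be sketched in Python. This is only a skeleton under loud assumptions: the model here is a vector of independent per-bit frequencies (a univariate stand-in, not a Bayesian network), the fitness function `onemax`, truncation selection, and all parameter values are illustrative choices that do not come from the slides.

```python
import random

def onemax(bits):
    """Toy fitness (an assumption for this sketch): number of 1 bits."""
    return sum(bits)

def build_model(selected):
    """Stand-in for BUILD: per-bit frequency of 1s, no edges between bits."""
    n = len(selected[0])
    return [sum(ind[i] for ind in selected) / len(selected) for i in range(n)]

def sample_model(model, count, rng):
    """Stand-in for SAMPLE: draw each bit independently from the model."""
    return [[1 if rng.random() < p else 0 for p in model] for _ in range(count)]

def eda_loop(n_bits=20, pop_size=100, generations=30, seed=0):
    rng = random.Random(seed)
    # t <- 0; generate initial population P(0)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # SELECT promising solutions S(t): truncation selection, best half
        population.sort(key=onemax, reverse=True)
        selected = population[: pop_size // 2]
        # BUILD model B(t) from S(t)
        model = build_model(selected)
        # SAMPLE B(t) to generate offspring O(t)
        offspring = sample_model(model, pop_size - len(selected), rng)
        # REPLACEMENT: offspring replace the worst half of P(t)
        population = selected + offspring
    return max(population, key=onemax)

best = eda_loop()
print(onemax(best))
```

Swapping `build_model` and `sample_model` for a Bayesian network learner and sampler would turn this skeleton into BOA proper; the surrounding loop stays the same.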
Bayesian Optimization Algorithm
[Flowchart: Initialization, then a loop of Selection, Learning Bayesian Network, Sampling Bayesian Network, Replacement, and Evaluation, repeated Until Termination]
Learning Bayesian Network: Bayesian Network
A BN is a directed acyclic graph (DAG).
An edge A → B in a Bayesian network implies that the occurrence of A has an effect on the probability of B's occurrence. A is a parent of B, and B is conditionally dependent on A.
Two nodes are assumed to be conditionally independent if there is no edge between them.
The joint probability therefore factorizes over the nodes, each conditioned only on its parents:
\[ p(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \prod_{i=1}^{n} p\bigl(X_i = x_i \mid X_j = x_j \text{ for each } X_j \text{ which is a parent of } X_i\bigr) \]
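Learning Bayesian Network: Bayesian Network (example)
Nodes: Rain, Sprinkler, Wet Grass

Rain
T     F
0.2   0.8

Sprinkler (given Rain R)
R  |  T     F
F  |  0.4   0.6
T  |  0.01  0.99

Wet Grass (given Sprinkler S and Rain R)
S  R  |  T     F
F  F  |  0.0   1.0
F  T  |  0.8   0.2
T  F  |  0.9   0.1
T  T  |  0.99  0.01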
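As a small worked check of the factorization, the sketch below multiplies the conditional probabilities of the Rain, Sprinkler, Wet Grass example. The numbers are the ones from the slide's tables; the dictionary layout and the function names are illustrative assumptions.

```python
# CPTs of the sprinkler network, copied from the slide (True = T, False = F).
P_RAIN_T = 0.2
# P(Sprinkler = T | Rain)
P_SPRINKLER_T = {False: 0.4, True: 0.01}
# P(WetGrass = T | Sprinkler, Rain)
P_WET_T = {(False, False): 0.0, (False, True): 0.8,
           (True, False): 0.9, (True, True): 0.99}

def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(rain, sprinkler, wet):
    """p(R, S, W) = p(R) * p(S | R) * p(W | S, R)."""
    return (bernoulli(P_RAIN_T, rain)
            * bernoulli(P_SPRINKLER_T[rain], sprinkler)
            * bernoulli(P_WET_T[(sprinkler, rain)], wet))

# No rain, sprinkler on, grass wet: 0.8 * 0.4 * 0.9
print(joint(rain=False, sprinkler=True, wet=True))

# The eight joint probabilities sum to 1 (up to floating-point rounding).
values = (True, False)
total = sum(joint(r, s, w) for r in values for s in values for w in values)
print(total)
```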
Learning Bayesian Network: Learning a Bayesian Network from data
Structure (B): To learn the structure of a BN, we need
    a scoring metric (or a set of scoring metrics) on structures
    a search procedure
Parameters (Θ, θ): Given the structure of a BN, learning the parameters is straightforward, e.g. by Maximum Likelihood (ML).
Learning parameters is easy, but learning the best BN structure is NP-Complete.
Learning Bayesian Network: Scoring Metrics (evaluations of a BN structure)
Bayesian Metrics: Determine the likelihood of a structure given the observed data and some prior knowledge. E.g. Bayesian Dirichlet Metric (BD)
Minimum Description Length Metrics: Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model. E.g. Bayesian Information Criterion (BIC)
We'll come back to the scoring metrics later.
Learning Bayesian Network: The Search Procedure for a good Bayesian Network
It can be shown that finding the best Bayesian network is NP-Complete. But the best BN is not required in BOA; a good BN is enough.
A greedy algorithm can be used to find a good BN.
Greedy Algorithm of network construction
initialize the network B (an empty network or the network of the last generation);
done ← false;
while (not done) {
    O ← all simple graph operations applicable to B;
    IF there exists an operation in O that improves score(B) THEN
        op ← operation from O that improves score(B) the most;
        apply op to B;
    ELSE
        done ← true;
}
return B;
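The greedy construction can be sketched as follows. The scoring metric is left as a plug-in function; the toy `score` used at the end (distance to a hand-picked target graph) is purely an assumption to make the example self-checking and is not a real metric such as BD or BIC. The graph representation (a frozenset of directed edges) and all function names are likewise illustrative.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; failure means a cycle."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        roots = [v for v in remaining if not any(b == v for (_, b) in edges)]
        if not roots:
            return False
        remaining -= set(roots)
        edges = {(a, b) for (a, b) in edges if a in remaining}
    return True

def neighbours(nodes, edges):
    """All graphs reachable by one edge addition, removal, or reversal."""
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                  # edge removal
            yield (edges - {(a, b)}) | {(b, a)}     # edge reversal
        elif (b, a) not in edges:
            yield edges | {(a, b)}                  # edge addition

def greedy_structure(nodes, score, start=frozenset()):
    current = frozenset(start)
    while True:
        candidates = [frozenset(g) for g in neighbours(nodes, current)
                      if is_acyclic(nodes, g)]
        best = max(candidates, key=score, default=current)
        if score(best) <= score(current):
            return current        # no operation improves the score: done
        current = best            # apply the best operation

# Toy usage: the score rewards closeness to a hypothetical target graph.
nodes = ["Rain", "Wet Road", "Car Crash"]
target = frozenset({("Rain", "Wet Road"), ("Wet Road", "Car Crash")})
score = lambda g: -len(g ^ target)
learned = greedy_structure(nodes, score)
print(sorted(learned))
```

In BOA the `score` argument would be a structure metric evaluated on the selected population, and `start` could be the previous generation's network.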
Learning Bayesian Network: Simple Graph Operations of Bayesian Network
Edge Addition, Edge Removal, Edge Reversal
[Figure: example Bayesian network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
Learning Bayesian Network: Learning Parameters
Maximum Likelihood (ML)
[Figure: example Bayesian network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
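With the structure fixed, the maximum-likelihood estimate of each conditional probability is simply a relative frequency in the selected population. A minimal sketch, assuming binary variables stored as dictionaries; the four data rows and the names `ml_cpt`, `Rain`, and `WetRoad` are made up for illustration.

```python
from collections import Counter

def ml_cpt(data, child, parents):
    """Estimate P(child = 1 | parents) as relative frequencies in `data`."""
    seen, ones = Counter(), Counter()
    for row in data:
        key = tuple(row[p] for p in parents)   # observed parent configuration
        seen[key] += 1
        ones[key] += row[child]
    return {key: ones[key] / seen[key] for key in seen}

# Hypothetical selected population over two of the figure's variables.
data = [
    {"Rain": 1, "WetRoad": 1},
    {"Rain": 1, "WetRoad": 1},
    {"Rain": 1, "WetRoad": 0},
    {"Rain": 0, "WetRoad": 0},
]
cpt = ml_cpt(data, child="WetRoad", parents=["Rain"])
print(cpt)
```

Parent configurations that never occur in the data get no entry here; a real implementation would need a fallback for them, for example a uniform distribution.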
Sampling Bayesian Network: Generate Offspring with a Bayesian Network
1. Given a Bayesian network with structure & parameters
2. Perform a topological sort on the Bayesian network, which is a directed acyclic graph (DAG)
3. Assign values to the new chromosome bit by bit in topologically sorted order, according to the parameters
[Figure: example Bayesian network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
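The three steps can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). Since the Rain/Radar figure carries no parameters in this transcript, the sketch reuses the Rain, Sprinkler, Wet Grass probabilities from the earlier example; the dictionary layout and names are assumptions.

```python
import random
from graphlib import TopologicalSorter  # Python 3.9+

# Step 1: structure (node -> list of parents) and parameters P(node = True | parents).
parents = {"Rain": [], "Sprinkler": ["Rain"], "WetGrass": ["Sprinkler", "Rain"]}
p_true = {
    "Rain": {(): 0.2},
    "Sprinkler": {(False,): 0.4, (True,): 0.01},
    "WetGrass": {(False, False): 0.0, (False, True): 0.8,
                 (True, False): 0.9, (True, True): 0.99},
}

def sample(parents, p_true, rng):
    # Step 2: topological sort, so every parent is assigned before its children.
    order = TopologicalSorter(parents).static_order()
    values = {}
    # Step 3: assign each variable using the row selected by its parents' values.
    for node in order:
        key = tuple(values[p] for p in parents[node])
        values[node] = rng.random() < p_true[node][key]
    return values

rng = random.Random(0)
offspring = [sample(parents, p_true, rng) for _ in range(5)]
for individual in offspring:
    print(individual)
```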
Bayesian Optimization Algorithm
[Flowchart: Initialization, then a loop of Selection, Learning Bayesian Network, Sampling Bayesian Network, Replacement, and Evaluation, repeated Until Termination]
Scoring Metrics Revisited: Minimum Description Length Metrics
Evaluate the structure according to the number of bits required to store the model and the data compressed according to the model
Bayesian Information Criterion (BIC)
B: Bayesian structure
H(A|B): conditional entropy of A given B
N: population size
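The formula itself did not survive in this transcript. For binary variables, the form of BIC usually given in the BOA literature is shown below as a reconstruction; the symbols n (number of bits), X_i (bit i), and Π_i (the parents of X_i) are introduced here and are not defined on the slide.

```latex
\[
\mathrm{BIC}(B) = \sum_{i=1}^{n} \left( -H(X_i \mid \Pi_i)\, N \;-\; 2^{|\Pi_i|}\, \frac{\log_2 N}{2} \right)
\]
```

The first term rewards structures that compress the data well; the second penalizes model complexity, since a binary node with |Π_i| parents needs 2^{|Π_i|} parameters.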
Scoring Metrics Revisited: Bayesian Metrics
Determine the likelihood of a structure given the observed data and some prior knowledge
Bayesian Dirichlet Metric (BD)
B: Bayesian structure
D: observed data
ξ: prior information
Nijk: number of observed data points that have value k on bit i with the parent string j
N'ijk: prior knowledge
Γ: Gamma function
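The BD formula is missing from this transcript. In the notation of the legend above, the standard form from Heckerman et al. is reproduced here as a reconstruction; N_ij and N'_ij are shorthand introduced for the sums over k.

```latex
\[
p(D, B \mid \xi) = p(B \mid \xi) \prod_{i=1}^{n} \prod_{j}
\frac{\Gamma(N'_{ij})}{\Gamma(N'_{ij} + N_{ij})}
\prod_{k} \frac{\Gamma(N'_{ijk} + N_{ijk})}{\Gamma(N'_{ijk})},
\qquad N_{ij} = \sum_{k} N_{ijk}, \quad N'_{ij} = \sum_{k} N'_{ijk}
\]
```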
Scoring Metrics Revisited: Bayesian Dirichlet Metric (BD)
In BOA, N'ijk is set to 1 and N'ij = Σk N'ijk. This reduced form of the BD metric is called the K2 metric.
Physical meaning: all outcomes k of a given parental setup have the same probability at the beginning.
The term p(B|ξ) can be set either to a constant or set to favor simpler structures.
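A sketch of one node's contribution to the K2 metric, computed in log space with `math.lgamma` to avoid overflow. It assumes the standard BD form with N'ijk = 1, so that N'ij equals the number of values a bit can take; the function name, the tuple-based data layout, and the toy data are illustrative, and the structure prior p(B|ξ) is left out (treated as a constant).

```python
from collections import Counter
from math import lgamma, log

def log_k2_term(data, i, parents, arity=2):
    """Log of node i's factor in the K2 metric (N'_ijk = 1, N'_ij = arity)."""
    n_ijk = Counter()
    for row in data:
        j = tuple(row[p] for p in parents)      # parent string j
        n_ijk[(j, row[i])] += 1                 # value k on bit i
    n_ij = Counter()
    for (j, _), count in n_ijk.items():
        n_ij[j] += count
    total = 0.0
    for j, count in n_ij.items():
        total += lgamma(arity) - lgamma(arity + count)
        for k in range(arity):
            total += lgamma(1 + n_ijk[(j, k)])  # minus lgamma(1), which is 0
    return total

# Toy data: bit 1 always copies bit 0, so making bit 0 a parent should score higher.
data = [(0, 0)] * 4 + [(1, 1)] * 4
with_parent = log_k2_term(data, i=1, parents=[0])
without_parent = log_k2_term(data, i=1, parents=[])
print(with_parent, without_parent)
```

Parent strings that never occur in the data contribute a factor of 1 and are skipped by the loop.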
Scoring Metrics Revisited: Decomposability of scoring metrics
In both metrics, the score of a structure only changes locally after performing a simple graph operation (by greedy search)
Only the terms of nodes whose parent sets change are affected: one particular term (one particular i) for an edge addition or removal, two for an edge reversal
This greatly simplifies the computation of the greedy search
Scoring Metrics Revisited: Problems exist in both scoring metrics
In BIC, the model-complexity term restricts the complexity of the Bayesian structure, resulting in oversimplified structures
In BD, maximizing marginal probability leads to over-fitting, resulting in overly complicated structures
A combination of both can produce favorable results
hBOA
Hierarchical Bayesian Optimization Algorithm
Hierarchical BOA (hBOA)
The hierarchical version of BOA, used to solve nearly decomposable and hierarchical problems
Three important challenges must be considered in the design of solvers for difficult hierarchical problems:
Decomposition: Bayesian network
Chunking: representing partial solutions at each level compactly, so that the algorithm can effectively process partial solutions of higher order. Using local structures
Diversity Maintenance: RTR (restricted tournament replacement)
Hierarchical BOA (hBOA): Local Structure

Full Table
ABC  |  probabilities of the two values of X
000  |  0.4  0.6
001  |  0.4  0.6
010  |  0.4  0.6
011  |  0.4  0.6
100  |  0.5  0.5
101  |  0.6  0.4
110  |  0.3  0.7
111  |  0.6  0.4

Decision Tree, in hBOA
A=0: 0.4  0.6
A=1
    C=1: 0.6  0.4
    C=0
        B=1: 0.3  0.7
        B=0: 0.5  0.5
Hierarchical BOA (hBOA): Benefits of building local structure
Simplifies the model: In the case shown, 8 parameters have to be maintained for the full conditional probability table, but only 4 for the decision tree
Generalizes the parental condition: In the case shown, with the full table, an occurrence of ABCX=1010 contributes nothing to predicting ABCX=1110 in the future; with the local structure, 1010 DOES predict 1110, because both fall into the same leaf (A=1, C=1)
[Figure: decision tree splitting on A, then C (under A=1), then B (under C=0)]
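The equivalence between the two representations in this example can be checked directly. The sketch below encodes the full table and the decision tree from the slides and verifies that the tree, with 4 leaves, reproduces all 8 rows; the nested-tuple encoding and the function names are assumptions made for illustration.

```python
# Full conditional probability table from the slide, keyed by (A, B, C).
FULL_TABLE = {
    (0, 0, 0): (0.4, 0.6), (0, 0, 1): (0.4, 0.6),
    (0, 1, 0): (0.4, 0.6), (0, 1, 1): (0.4, 0.6),
    (1, 0, 0): (0.5, 0.5), (1, 0, 1): (0.6, 0.4),
    (1, 1, 0): (0.3, 0.7), (1, 1, 1): (0.6, 0.4),
}

# Internal node: (variable, subtree_if_1, subtree_if_0); leaf: probability pair.
TREE = ("A",
        ("C",
         (0.6, 0.4),                          # A=1, C=1
         ("B", (0.3, 0.7), (0.5, 0.5))),      # A=1, C=0, then B=1 / B=0
        (0.4, 0.6))                           # A=0

def lookup(tree, assignment):
    """Walk from the root to the leaf selected by the parents' values."""
    while len(tree) == 3:
        var, if_one, if_zero = tree
        tree = if_one if assignment[var] else if_zero
    return tree

def count_leaves(tree):
    return 1 if len(tree) == 2 else count_leaves(tree[1]) + count_leaves(tree[2])

for (a, b, c), row in FULL_TABLE.items():
    assert lookup(TREE, {"A": a, "B": b, "C": c}) == row
print(len(FULL_TABLE), "table rows,", count_leaves(TREE), "tree leaves")
```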
Hierarchical BOA (hBOA): Scoring Metrics (evaluations of a local structure Bi)
Bayesian Metrics
In hBOA, the prior p(B|ξ) is set to favor simpler models.
Hierarchical BOA (hBOA): Scoring Metrics (evaluations of a local structure Bi)
Minimum Description Length Metrics
Hierarchical BOA (hBOA): Search procedure for local structure (decision tree)
Greedy Algorithm of local structure (decision tree) construction
initialize the structure Bi (a one-node tree that represents all parental strings);
// top-down
Branch (Bi, Πi);
return Bi;

Branch (T, P)
IF there exist elements in P THEN
    choose π ∈ P that best splits the decision tree T;
    Left Child = Branch (Tπ=1, P − π);
    Right Child = Branch (Tπ=0, P − π);
    // bottom-up
    IF the score given by Tπ=1 and Tπ=0 is worse than T THEN
        merge Tπ=1 and Tπ=0 back into T;
ELSE
    Left Child = Right Child = NIL;
return T;
Hierarchical BOA (hBOA): Search procedure for local structure (decision tree), demonstration
[Figure: decision tree used in the demonstration]
A=1
    C=1: B=1, B=0
    C=0: B=1, B=0
A=0
    B=1: C=1, C=0
    B=0: C=1, C=0
Hierarchical BOA (hBOA): Modified network construction for hBOA
Greedy Algorithm of network with local structure construction
initialize the network B (an empty network or the network of the last generation);
done ← false;
while (not done) {
    O ← all simple graph operations applicable to B;
    optimize every structure in O with local structure;
    IF there exists an operation in O that improves score(B) THEN
        op ← operation from O that improves score(B) the most;
        apply op to B;
    ELSE
        done ← true;
}
return B;
Hierarchical BOA (hBOA): Sampling a Bayesian network with local structure
1. Topological sort
2. Assign values according to local structures, instead of full conditional probability tables
[Figure: decision tree splitting on A, then C (under A=1), then B (under C=0)]
[Figure: example Bayesian network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
Some Thoughts about BOA/hBOA
Use a causal Bayesian network to solve an acausal problem
Are arrows really needed?
"Markovian" Optimization Algorithm, MOA?
Adopt the idea of the Bayesian Dirichlet Metric.
[Figure: example Bayesian network with nodes Rain, Radar, Wet Road, Speed, Car Crash]
End of Presentation
Thank You! Thank Yu!