Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)
-
Upload
eugenia-greer -
Category
Documents
-
view
226 -
download
6
description
Transcript of Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)
![Page 1: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/1.jpg)
Some Neat Results From Assignment 1
![Page 2: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/2.jpg)
Assignment 1:Negative Examples (Rohit)
![Page 3: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/3.jpg)
Assignment 1:Noisy Observations (Nick)
Z: true feature vectorX: noisy observationX ~ Normal(z, s2)We need to compute P(X|H)
Φ: cumulative density fnof Gaussian
![Page 4: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/4.jpg)
Assignment 1:Noisy Observations (Nick)
![Page 5: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/5.jpg)
Guidance on Assignment 3
![Page 6: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/6.jpg)
Guidance: Assignment 3 Part 1
matlab functions in statistics toolbox
betacdf, betapdf, betarnd, betastat, betafit
![Page 7: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/7.jpg)
Guidance: Assignment 3 Part 2
You will explore the role of the priors.The Weiss model showed that priors play an important role when
observations are noisy
observations don’t provide strong constraints
there aren’t many observations.
![Page 8: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/8.jpg)
Guidance: Assignment 3 Part 3
Implement model a bit like Weiss et al. (2002)Goal: infer motion (velocity) of a rigid shape from observations at two instances in time.
Assume distinctive features that make it easy to identify the location of the feature at successive times.
![Page 9: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/9.jpg)
Assignment 2 Guidance
Bx: the x displacement of the blue square (= delta x in one unit of time)
By: the y displacement of the blue squareRx: the x displacement of the red squareRy: the y displacement of the red squareThese observations are corrupted by measurement noise.
Gaussian, mean zero, std deviation σ
D: direction of motion (up, down, left, right) Assume only possibilities are one unit of motion in any direction
![Page 10: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/10.jpg)
Assignment 2: Generative Model
Same assumptions for Bx, By.
Rx conditioned on D=up isdrawn from aGaussian
![Page 11: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/11.jpg)
Assignment 2 Math
)()|()|()|()|(~),,,|(
)()|,,,(~),,,|(
)()|,,,()()|,,,(),,,|(
),,,()()|,,,(),,,|(
DPDBypDBxpDRypDRxpByBxRyRxDP
DPDByBxRyRxpByBxRyRxDP
ePeByBxRyRxpDPDByBxRyRxpByBxRyRxDP
ByBxRyRxPDPDByBxRyRxpByBxRyRxDP
e
Conditional independence
![Page 12: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/12.jpg)
Assignment 2 Implementation
)...|(~),,,|( dDrxRxpbyBybxBxryRyrxRxdDP
)()|()|()|()|(~),,,|( DPDBypDBxpDRypDRxpByBxRyRxDP
)...,;( 2ddrxGaussian
...)2/)(exp(21 22
2
dd
d
rx
Quiz: do we need worry about the Gaussian density function normalization term?
![Page 13: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/13.jpg)
Introduction To Bayes Nets
(Stuff stolen fromKevin Murphy, UBC, and
Nir Friedman, HUJI)
![Page 14: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/14.jpg)
What Do You Need To Do Probabilistic Inference In A Given Domain?
Joint probability distribution over all variables in domain
![Page 15: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/15.jpg)
Qualitative part Directed acyclic graph(DAG)• Nodes: random vars. • Edges: direct influence
Quantitative part
Set of conditional probability distributions
0.9 0.1e
be
0.2 0.8
0.01 0.990.9 0.1
bebb
e
BE P(A | E,B)Family of Alarm
Earthquake
Radio
Burglary
Alarm
Call
Compact representation of joint probability distributions via conditional independence
Together
Define a unique distribution in a factored form
)|()|(),|()()(),,,,( ACPERPEBAPEPBPRCAEBP
Bayes Nets (a.k.a. Belief Nets)
Figure from N. Friedman
![Page 16: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/16.jpg)
What Is A Bayes Net?
Earthquake
Radio
Burglary
Alarm
Call
A node is conditionally independent of itsancestors given its parents.
E.g., C is conditionally independent of R, E, and Bgiven A
Notation: C? R,B,E | A
Quiz: What sort of parameter reduction do we get?
From 25 – 1 = 31 parameters to 1+1+2+4+2=10
![Page 17: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/17.jpg)
Conditional Distributions Are Flexible
E.g., Earthquake and Burglary might have independent effectson Alarm
A.k.a. noisy-or
where pB and pE are alarm probabilitygiven burglary and earthquake alone
This constraint reduces # free parameters to 8!
Earthquake Burglary
Alarm
B E P(A|B,E)
0 0 0
0 1 pE
1 0 pB
1 1 pE+pB-pEpB
![Page 18: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/18.jpg)
Domain: Monitoring Intensive-Care Patients• 37 variables• 509 parameters …instead of 237
PCWP CO
HRBP
HREKG HRSAT
ERRCAUTERHRHISTORY
CATECHOL
SAO2 EXPCO2
ARTCO2
VENTALV
VENTLUNG VENITUBE
DISCONNECT
MINVOLSET
VENTMACHKINKEDTUBEINTUBATIONPULMEMBOLUS
PAP SHUNT
ANAPHYLAXIS
MINOVL
PVSAT
FIO2PRESS
INSUFFANESTHTPR
LVFAILURE
ERRBLOWOUTPUTSTROEVOLUMELVEDVOLUME
HYPOVOLEMIA
CVP
BP
A Real Bayes Net: Alarm
Figure from N. Friedman
![Page 19: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/19.jpg)
More Real-World Bayes Net Applications
“Microsoft’s competitive advantage lies in its expertise in Bayesian networks”-- Bill Gates, quoted in LA Times, 1996
MS Answer Wizards, (printer) troubleshootersMedical diagnosisSpeech recognition (HMMs)Gene sequence/expression analysis Turbocodes (channel coding)
![Page 20: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/20.jpg)
Why Are Bayes Nets Useful?
Factored representation may have exponentially fewer parameters than full joint
Easier inference (lower time complexity)
Less data required for learning (lower sample complexity) Graph structure supports
Modular representation of knowledge
Local, distributed algorithms for inference and learning
Intuitive (possibly causal) interpretation
Strong theory about the nature of cognition or the generative process that produces observed data Can’t represent arbitrary contingencies among variables, so theory can be rejected by data
![Page 21: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/21.jpg)
Reformulating Naïve Bayes As Graphical Model
D
Rx Ry Bx By
),,,(/),,,,(),,,|(
),,,,(),,,(
)|()|()|()|()(),,,,(
ByBxRyRxPByBxRyRxDPByBxRyRxDP
ByBxRyRxDpByBxRyRxp
DBypDBxpDRypDRxpDPByBxRyRxDp
D
Marginalizing over D
Definition of conditional probability
survive
Age Class Gender
![Page 22: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/22.jpg)
Review: Bayes Net
Nodes = random variablesLinks = expression of joint distribution
Compare to full joint distribution by chain rule
Earthquake
Radio
Burglary
Alarm
Call
![Page 23: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/23.jpg)
Bayesian Analysis
Make inferences from data using probability models about quantities we want to predict
E.g., expected age of death given 51 yr old
E.g., latent topics in document
E.g., What direction is the motion? Set up full probability model that characterizes distribution over all quantities (observed and unobserved)
incorporates prior beliefs Condition model on observed data to compute posterior distribution
1.Evaluate fit of model to data
adjust model parameters to achieve better fits
![Page 24: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/24.jpg)
Inference
• Computing posterior probabilities– Probability of hidden events given any evidence
• Most likely explanation– Scenario that explains evidence
• Rational decision making– Maximize expected utility– Value of Information
• Effect of intervention– Causal analysis
Earthquake
Radio
Burglary
Alarm
Call
Radio
Call
Figure from N. Friedman
Explaining away effect
![Page 25: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/25.jpg)
Conditional Independence
A node is conditionally independentof its ancestors given its parents.
Example?
What about conditionalindependence between variablesthat aren’t directly connected?
e.g., Earthquake and Burglary?
e.g., Burglary and Radio?
Earthquake
Radio
Burglary
Alarm
Call
![Page 26: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/26.jpg)
d-separationCriterion for deciding if nodes are conditionally independent.
A path from node u to node v is d-separated by a node z if the path matches one of these templates:
u z v
u z v
u z v
u z v
z
z
z
observed
unobserved
![Page 27: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/27.jpg)
d-separationThink about d-separation as breaking a chain.If any link on a chain is broken, the whole chain is broken
u z v
u z v
u z v
u z v
z
u
u
u
u
v
v
v
v
x z y
x z y
x z y
x z y
z
![Page 28: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/28.jpg)
d-separation Along Paths
Are u and v d-separated?
u z v
u z v
u z v
u z v
z
u vz z
u vzz
u vzz
d separated
d separated
Not d separated
![Page 29: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/29.jpg)
Conditional Independence
Nodes u and v are conditionally independent given set Z if all (undirected) paths between u and v are d-separated by Z.
E.g.,
u v
z
z
z
![Page 30: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/30.jpg)
PCWP CO
HRBP
HREKG HRSAT
ERRCAUTERHRHISTORY
CATECHOL
SAO2 EXPCO2
ARTCO2
VENTALV
VENTLUNG VENITUBE
DISCONNECT
MINVOLSET
VENTMACHKINKEDTUBEINTUBATIONPULMEMBOLUS
PAP SHUNT
ANAPHYLAXIS
MINOVL
PVSAT
FIO2
PRESS
INSUFFANESTHTPR
LVFAILURE
ERRBLOWOUTPUTSTROEVOLUMELVEDVOLUME
HYPOVOLEMIA
CVP
BP
![Page 31: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/31.jpg)
PCWP CO
HRBP
HREKG HRSAT
ERRCAUTERHRHISTORY
CATECHOL
SAO2 EXPCO2
ARTCO2
VENTALV
VENTLUNG VENITUBE
DISCONNECT
MINVOLSET
VENTMACHKINKEDTUBEINTUBATIONPULMEMBOLUS
PAP SHUNT
ANAPHYLAXIS
MINOVL
PVSAT
FIO2
PRESS
INSUFFANESTHTPR
LVFAILURE
ERRBLOWOUTPUTSTROEVOLUMELVEDVOLUME
HYPOVOLEMIA
CVP
BP
![Page 32: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/32.jpg)
Sufficiency For Conditional Independence: Markov Blanket
The Markov blanket of node u consists of the parents, children, and children’s parents of u
P(u|MB(u),v) = P(u|MB(u))
u
![Page 33: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/33.jpg)
Probabilistic Models
Probabilistic models
Directed Undirected
Graphical models
Alarm networkState-space modelsHMMsNaïve Bayes classifierPCA/ ICA
Markov Random FieldBoltzmann machineIsing modelMax-ent modelLog-linear models
(Bayesian belief nets) (Markov nets)
![Page 34: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/34.jpg)
Turning A Directed Graphical Model Into An Undirected Model Via Moralization
Moralization: connect all parents of each node and remove arrows
![Page 35: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/35.jpg)
Toy Example Of A Markov Net
X1 X2
X5
X3
X4
e.g., X1 ? X4, X5 | X2, X3Xi ? Xrest | Xnbrs
Potential function
Partition function
Maximal clique: largest subset of vertices such that each pairis connected by an edge
Clique
1 2 3 3
![Page 36: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/36.jpg)
A Real Markov Net
•Estimate P(x1, …, xn | y1, …, yn)• Ψ(xi, yi) = P(yi | xi): local evidence likelihood• Ψ(xi, xj) = exp(-J(xi, xj)): compatibility matrix
Observed pixels
Latent causes
![Page 37: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/37.jpg)
Example Of Image Segmentation With MRFs
Sziranyi et al. (2000)
![Page 38: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/38.jpg)
Graphical Models Are A Useful Formalism
E.g., feedforward neural net with noise, sigmoid belief net
Hidden layer
Input layer
Output layer
![Page 39: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/39.jpg)
Graphical Models Are A Useful Formalism
E.g., Restricted Boltzmann machine (Hinton) Also known as Harmony network (Smolensky)
Hidden units
Visible units
![Page 40: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/40.jpg)
Graphical Models Are A Useful FormalismE.g., Gaussian Mixture Model
![Page 41: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/41.jpg)
Graphical Models Are A Useful Formalism
E.g., dynamical (time varying) models in which data arrives sequentially or output is produced as a sequence
Dynamic Bayes nets (DBNs) can be used to model such time-series (sequence) data
Special cases of DBNs include Hidden Markov Models (HMMs) State-space models
![Page 42: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/42.jpg)
Hidden Markov Model (HMM)
Y1 Y3
X1 X2 X3
Y2
Phones/ words
acoustic signal
transitionmatrix
Gaussianobservations
![Page 43: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/43.jpg)
State-Space Model (SSM)/Linear Dynamical System (LDS)
Y1 Y3
X1 X2 X3
Y2
“True” state
Noisy observations
![Page 44: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/44.jpg)
Example: LDS For 2D Tracking
Q3
R1 R3R2
Q1 Q2
X1
X1 X2
X2
X1 X2
y1
y1 y2
y2
y2y1
oo
o o
sparse linear-Gaussian system
![Page 45: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/45.jpg)
Kalman Filtering(Recursive State Estimation In An LDS)
Y1 Y3
X1 X2X3
Y2
Estimate P(Xt|y1:t) from P(Xt-1|y1:t-1) and yt
•Predict: P(Xt|y1:t-1) = sXt-1 P(Xt|Xt-1) P(Xt-1|y1:t-1)•Update: P(Xt|y1:t) / P(yt|Xt) P(Xt|y1:t-1)
![Page 46: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/46.jpg)
Mike’s Project From Last Year
G
X
studenttrial
α
P
δ
problemIRT model
![Page 47: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/47.jpg)
Mike’s Project From Last Year
X
studenttrial
L0
T
τ
G S
BKT model
![Page 48: Some Neat Results From Assignment 1. Assignment 1: Negative Examples (Rohit)](https://reader035.fdocuments.net/reader035/viewer/2022062523/5a4d1ad17f8b9ab059971572/html5/thumbnails/48.jpg)
Mike’s Project From Last Year
X
γ σ
studenttrial
L0
T
τ
α
P
δ
problem
η
G S
IRT+BKT model