CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian...
Transcript of CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian...
![Page 1: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/1.jpg)
CSC 665-1: Advanced Topics in Probabilistic Graphical Models
Introduction and Course Overview
Instructor: Prof. Jason Pacheco
![Page 2: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/2.jpg)
Outline
• Motivating examples of representation
• Efficient computation on graphical models
• Overview of course topics
• Course details (attendance, grading, etc.)
![Page 3: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/3.jpg)
Why Graphical Models?
Structure simplifies both representation and computation
RepresentationComplex global phenomena arise by simpler-to-specify local interactions
ComputationInference / estimation depends only on subgraphs (e.g. dynamic programming,
belief propagation, Gibbs sampling)
![Page 4: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/4.jpg)
Why Graphical Models?
Structure simplifies both representation and computation
RepresentationComplex global phenomena arise by simpler-to-specify local interactions
ComputationInference / estimation depends only on subgraphs (e.g. dynamic programming,
belief propagation, Gibbs sampling)
We will discuss inference later, but let’s focus on representation…
![Page 5: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/5.jpg)
Outline
• Motivating examples of representation
• Efficient computation on graphical models
• Overview of course topics
• Course details (attendance, grading, etc.)
![Page 6: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/6.jpg)
Problem: Given 3D protein backbone structure, estimate orientation of every side chain molecule.
Protein Side Chain Prediction
Solution: Just physics of atomic interaction. Easy, right!?
Backbone + Side Chains Side Chain Rotation
![Page 7: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/7.jpg)
Protein Side Chain Prediction
Complex phenomena specified by simpler atomic interactions
Configuration
LikelihoodsGraphical Model
Nodes represent side
chain orientations
Edges represent
atomic interaction
![Page 8: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/8.jpg)
Protein Side Chain Prediction
By exploiting graphical model structure we can scale computation
to large macromolecules
[ Pacheco and Sudderth, ICML 2015 ]
![Page 9: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/9.jpg)
Pose Estimation
Problem: Estimate orientation / shape / pose of human figure from an image
PCA Shape
Graphical Model Image (Data /
Observation)
Skin Tone
Model encodes likelihood of shape / pose / image
consistency (e.g. skin color)
[ Pacheco, et al., NeurIPs 2014 ]
![Page 10: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/10.jpg)
Frame t
Frame t+1
Pose Tracking
By composing single-frame model with temporal dynamics and motion prior
we can do video tracking…
t t+1
Motion Prior
![Page 11: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/11.jpg)
Kinematic Hand Tracking
![Page 12: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/12.jpg)
Hidden Markov Models
Sequential models of discrete quantities of interest
Example: Part-of-speech Tagging:
= NP-V-Det-N-P-Det-N
= “I shot an elephant in my pajamas.”
Data
Unknowns [ Source: nltk.org ]
= b-ey-z-th-ih-er-em Bayes’ Theorem
Example: Speech Recognition
[ Source: Bishop, PRML ]
![Page 13: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/13.jpg)
Dynamical Models
Sequential models of continuous quantities of interest
Observation y
State x
Example: Nonlinear Time Series
Example: Multitarget Tracking
![Page 14: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/14.jpg)
State-Space Models
Intracortical Brain-Computer Interface
[ Milstein, Pacheco, et al., NeurIPs 2017 ]
![Page 15: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/15.jpg)
Outline
• Motivating examples of representation
• Efficient computation on graphical models
• Overview of course topics
• Course details (attendance, grading, etc.)
![Page 16: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/16.jpg)
Why Graphical Models?
Structure simplifies both representation and computation
RepresentationComplex global phenomena arise by simpler-to-specify local interactions
ComputationInference / estimation depends only on subgraphs (e.g. dynamic programming,
belief propagation, Gibbs sampling)
![Page 17: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/17.jpg)
Max-Product Belief PropagationComputation in Graphical Models
This style of computation generalizes to all graphical models…
Example algorithms
Belief propagation
Gibbs sampling
Particle filtering
Viterbi decoder for HMMs
Kalman filter (marginal inference)
Key Idea: Local computations only depend on the statistics of
the current node and neighboring interactions
![Page 18: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/18.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
Summary of
![Page 19: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/19.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 20: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/20.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 21: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/21.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 22: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/22.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 23: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/23.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 24: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/24.jpg)
Viterbi Decoder
Efficiently computes MAP estimate for
state-space model by passing messages
forward and backward along chain.
![Page 25: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/25.jpg)
Outline
• Motivating examples of representation
• Efficient computation on graphical models
• Overview of course topics
• Course details (attendance, grading, etc.)
![Page 26: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/26.jpg)
Course Overview
Variational
Inference
Advanced
Markov chain
Monte Carlo
Bayesian
Nonparametrics
Bayesian
Optimization
Bayesian Deep
Learning
Efficient methods
for approximate
posterior inference
Techniques for
obtaining
asymptotically
exact inference
while avoiding
local optima
A class of
probability models
where model
complexity is
inferred from the
data
Probabilistic
methods for global
optimization of
smooth functions
Probabilistic
uncertainty models
for deep learning
We will cover five primary topics…
![Page 27: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/27.jpg)
Variational Inference
Jensen’s Inequality
(for concave functions)
Uses Jensen’s inequality to bound quantities of inference
Variational Lower Bound
• Partition Function
• Marginal likelihood Variational Approximation
![Page 28: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/28.jpg)
Advanced Markov Chain Monte Carlo
Advanced MCMC techniques reduce sample complexity and avoid getting stuck in local energy minima
[Source: Syed et al, 2019]
Example: Parallel tempering exchange replicates across multiple MCMC chains running in (embarrassingly) parallel
![Page 29: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/29.jpg)
Bayesian Nonparametrics
Amount and nature of data drive model complexity
50 Data Points 1000 Data PointsCountably infinite
component parameters
Component
assignment
Data
Example: Dirichlet process mixture models a distribution over an infinite number of mixture components
![Page 30: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/30.jpg)
Bayesian Optimization
Global optimization of random functions: .
[Source: Ryan Adams]
![Page 31: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/31.jpg)
Bayesian Optimization
Iteratively updates distribution over function value (regression)
[Source: Ryan Adams]
![Page 32: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/32.jpg)
Bayesian Optimization
The function is well-approximated around the minimizer
[Source: Ryan Adams]
![Page 33: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/33.jpg)
Bayesian Deep Learning
Neural networks are graphical models too…
…but they are typically not probabilistic
![Page 34: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/34.jpg)
Bayesian Deep Learning
Combines deep learning with uncertainty models
Data Gaussian Mixture Model
(GMM)
GMM Structured
Variational Autoencoder[Source: Johnson et al., NIPS 2016]
![Page 35: CSC 665-1: Advanced Topics in Probabilistic Graphical Modelspachecoj/courses/csc... · Bayesian Deep Learning Efficient methods for approximate posterior inference Techniques for](https://reader033.fdocuments.net/reader033/viewer/2022050216/5f6215c7fad79e1dea5fa4d2/html5/thumbnails/35.jpg)
Outline
• Motivating examples of representation
• Efficient computation on graphical models
• Overview of course topics
• Course details (attendance, grading, etc.)
Now for the bulleted lists of stuff…