
Page 1

Introduction to Graphical Models

Brookes Vision Lab Reading Group

Page 2

Graphical Models

• Goal: build a complex system out of simpler parts

• The combined system should be consistent

• Parts are combined using probability

• Undirected – Markov random fields

• Directed – Bayesian networks

Page 3

Overview

• Representation

• Inference

• Linear Gaussian Models

• Approximate inference

• Learning

Page 4

Representation

• Causality: Sprinkler “causes” wet grass

Page 5

Conditional Independence

• Each node is independent of its ancestors given its parents
• P(C,S,R,W) = P(C) P(S|C) P(R|C,S) P(W|C,S,R)
  = P(C) P(S|C) P(R|C) P(W|S,R)

• Space required for n binary nodes
  – O(2^n) without factorization
  – O(n 2^k) with factorization, k = maximum fan-in
  – e.g. for this network (n = 4, k = 2): 2^4 − 1 = 15 free parameters for the full joint vs. 1 + 2 + 2 + 4 = 9 for the four CPTs

Page 6

Inference

• Pr(S=1|W=1) = Pr(S=1,W=1)/Pr(W=1)

= 0.2781/0.6471

= 0.430

• Pr(R=1|W=1) = Pr(R=1,W=1)/Pr(W=1)

= 0.4581/0.6471

= 0.708
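These posteriors can be reproduced by brute-force enumeration of the joint. A minimal sketch follows; the CPT values are not on the slide, so the numbers below assume the standard values from Kevin Murphy's sprinkler example, which give exactly the figures quoted:

```python
import itertools

# CPTs for the cloudy/sprinkler/rain/wet-grass network.
# These values are assumed (Kevin Murphy's standard example);
# they reproduce the numbers quoted on this slide.
P_C = {0: 0.5, 1: 0.5}                    # P(C)
P_S = {0: {1: 0.5}, 1: {1: 0.1}}          # P(S=1 | C)
P_R = {0: {1: 0.2}, 1: {1: 0.8}}          # P(R=1 | C)
P_W = {(0, 0): 0.0, (0, 1): 0.9,          # P(W=1 | S, R)
       (1, 0): 0.9, (1, 1): 0.99}

def joint(c, s, r, w):
    """P(C=c, S=s, R=r, W=w) via the factored form."""
    ps = P_S[c][1] if s else 1 - P_S[c][1]
    pr = P_R[c][1] if r else 1 - P_R[c][1]
    pw = P_W[(s, r)] if w else 1 - P_W[(s, r)]
    return P_C[c] * ps * pr * pw

def prob(**fixed):
    """Marginal probability of a partial assignment, summing out the rest."""
    total = 0.0
    for c, s, r, w in itertools.product([0, 1], repeat=4):
        assign = {'C': c, 'S': s, 'R': r, 'W': w}
        if all(assign[k] == v for k, v in fixed.items()):
            total += joint(c, s, r, w)
    return total

print(prob(S=1, W=1) / prob(W=1))   # 0.4298 -> Pr(S=1|W=1)
print(prob(R=1, W=1) / prob(W=1))   # 0.7079 -> Pr(R=1|W=1)
```

Enumeration like this is exponential in the number of nodes, which is what motivates the variable elimination slide below.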

Page 7

Explaining Away

• S and R “compete” to explain W=1

• Given W=1, S and R become conditionally dependent

• Pr(S=1|R=1,W=1) = 0.1945: learning that it rained lowers the posterior on the sprinkler from 0.430 to 0.1945
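Using the same prob helper from the sketch above (with the same assumed CPT values), the explaining-away number drops out directly:

```python
# Conditioning on R=1 as well lowers the sprinkler posterior sharply:
print(prob(S=1, R=1, W=1) / prob(R=1, W=1))   # 0.1945 -> Pr(S=1|R=1,W=1)
```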

Page 8

Inference

• In general, for query variables X_q and evidence X_e:
  P(X_q | X_e) = P(X_q, X_e) / P(X_e)
  where P(X_q, X_e) is obtained by summing the joint over the remaining hidden variables, and
  where P(X_e) = Σ_{x_q} P(X_q = x_q, X_e)

Page 9

Inference

• Variable elimination

• Choosing optimal ordering – NP hard

• Greedy methods work well

• Computing several marginals

• Dynamic programming avoids redundant computation

• Sound familiar?
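As a concrete sketch of variable elimination on the same network (CPT values again assumed as above), the sums can be pushed inside the products with einsum; eliminating C first means no table larger than 2x2 is ever built:

```python
import numpy as np

# CPTs as arrays, indexed [parent..., child]; same assumed values as above.
pC = np.array([0.5, 0.5])                   # pC[c]
pS = np.array([[0.5, 0.5], [0.9, 0.1]])     # pS[c, s]
pR = np.array([[0.8, 0.2], [0.2, 0.8]])     # pR[c, r]
pW = np.zeros((2, 2, 2))                    # pW[s, r, w]
pW[:, :, 1] = [[0.0, 0.9], [0.9, 0.99]]
pW[:, :, 0] = 1 - pW[:, :, 1]

# Eliminate C first (innermost sum), then S and R: the sums are pushed
# inside the products instead of materializing the full joint table.
phi = np.einsum('c,cs,cr->sr', pC, pS, pR)  # sum out c -> factor over (s, r)
pW_marg = np.einsum('sr,srw->w', phi, pW)   # sum out s, r
print(pW_marg[1])                            # Pr(W=1) = 0.6471
```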

Page 10

Bayes Balls for Conditional Independence

Page 11

A Unifying (Re)View

Basic Model: the Linear Gaussian Model (LGM), after Roweis and Ghahramani's “A Unifying Review of Linear Gaussian Models”

• Continuous-State LGM: FA, SPCA, PCA, LDS
• Discrete-State LGM: Mixture of Gaussians, VQ, HMM

Page 12

Basic Model

• State of the system is a k-vector x (unobserved)
• Output of the system is a p-vector y (observed)
• Often k << p

• Basic model:
  x_{t+1} = A x_t + w
  y_t = C x_t + v

• A is the k x k transition matrix
• C is the p x k observation matrix
• w ~ N(0, Q)
• v ~ N(0, R)

• Noise processes are essential: without them the state evolution would be deterministic
• Noise means are zero w.l.o.g. (a nonzero mean can be absorbed into the state)
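As a concrete illustration, here is a minimal simulation of the basic model; all dimensions and parameter values below are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_lgm(A, C, Q, R, mu1, Q1, T):
    """Sample a trajectory from x_{t+1} = A x_t + w, y_t = C x_t + v."""
    k, p = A.shape[0], C.shape[0]
    xs, ys = np.zeros((T, k)), np.zeros((T, p))
    x = rng.multivariate_normal(mu1, Q1)                         # x_1 ~ N(mu1, Q1)
    for t in range(T):
        xs[t] = x
        ys[t] = C @ x + rng.multivariate_normal(np.zeros(p), R)  # y_t = C x_t + v
        x = A @ x + rng.multivariate_normal(np.zeros(k), Q)      # x_{t+1} = A x_t + w
    return xs, ys

# Toy 2-state, 3-observation example; parameter values are illustrative.
A = np.array([[0.99, -0.1], [0.1, 0.99]])     # slow rotation in state space
C = rng.standard_normal((3, 2))
xs, ys = simulate_lgm(A, C, Q=np.eye(2) * 0.01, R=np.eye(3) * 0.1,
                      mu1=np.zeros(2), Q1=np.eye(2), T=100)
```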

Page 13

Degeneracy in the Basic Model

• Structure in Q can be moved into A and C
• W.l.o.g. Q = I
• R cannot be restricted in the same way, since the y_t are observed
• Components of x can be reordered arbitrarily
• The ordering is fixed by the norms of the columns of C
• x_1 ~ N(µ_1, Q_1)
• A and C are assumed to have rank k
• Q, R, Q_1 are assumed to be of full rank

Page 14

Probability Computation

• P(x_{t+1} | x_t) = N(A x_t, Q; x_{t+1})

• P(y_t | x_t) = N(C x_t, R; y_t)

• P({x_1,...,x_T}, {y_1,...,y_T}) = P(x_1) ∏_{t=1}^{T-1} P(x_{t+1} | x_t) ∏_{t=1}^{T} P(y_t | x_t)

• The negative log probability is therefore a sum of quadratic (Mahalanobis distance) terms, one per factor, plus constants
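As a sketch, the negative log probability can be evaluated factor by factor; each Gaussian log-density contributes one Mahalanobis term (the function below is illustrative and follows the trajectory convention from the simulation sketch on the previous slide):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def neg_log_prob(xs, ys, A, C, Q, R, mu1, Q1):
    """-log P({x}, {y}): one Gaussian term per factor of the joint."""
    nlp = -mvn.logpdf(xs[0], mu1, Q1)                  # -log P(x_1)
    for t in range(len(xs) - 1):
        nlp -= mvn.logpdf(xs[t + 1], A @ xs[t], Q)     # -log P(x_{t+1} | x_t)
    for t in range(len(ys)):
        nlp -= mvn.logpdf(ys[t], C @ xs[t], R)         # -log P(y_t | x_t)
    return nlp
```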

Page 15

Inference

• Given model parameters {A, C, Q, R, µ_1, Q_1}
• Given observations y
• What can be inferred about the hidden states x?
• Total likelihood: P({y_1,...,y_T}), obtained by integrating the hidden states out of the joint

• Filtering: P(x_t | y_1, ..., y_t)
• Smoothing: P(x_t | y_1, ..., y_T)
• Partial smoothing: P(x_t | y_1, ..., y_{t+t'})
• Partial prediction: P(x_t | y_1, ..., y_{t-t'})
• All of these appear as intermediate values in the recursive methods for computing the total likelihood.
Page 16

Learning

• Unknown parameters θ = {A, C, Q, R, µ_1, Q_1}
• Given observations y
• Log-likelihood: L(θ) = log P(y | θ) = log ∫ P(x, y | θ) dx
• For any distribution Q(x) over the hidden states,
  L(θ) ≥ ∫ Q(x) log [P(x, y | θ) / Q(x)] dx = F(Q, θ)
• F(Q, θ) is the free energy, a lower bound on the log-likelihood

Page 17

EM algorithm

• Alternate between maximizing F(Q, θ) w.r.t. Q (E-step) and w.r.t. θ (M-step).

• The E-step sets Q(x) = P(x | y, θ), so F = L at the beginning of each M-step
• The E-step does not change θ
• Therefore the likelihood never decreases.

Page 18

Continuous-State LGM

• Static data modeling
  – No temporal dependence
  – Factor analysis, SPCA, PCA

• Time-series modeling
  – Time ordering of the data is crucial
  – LDS (Kalman filter models)

Page 19

Static Data Modelling

• A = 0
• x = w
• y = C x + v

• x ~ N(0, Q)

• y ~ N(0, C Q C' + R)
• Degeneracy in the model: only C Q C' + R is identifiable, so take Q = I
• Learning: EM
  – R must be restricted (otherwise C = 0 with R equal to the sample covariance is a trivial solution)

• Inference: compute the posterior P(x | y), as sketched below
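For the static model, inference is just Gaussian conditioning. A minimal sketch with Q = I; the gain β = C'(CC' + R)^{-1} is the standard conditional-Gaussian result, not code from the talk:

```python
import numpy as np

def static_posterior(y, C, R):
    """P(x | y) for the static model with Q = I: x = w, y = C x + v."""
    k = C.shape[1]
    beta = C.T @ np.linalg.inv(C @ C.T + R)   # k x p gain matrix
    mean = beta @ y                            # posterior mean E[x | y]
    cov = np.eye(k) - beta @ C                 # posterior covariance
    return mean, cov
```

FA, SPCA and PCA below differ only in how R is restricted; the same posterior formula applies in each case.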

Page 20

Factor Analysis

• Restrict R to be diagonal
• Q = I
• x: the factors
• C: the factor loading matrix
• R: the uniquenesses
• Learning: EM, or quasi-Newton optimization
• Inference: Gaussian conditioning as above, with diagonal R

Page 21

SPCA

• R = εI
• ε: a single global noise level
• Columns of C span the principal subspace
• Learning: EM algorithm
• Inference: Gaussian conditioning as above, with R = εI

Page 22

PCA

• R = lim_{ε→0} εI

• Learning
  – Diagonalize the sample covariance of the data
  – The leading k eigenvalues and eigenvectors define C
  – EM finds the leading eigenvectors without explicit diagonalization

• Inference
  – The observation noise becomes infinitesimal
  – The posterior collapses to a single point, the least-squares projection x = (C'C)^{-1} C' y
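A sketch of both routes: learning by diagonalizing the sample covariance, and zero-noise inference collapsing to the least-squares projection (the data and dimensions are illustrative):

```python
import numpy as np

def pca_fit(Y, k):
    """Leading-k principal directions via the sample covariance."""
    Yc = Y - Y.mean(axis=0)                    # center the data
    S = Yc.T @ Yc / len(Y)                     # sample covariance (p x p)
    evals, evecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    return evecs[:, ::-1][:, :k]               # top-k eigenvectors -> columns of C

def pca_infer(y, C):
    """Zero-noise limit: the posterior collapses to (C'C)^{-1} C' y."""
    return np.linalg.solve(C.T @ C, C.T @ y)

Y = np.random.default_rng(1).standard_normal((500, 5))
C = pca_fit(Y, k=2)
x = pca_infer(Y[0], C)
```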

Page 23

Linear Dynamical Systems

• Inference – Kalman filter

• Smoothing – RTS recursions

• Learning – EM algorithm
  – C known: Shumway and Stoffer, 1982
  – All parameters unknown: Ghahramani and Hinton, 1995
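A minimal Kalman filter sketch for the filtering distribution P(x_t | y_1, ..., y_t); this is the standard predict/update recursion, not code from the talk:

```python
import numpy as np

def kalman_filter(ys, A, C, Q, R, mu1, Q1):
    """Forward recursions for P(x_t | y_1..y_t)."""
    mu, V = mu1, Q1
    means, covs = [], []
    for t, y in enumerate(ys):
        if t > 0:                              # time update (predict)
            mu, V = A @ mu, A @ V @ A.T + Q
        S = C @ V @ C.T + R                    # innovation covariance
        K = V @ C.T @ np.linalg.inv(S)         # Kalman gain
        mu = mu + K @ (y - C @ mu)             # measurement update
        V = V - K @ C @ V
        means.append(mu); covs.append(V)
    return means, covs
```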

Page 24

Discrete-State LGM

• x_{t+1} = WTA[A x_t + w]

• y_t = C x_t + v

• x_1 = WTA[N(µ_1, Q_1)]

• WTA[·] is the winner-take-all nonlinearity: it outputs the unit vector with a 1 at the largest component of its argument, so the state is always one of k discrete values e_j
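A small sketch of the winner-take-all nonlinearity and one state transition (parameter values would be supplied by the model):

```python
import numpy as np

rng = np.random.default_rng(0)

def wta(v):
    """Winner-take-all: unit vector with a 1 at the largest component."""
    e = np.zeros_like(v)
    e[np.argmax(v)] = 1.0
    return e

def step(x, A, Q):
    """One discrete-state transition: x_{t+1} = WTA[A x_t + w]."""
    k = len(x)
    return wta(A @ x + rng.multivariate_normal(np.zeros(k), Q))
```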

Page 25

Discrete-State LGM

• Static data modeling
  – Mixture of Gaussians
  – VQ

• Time-series modeling
  – HMM

Page 26

Static Data Modelling

• A = 0
• x = WTA[w]
• w ~ N(µ, Q)
• y = C x + v

• π_j = P(x = e_j)

• A nonzero µ gives nonuniform priors π_j

• y | (x = e_j) ~ N(C_j, R)

• C_j is the jth column of C

Page 27

Mixture of Gaussians

• Mixing coefficient of cluster j: π_j

• Means: the columns C_j

• Covariance: R, shared across clusters

• Learning: EM (corresponds to maximum-likelihood competitive learning)

• Inference: compute the responsibilities P(x = e_j | y), as sketched below
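A compact EM sketch for this mixture, with a single covariance R shared across clusters as in the model above (initialization and iteration count are arbitrary choices):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def mog_em(Y, k, n_iter=50, seed=0):
    """EM for a mixture of Gaussians with a shared covariance R."""
    rng = np.random.default_rng(seed)
    n, p = Y.shape
    C = Y[rng.choice(n, k, replace=False)].T    # means: columns C_j (p x k)
    R = np.cov(Y.T)                             # shared covariance
    pi = np.full(k, 1.0 / k)                    # mixing coefficients
    for _ in range(n_iter):
        # E-step: responsibilities P(x = e_j | y_i)
        logp = np.stack([np.log(pi[j]) + mvn.logpdf(Y, C[:, j], R)
                         for j in range(k)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update pi, C, R from the expected assignments
        Nj = resp.sum(axis=0)
        pi = Nj / n
        C = (Y.T @ resp) / Nj
        diff = Y[:, :, None] - C[None, :, :]    # n x p x k residuals
        R = np.einsum('nk,npk,nqk->pq', resp, diff, diff) / n
    return pi, C, R
```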

Page 28

Vector Quantization

• Observation noise becomes infinitesimal

• The inference problem is solved by the nearest-neighbor (1-NN) rule

• Euclidean distance when R = εI

• Mahalanobis distance w.r.t. R for general (unscaled) R

• Posterior collapses to closest cluster

• Learning with EM = batch version of k-means
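A batch k-means sketch showing this limit: a hard 1-NN E-step followed by a mean-update M-step (it assumes no cluster goes empty):

```python
import numpy as np

def kmeans(Y, k, n_iter=50, seed=0):
    """Batch k-means: EM for the mixture model in the zero-noise limit."""
    rng = np.random.default_rng(seed)
    C = Y[rng.choice(len(Y), k, replace=False)]             # initial codebook
    for _ in range(n_iter):
        d = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # squared Euclidean
        z = d.argmin(axis=1)                                # hard 1-NN E-step
        C = np.stack([Y[z == j].mean(axis=0)                # M-step: new means
                      for j in range(k)])
    return C, z
```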

Page 29

Time-series modelling

Page 30

HMM

• Transition matrix T

• T_{i,j} = P(x_{t+1} = e_j | x_t = e_i)

• For every T there exist A and Q such that the WTA dynamics realize those transition probabilities

• Filtering: forward recursions

• Smoothing: forward-backward algorithm

• Learning: EM (called Baum-Welch re-estimation)

• MAP state sequence: the Viterbi algorithm
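A sketch of the forward recursions for filtering; the observation model is passed in as a log-density, which for this discrete-state LGM would be the Gaussian N(C_j, R) for each state:

```python
import numpy as np

def forward(ys, T, pi, log_obs):
    """Filtering by the scaled forward recursions.

    T[i, j] = P(x_{t+1}=e_j | x_t=e_i), pi[j] = P(x_1=e_j),
    log_obs(y)[j] = log P(y | x=e_j)  (a Gaussian N(C_j, R) in this model).
    """
    alpha = pi * np.exp(log_obs(ys[0]))
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()                      # normalize to avoid underflow
    for y in ys[1:]:
        alpha = (alpha @ T) * np.exp(log_obs(y))
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return alpha, loglik                      # P(x_T | y_1..y_T), log P(y)
```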