Probabilistic
Graphical Models
Lecture 2 – Bayesian Networks Representation
CS/CNS/EE 155
Andreas Krause
Announcements
Will meet in Steele 102 for now
Still looking for another 1–2 TAs.
Homework 1 will be out soon. Start early!! ☺
Multivariate distributions
Instead of a random variable, we have a random vector
X(ω) = [X1(ω),…,Xn(ω)]
Specify P(X1=x1,…,Xn=xn)
Suppose all Xi are Bernoulli variables.
How many parameters do we need to specify? (2^n − 1)
Marginal distributions
Suppose we have the joint distribution P(X1,…,Xn)
Then the marginal is obtained by summing out the other variables:
P(X1 = x1) = ∑_{x2,…,xn} P(x1, x2,…,xn)
If all Xi binary: How many terms? (2^{n−1})
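Marginalizing out of an explicit joint table can be sketched as follows; the joint below is a hypothetical distribution over three binary variables, not one from the lecture.

```python
from itertools import product

# Hypothetical joint P(X1, X2, X3) over binary variables, stored as a
# table with one entry per assignment (2^3 = 8 terms).
joint = {
    assignment: p
    for assignment, p in zip(
        product([0, 1], repeat=3),
        [0.02, 0.08, 0.10, 0.05, 0.15, 0.20, 0.25, 0.15],
    )
}
assert abs(sum(joint.values()) - 1.0) < 1e-9  # legal distribution

def marginal(joint, var_index, value):
    """P(X_i = value): sum the joint over all assignments to the other variables."""
    return sum(p for assignment, p in joint.items() if assignment[var_index] == value)

# P(X1 = 0) sums 2^(n-1) = 4 of the 8 joint entries.
print(marginal(joint, 0, 0))  # 0.02 + 0.08 + 0.10 + 0.05 = 0.25
```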
Rules for random variables
Chain rule: P(X1,…,Xn) = P(X1) P(X2 | X1) ⋯ P(Xn | X1,…,Xn−1)
Bayes’ rule: P(X | Y) = P(Y | X) P(X) / P(Y)
Key concept: Conditional independence
Events α, β are conditionally independent given γ if
P(α ∩ β | γ) = P(α | γ) P(β | γ)
Random variables X and Y cond. indep. given Z if
for all x ∈ Val(X), y ∈ Val(Y), z ∈ Val(Z)
P(X = x, Y = y | Z = z) = P(X =x | Z = z) P(Y = y| Z= z)
If P(Y=y |Z=z)>0, that’s equivalent to
P(X = x | Z = z, Y = y) = P(X = x | Z = z)
Similarly for sets of random variables X, Y, Z
We write: P ⊨ (X ⊥ Y | Z)
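The definition above can be checked numerically on any joint table; a sketch, using a hypothetical joint built so that X ⊥ Y | Z holds by construction.

```python
from itertools import product

# Build a joint satisfying X ⊥ Y | Z by construction:
# P(x, y, z) = P(z) P(x | z) P(y | z)   (all numbers hypothetical).
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.3, 1: 0.7}, 1: {0: 0.8, 1: 0.2}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}

joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product([0, 1], repeat=3)
}

def cond_indep(joint, tol=1e-9):
    """Check P(x, y | z) == P(x | z) P(y | z) for all x, y, z with P(z) > 0."""
    for z in [0, 1]:
        pz = sum(p for (x, y, zz), p in joint.items() if zz == z)
        if pz == 0:
            continue
        for x, y in product([0, 1], repeat=2):
            pxy = joint[(x, y, z)] / pz
            px = sum(joint[(x, yy, z)] for yy in [0, 1]) / pz
            py = sum(joint[(xx, y, z)] for xx in [0, 1]) / pz
            if abs(pxy - px * py) > tol:
                return False
    return True

print(cond_indep(joint))  # True
```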
Why is conditional independence useful?
P(X1,…,Xn) = P(X1) P(X2 | X1) ⋯ P(Xn | X1,…,Xn−1)
How many parameters? (2^n − 1 for binary variables)
Now suppose X1,…,Xi−1 ⊥ Xi+1,…,Xn | Xi for all i
Then
P(X1,…,Xn) = P(X1) P(X2 | X1) P(X3 | X2) ⋯ P(Xn | Xn−1)
How many parameters? (2n − 1 for binary variables)
Can we compute P(Xn) more efficiently?
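Yes: under the chain assumption, P(Xn) can be computed by pushing a running marginal forward, one variable at a time, instead of summing exponentially many joint terms. A sketch with hypothetical transition probabilities:

```python
# Under P(X1,…,Xn) = P(X1) ∏ P(Xi | Xi−1), compute P(Xn) in O(n) steps
# rather than summing 2^(n-1) joint entries. Numbers are hypothetical.
n = 20
p_x1 = [0.6, 0.4]                       # P(X1)
p_trans = [[0.9, 0.1], [0.3, 0.7]]      # p_trans[prev][cur] = P(Xi=cur | Xi-1=prev)

marginal = p_x1
for _ in range(n - 1):
    # P(Xi = cur) = sum_prev P(Xi-1 = prev) * P(Xi = cur | Xi-1 = prev)
    marginal = [
        sum(marginal[prev] * p_trans[prev][cur] for prev in [0, 1])
        for cur in [0, 1]
    ]

print(marginal)  # P(Xn); the two entries sum to 1
```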
Properties of Conditional Independence
Symmetry
X ⊥ Y | Z ⇒ Y ⊥ X | Z
Decomposition
X ⊥ Y,W | Z ⇒ X ⊥ Y | Z
Contraction
(X ⊥ Y | Z) ∧ (X ⊥ W | Y,Z) ⇒ X ⊥ Y,W | Z
Weak union
X ⊥ Y,W | Z ⇒ X ⊥ Y | Z,W
Intersection
(X ⊥ Y | Z,W) ∧ (X ⊥ W | Y,Z) ⇒ X ⊥ Y,W | Z
(holds only if the distribution is positive, i.e., P > 0)
Key questions
How do we specify distributions that satisfy particular
independence properties?
⇒ Representation
How can we exploit independence properties for
efficient computation?
⇒ Inference
How can we identify independence properties
present in data?
⇒ Learning
We will now see an example: Bayesian networks
Key idea
Conditional parameterization
(instead of joint parameterization)
For each RV Xi, specify P(Xi | XA) for a set XA of RVs
Then use the chain rule to get a joint parameterization
We have to be careful to guarantee a legal distribution…
Example: 2 variables
Example: 3 variables
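The worked board examples are not in the transcript; a minimal sketch of conditional parameterization for two binary variables, with hypothetical numbers: specify P(X1) and P(X2 | X1), then the chain rule gives the joint.

```python
# Conditional parameterization of two binary variables (numbers hypothetical).
p_x1 = {0: 0.7, 1: 0.3}                                   # P(X1): 1 free parameter
p_x2_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}  # P(X2 | X1): 2 free parameters

# Chain rule: P(X1, X2) = P(X1) P(X2 | X1)
joint = {
    (x1, x2): p_x1[x1] * p_x2_given_x1[x1][x2]
    for x1 in [0, 1]
    for x2 in [0, 1]
}
# 1 + 2 = 3 parameters, same as 2^2 - 1 = 3 here; the savings appear with
# more variables, when each variable conditions on only a few others.
print(sum(joint.values()))  # 1.0: always a legal distribution
```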
Example: Naïve Bayes models
Class variable Y
Evidence variables X1,…,Xn
Assume that XA ⊥ XB | Y
for all disjoint subsets XA, XB of {X1,…,Xn}
Conditional parameterization:
Specify P(Y)
Specify P(Xi | Y)
Joint distribution: P(Y, X1,…,Xn) = P(Y) ∏i P(Xi | Y)
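The Naïve Bayes factorization can be sketched directly from the two sets of CPDs; all probabilities below are hypothetical (a binary class and three binary features).

```python
from itertools import product

# Naïve Bayes: P(y, x1,…,xn) = P(y) ∏ P(xi | y). All CPDs hypothetical.
p_y = {0: 0.5, 1: 0.5}
# p_x_given_y[i][y][x] = P(Xi = x | Y = y)
p_x_given_y = [
    {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}},
    {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}},
    {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}},
]

def nb_joint(y, xs):
    """P(y, x1,…,xn) as a product of one class prior and n feature CPDs."""
    p = p_y[y]
    for i, x in enumerate(xs):
        p *= p_x_given_y[i][y][x]
    return p

def nb_posterior(xs):
    """P(y | x1,…,xn) by Bayes' rule: proportional to the joint."""
    scores = {y: nb_joint(y, xs) for y in [0, 1]}
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

total = sum(nb_joint(y, xs) for y in [0, 1] for xs in product([0, 1], repeat=3))
print(total)  # 1.0: the factorization defines a legal joint
```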
Today: Bayesian networks
Compact representation of distributions over large
number of variables
(Often) allows efficient exact inference (computing
marginals, etc.)
Example: the HailFinder network has 56 variables with ~3 states each,
so the full joint has ~10^26 terms; enumerating it would take
more than 10,000 years on top supercomputers. (Demo: JavaBayes applet)
Causal parameterization
Graph with directed edges from (immediate) causes
to (immediate) effects
[Figure: Earthquake → Alarm ← Burglary; Alarm → JohnCalls, Alarm → MaryCalls]
Bayesian networks
A Bayesian network structure is a directed acyclic graph G, where each vertex s of G is interpreted as a random variable Xs (with unspecified distribution)
A Bayesian network (G,P) consists of
a BN structure G and…
…a set of conditional probability distributions (CPDs) P(Xs | PaXs), where PaXs are the parents of node Xs, such that
(G,P) defines the joint distribution
P(X1,…,Xn) = ∏s P(Xs | PaXs)
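The definition can be sketched on the Earthquake/Burglary network from the previous slide; the CPD numbers below are hypothetical, not the textbook's.

```python
from itertools import product

# Joint for the alarm network: P(E,B,A,J,M) = P(E) P(B) P(A|E,B) P(J|A) P(M|A).
# All probabilities are hypothetical.
p_e = 0.02                     # P(E = 1)
p_b = 0.01                     # P(B = 1)
p_a_given_eb = {               # P(A = 1 | E, B)
    (1, 1): 0.97, (1, 0): 0.30, (0, 1): 0.95, (0, 0): 0.001,
}
p_j_given_a = {1: 0.9, 0: 0.05}   # P(J = 1 | A)
p_m_given_a = {1: 0.7, 0: 0.01}   # P(M = 1 | A)

def bernoulli(p1, v):
    return p1 if v == 1 else 1 - p1

def joint(e, b, a, j, m):
    """One CPD factor per variable, each conditioned on its parents."""
    return (bernoulli(p_e, e) * bernoulli(p_b, b)
            * bernoulli(p_a_given_eb[(e, b)], a)
            * bernoulli(p_j_given_a[a], j)
            * bernoulli(p_m_given_a[a], m))

total = sum(joint(*vals) for vals in product([0, 1], repeat=5))
print(total)  # 1.0: a legal joint from 1+1+4+2+2 = 10 parameters, not 2^5 - 1 = 31
```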
Bayesian networks
Can every probability distribution be described by a BN? (Yes: a fully connected DAG yields the chain-rule factorization, which represents any distribution)
Representing the world using BNs
Want to make sure that I(P) ⊆ I(P’)
Need to understand CI properties of BN (G,P)
[Figure: true distribution P’ over variables s1,…,s12 with conditional independences I(P’), represented by a Bayes net (G,P) with independences I(P)]
Which kind of CI does a BN imply?
[Figure: E → A ← B, A → J, A → M]
Local Markov Assumption
Each BN structure G is associated with the following
conditional independence assumptions:
X ⊥ NonDescendantsX | PaX
We write Iloc(G) for these conditional independences
Suppose (G,P) is a Bayesian network with joint distribution P
Does it hold that Iloc(G) ⊆ I(P)?
If this holds, we say G is an I-map for P.
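The local Markov assumptions of a given DAG can be enumerated mechanically; a sketch using the Earthquake/Burglary graph from the earlier slide.

```python
# Enumerate Iloc(G): each variable is independent of its non-descendants
# given its parents. Graph: E → A ← B, A → J, A → M.
edges = [("E", "A"), ("B", "A"), ("A", "J"), ("A", "M")]
nodes = ["E", "B", "A", "J", "M"]
parents = {n: {u for u, v in edges if v == n} for n in nodes}
children = {n: {v for u, v in edges if u == n} for n in nodes}

def descendants(n):
    """All nodes reachable from n along directed edges."""
    out, stack = set(), list(children[n])
    while stack:
        c = stack.pop()
        if c not in out:
            out.add(c)
            stack.extend(children[c])
    return out

def local_markov(n):
    """Return (X, non-descendants of X minus PaX, PaX) for X ⊥ ND | Pa."""
    nd = set(nodes) - {n} - descendants(n) - parents[n]
    return (n, sorted(nd), sorted(parents[n]))

for n in nodes:
    print(local_markov(n))
# e.g. J is independent of {B, E, M} given its parent {A}
```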
Factorization Theorem
[Figure: true distribution P over variables s1,…,s12 and a Bayes net (G,P)]
Theorem: Iloc(G) ⊆ I(P), i.e., G is an I-map of P (independence map),
if and only if the true distribution P can be represented exactly as a Bayes net (G,P), i.e.,
P(X1,…,Xn) = ∏i P(Xi | PaXi)
Proof: I-Map to factorization
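The proof on this slide was worked on the board; a standard sketch, assuming the variables are indexed in a topological ordering of G:

```latex
\begin{align*}
P(X_1,\dots,X_n)
  &= \prod_{i=1}^{n} P(X_i \mid X_1,\dots,X_{i-1})
     && \text{(chain rule)} \\
  &= \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}_{X_i})
     && \text{(local Markov assumption)}
\end{align*}
```

The second step uses that, in a topological ordering, every predecessor of Xi is a non-descendant and PaXi ⊆ {X1,…,Xi−1}; so Xi ⊥ {X1,…,Xi−1} \ PaXi | PaXi follows from Iloc(G) by decomposition, and the extra conditioning variables can be dropped.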
The general case
Defining a Bayes Net
Given random variables and known conditional
independences
Pick ordering X1,…,Xn of the variables
For each Xi
Find a minimal subset A ⊆ {X1,…,Xi−1} such that Xi ⊥ X¬A | A,
where ¬A = {X1,…,Xi−1} \ A
Specify / learn the CPD P(Xi | A)
Ordering matters a lot for compactness of
representation! More later this course.
Adding edges doesn’t hurt
Theorem:
Let G be an I-Map for P, and G’ be derived from G by
adding an edge. Then G’ is an I-Map of P
(G’ is strictly more expressive than G)
Proof idea: adding an edge only enlarges parent sets; each local Markov assumption of G’ then follows from those of G via weak union and decomposition, which hold in any distribution, so Iloc(G’) ⊆ I(P).
Additional conditional independencies
BN specifies joint distribution through conditional
parameterization that satisfies Local Markov Property
But we also talked about additional properties of CI
Weak Union, Intersection, Contraction, …
Which additional CI does a particular BN specify?
All CI assertions that can be derived from Iloc(G) through these algebraic operations
What you need to know
Bayesian networks
Local Markov property
I-Maps
Factorization Theorem
Tasks
Subscribe to the mailing list:
https://utils.its.caltech.edu/mailman/listinfo/cs155
Read Koller & Friedman, Sections 3.1–3.3
Form groups and think about class projects. If you
have difficulty finding a group, email Pete Trautman