Random Networks, Graphical Models and Exchangeability

Alessandro Rinaldo

Carnegie Mellon University

joint work with Steffen Lauritzen and Kayvan Sadeghi

October 4, 2015

AMS Central Fall Sectional Meeting

Loyola University

Special Session on Algebraic Statistics and its Interactions with Combinatorics, Computation, and Network Science



Outline

Exchangeability of (infinite) networks.

A finite de Finetti theorem and the dissociated property.

Exchangeable and extendable finite networks are (mixtures of) bidirected graphical models.



Statistical Network (Random Graph) Analysis

Let $L_n$ be the set of simple labeled graphs on $n$ nodes: $|L_n| = 2^{\binom{n}{2}}$.

The nodes represent agents in some population of interest and the edges encode the relationships among them.

Statistical Network Analysis

Pose and estimate probability distributions on $L_n$ by modeling the joint occurrence of the $\binom{n}{2}$ random edges.
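As a small illustration of the count above, the following sketch enumerates all simple labeled graphs on n nodes. The representation (a graph as a frozenset of edges on nodes 0..n-1) and the function name are choices made here for illustration, not part of the talk.

```python
from itertools import combinations

def all_labeled_graphs(n):
    """Return every simple labeled graph on nodes 0..n-1 as a frozenset of edges (i, j), i < j."""
    pairs = list(combinations(range(n), 2))  # the C(n, 2) possible edges
    graphs = []
    for mask in range(2 ** len(pairs)):
        # Bit i of mask decides whether the i-th possible edge is present.
        graphs.append(frozenset(e for i, e in enumerate(pairs) if (mask >> i) & 1))
    return graphs

print(len(all_labeled_graphs(4)))  # 2**6 = 64 graphs on 4 nodes
```

The enumeration is only feasible for very small n, since the number of graphs doubles with every additional possible edge.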



Motivation: asymptotics of networks

Let $L = \bigcup_n L_n$ be the set of all finite (labeled, simple) graphs.

A statistical model for $L$ is a sequence $\{p_n\}_{n \in \mathbb{N}}$ of probability distributions, where $p_n$ is a probability distribution on $L_n$.

For $n < m$, let $p_n^m$ denote the marginal of $p_m$ over $L_n$.

Consistency and Extendability

A statistical model $\{p_n\}_{n \in \mathbb{N}}$ on $L$ is consistent when, for any pair $n < m$,

$$p_n = p_n^m. \qquad (1)$$

A probability distribution $p_n$ on $L_n$ is extendable when (1) holds for all $m > n$.

Most network models are not consistent!
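The consistency condition (1) can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: distributions are stored as dictionaries from graphs to probabilities, the marginal over L_n keeps the edges among nodes 0..n-1, and the two models used (independent edges, and a toy model that is uniform on one-edge graphs) are hypothetical examples that do not come from the slides.

```python
from itertools import combinations

def all_graphs(n):
    """All simple labeled graphs on nodes 0..n-1, each a frozenset of edges (i, j) with i < j."""
    pairs = list(combinations(range(n), 2))
    return [frozenset(e for i, e in enumerate(pairs) if (mask >> i) & 1)
            for mask in range(2 ** len(pairs))]

def marginal(p_m, n):
    """Marginal over L_n of p_m (dict: graph -> probability): keep only edges among nodes 0..n-1."""
    out = {}
    for g, prob in p_m.items():
        sub = frozenset(e for e in g if e[1] < n)  # e = (i, j) with i < j
        out[sub] = out.get(sub, 0.0) + prob
    return out

def erdos_renyi(n, q):
    """Independent edges with probability q (illustrative model, an assumption of this sketch)."""
    n_pairs = n * (n - 1) // 2
    return {g: q ** len(g) * (1 - q) ** (n_pairs - len(g)) for g in all_graphs(n)}

def one_edge_model(n):
    """Toy model: uniform on the graphs with exactly one edge (hypothetical example)."""
    support = [g for g in all_graphs(n) if len(g) == 1]
    return {g: 1.0 / len(support) for g in support}

def is_consistent(p_n, p_m, n, tol=1e-12):
    """Check equation (1): p_n equals the marginal of p_m over L_n."""
    marg = marginal(p_m, n)
    keys = set(p_n) | set(marg)
    return all(abs(p_n.get(g, 0.0) - marg.get(g, 0.0)) <= tol for g in keys)

print(is_consistent(erdos_renyi(3, 0.3), erdos_renyi(4, 0.3), 3))   # independent edges
print(is_consistent(one_edge_model(3), one_edge_model(4), 3))       # toy one-edge model
```

The independent-edge model passes the check, while the one-edge toy model fails it: after marginalizing from 4 to 3 nodes the empty graph receives positive probability, which the 3-node version of the model does not allow.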






Consistency via Exchangeability

Let $L_\infty$ be the set of (countably) infinite labeled, simple graphs.

Every probability distribution on $L_\infty$ trivially specifies one consistent model!

We impose one further restriction...

Exchangeability

A probability distribution on $L_\infty$ is exchangeable when all finite isomorphic graphs have the same probabilities.

Exchangeability is a most basic form of invariance, suitable to describe the "shape" of networks (large scale property).

Labeled vs unlabeled. The exchangeability assumption is equivalent to defining models on $U_n$, the set of unlabeled graphs on $n$ nodes, for all $n$.
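For a finite number of nodes, both the exchangeability condition and the passage from labeled to unlabeled graphs can be checked by brute force. The sketch below assumes graphs are stored as frozensets of edges on nodes 0..n-1 and that a distribution is a dictionary with one entry per labeled graph; all function names are illustrative.

```python
from itertools import combinations, permutations

def all_graphs(n):
    """All simple labeled graphs on nodes 0..n-1, each a frozenset of edges (i, j) with i < j."""
    pairs = list(combinations(range(n), 2))
    return [frozenset(e for i, e in enumerate(pairs) if (mask >> i) & 1)
            for mask in range(2 ** len(pairs))]

def relabel(g, perm):
    """Apply a node permutation to a graph, keeping each edge as a sorted pair."""
    return frozenset(tuple(sorted((perm[i], perm[j]))) for i, j in g)

def canonical(g, n):
    """Canonical representative of the isomorphism class of g (brute force over n! relabelings)."""
    return min(tuple(sorted(relabel(g, perm))) for perm in permutations(range(n)))

def unlabeled_classes(n):
    """The set U_n, represented by one canonical labeled graph per isomorphism class."""
    return {canonical(g, n) for g in all_graphs(n)}

def is_exchangeable(p, n, tol=1e-12):
    """True if every relabeling of every graph has the same probability under p."""
    return all(abs(p[g] - p[relabel(g, perm)]) <= tol
               for g in p for perm in permutations(range(n)))

print(len(unlabeled_classes(4)))  # number of unlabeled graphs on 4 nodes
```

An exchangeable distribution on labeled graphs is therefore determined by one number per element of `unlabeled_classes(n)`, which is the sense in which the model can be defined on unlabeled graphs.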











Exchangeability and graphons

Analytic representation of exchangeable distributions

The set of exchangeable distributions, with the topology of weak convergence, is a (Bauer) simplex. Denote its extreme points with $E_\infty$.

$p_\infty \in E_\infty$ if and only if, for every $n$ and $G \in L_n$,

$$p_n^\infty(G) = \int_{[0,1]^n} \prod_{(i,j) \in E(G)} f(z_i, z_j) \prod_{(i,j) \notin E(G)} \bigl(1 - f(z_i, z_j)\bigr) \, dz_1 \cdots dz_n,$$

where $f \colon [0,1]^2 \to [0,1]$ is a (measurable) symmetric function, called a graphon.

Graphons are unique up to measure preserving transformations of $[0,1]$.

Vast literature: Aldous, Hoover, Kallenberg, Diaconis and Freedman, Chayes, Borgs and Lovász, etc.

Key point: only the finite marginals of $p_\infty \in E_\infty$ can be realized. General exchangeable models are mixtures of such distributions.
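For a step-function graphon the integral in the representation above reduces to a finite sum, which makes the finite marginals easy to compute exactly. The sketch below assumes a graphon that is constant on a K x K grid of equal-width cells, passed as a symmetric matrix `blocks`; this restriction and all names are assumptions made for illustration.

```python
from itertools import combinations, product

def graphon_marginal(g, n, blocks):
    """Probability of the labeled graph g on nodes 0..n-1 under a step-function graphon.

    blocks[a][b] is the value of f on cell (a, b) of a K x K grid of equal-width
    intervals (assumed symmetric), so integrating over [0,1]^n amounts to
    averaging over the K**n assignments of nodes to intervals.
    """
    k = len(blocks)
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for z in product(range(k), repeat=n):
        term = 1.0
        for i, j in pairs:
            f = blocks[z[i]][z[j]]
            term *= f if (i, j) in g else 1.0 - f  # edge present vs absent
        total += term
    return total / k ** n

# A constant graphon gives independent edges.
single_edge = frozenset({(0, 1)})
print(graphon_marginal(single_edge, 3, [[0.3]]))
```

With a constant graphon the value agrees with the product formula for independent edges, here 0.3 times 0.7 squared up to floating-point rounding.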






Graphons and homomorphism densities

For $G \in L_n$ and $H \in L_k$ with $k \le n$, the homomorphism density of $H$ in $G$ is

$$t(H, G) = \frac{|\mathrm{hom}(H, G)|}{n^k}.$$

Convergence of graph sequences = convergence of marginal probabilities

A sequence $\{G_n\}_{n \in \mathbb{N}}$ converges if and only if, for some graphon $f$ and each $H \in L$ with $k$ nodes,

$$\lim_{n \to \infty} t(H, G_n) = \int_{[0,1]^k} \prod_{(i,j) \in E(H)} f(z_i, z_j) \, dz_1 \cdots dz_k = P(H \subseteq G'),$$

where $G'$ is a random graph distributed like $p_k^\infty$, with $p_\infty \in E_\infty$ defined by $f$.

The sequence $\{t(H, f)\}_{H \in L}$ of homomorphism densities uniquely specifies $p_\infty$.
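The homomorphism density of a small graph H in a graph G can be computed directly from its definition by enumerating all maps from the nodes of H to the nodes of G. The following is a brute-force sketch with illustrative names; it is practical only when n**k is small.

```python
from itertools import product
from fractions import Fraction

def hom_density(h_edges, k, g_edges, n):
    """t(H, G) = |hom(H, G)| / n**k for H on nodes 0..k-1 and G on nodes 0..n-1."""
    g = {frozenset(e) for e in g_edges}
    count = 0
    for phi in product(range(n), repeat=k):  # every map V(H) -> V(G)
        # phi is a homomorphism when each edge of H lands on an edge of G;
        # a collapsed edge (phi[i] == phi[j]) is never an edge of a simple graph.
        if all(frozenset((phi[i], phi[j])) in g for i, j in h_edges):
            count += 1
    return Fraction(count, n ** k)

triangle = [(0, 1), (0, 2), (1, 2)]
print(hom_density([(0, 1)], 2, triangle, 3))  # edge density of the triangle: 2/3
print(hom_density(triangle, 3, triangle, 3))  # triangle density of the triangle: 2/9
```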






Finite Exchangeability

But real networks are finite! So, what can be said about the set $P_n$ of exchangeable distributions on $L_n$?

Finite exchangeability does not yield consistent models

Finitely exchangeable probability distributions marginalize to exchangeable distributions, but need not be extendable to exchangeable distributions.

Our goal

We would like to characterize the distributions in $P_n$ that are extendable.

We seek to establish a parametric (finite dimensional) representation of all the distributions $\{p_n^\infty : p_\infty \in E_\infty\}$.











The Möbius parametrization

It turns out it is convenient to work with marginal instead of joint probabilities.

Möbius parameters

For any $p_n \in P_n$, let $z_n$ be the vector with entries indexed by subgraphs $H$ of $K_n$ without isolated nodes, of the form

$$z_n(H) = P(H \subseteq G_n),$$

where $G_n$ is the random graph with distribution $p_n$. In particular, $z_n(\emptyset) = 1$.

Invertible linear transformation:

$$p_n(G) = \sum_{H \colon E(H) \supseteq E(G)} (-1)^{|E(H)| - |E(G)|} z_n(H), \quad \forall G \in L_n.$$

By exchangeability, $z_n(H) = z_n(H')$ if $H$ and $H'$ are isomorphic.
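The map between joint probabilities and Möbius parameters, and its inverse, can be sketched as follows. Subgraphs of K_n without isolated nodes are identified here with edge sets, distributions are dictionaries over all labeled graphs, and exact rational arithmetic is used so that the round trip can be checked for equality; these are choices made for this illustration.

```python
from itertools import combinations
from fractions import Fraction

def all_graphs(n):
    """All edge sets on nodes 0..n-1, each a frozenset of edges (i, j) with i < j."""
    pairs = list(combinations(range(n), 2))
    return [frozenset(e for i, e in enumerate(pairs) if (mask >> i) & 1)
            for mask in range(2 ** len(pairs))]

def mobius_parameters(p, n):
    """z(H) = P(H is a subgraph of G) = sum of p(G) over all G whose edge set contains H."""
    graphs = all_graphs(n)
    return {h: sum(p[g] for g in graphs if h <= g) for h in graphs}

def invert(z, n):
    """p(G) = sum over H containing G of (-1)**(|E(H)| - |E(G)|) * z(H)."""
    graphs = all_graphs(n)
    return {g: sum((-1) ** (len(h) - len(g)) * z[h] for h in graphs if g <= h)
            for g in graphs}

# An arbitrary distribution on the 8 graphs with 3 nodes (weights 1/36, ..., 8/36).
p = {g: Fraction(i + 1, 36) for i, g in enumerate(all_graphs(3))}
z = mobius_parameters(p, 3)
print(z[frozenset()], invert(z, 3) == p)
```

The empty subgraph always has parameter 1, and inverting the parameters recovers the original probabilities exactly.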






A finite de Finetti Theorem

We can now describe the relationships among Möbius parameters of consistent finitely exchangeable distributions.

A de Finetti theorem for finitely exchangeable graphs

Assume $m > n$. Let $p_m$ be an exchangeable distribution on $L_m$ and $z_n^m$ the Möbius parameters corresponding to $p_n^m$. Then,

$$\max_H \Bigl| z_n^m(H) - \sum_{G \in L_m} t(H, G) \, p_m(G) \Bigr| \le 1 - \frac{(m)_n}{m^n}.$$

A similar guarantee holds for the $p_n$'s.

See also Matúš for more general statements.
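To get a feel for the size of the bound in the theorem, the sketch below evaluates its right-hand side, reading (m)_n as the falling factorial m(m-1)...(m-n+1). That reading is an assumption of this sketch (it is the usual meaning of the notation), and the function names are illustrative.

```python
from fractions import Fraction

def falling_factorial(m, n):
    """(m)_n = m * (m - 1) * ... * (m - n + 1), assumed to be the meaning of (m)_n."""
    out = 1
    for i in range(n):
        out *= m - i
    return out

def definetti_bound(n, m):
    """Right-hand side of the bound: 1 - (m)_n / m**n."""
    return 1 - Fraction(falling_factorial(m, n), m ** n)

for m in (4, 10, 100, 1000):
    print(m, float(definetti_bound(3, m)))
```

For fixed n the bound shrinks as m grows, so the Möbius parameters of the n-node marginal get close to the corresponding averaged homomorphism densities when the distribution extends to many more nodes.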



The dissociated property

Corollary (The dissociated property)

If $p_n \in P_n$ is extendable to an (infinite) exchangeable distribution in $E_\infty$, then it satisfies the dissociated property:

$$z_n^\infty(H) = z_n(H) = z_n(H_1) \, z_n(H_2)$$

for all subgraphs $H = H_1 \uplus H_2$ of $K_n$ without isolated nodes.

Extendable distributions in $P_n$ are mixtures of dissociated distributions in $P_n$.

A distribution $p_\infty$ on $L_\infty$ is in $E_\infty$ if and only if $p_n^\infty$ satisfies the dissociated property for all $n$.

Thus, a finitely exchangeable distribution must be dissociated in order to be extendable to extremal distributions in $E_\infty$.

The result is not new, but the derivation via finite exchangeability is.

...so what does a dissociated distribution in $P_n$ look like?
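A brute-force check of the dissociated property on 4 nodes may help fix ideas. The sketch assumes that the union in the corollary is over subgraphs on disjoint node sets, and it uses two illustrative distributions that are not from the slides: independent edges, and an equal mixture of two independent-edge models.

```python
from itertools import combinations

def all_graphs(n):
    """All edge sets on nodes 0..n-1, each a frozenset of edges (i, j) with i < j."""
    pairs = list(combinations(range(n), 2))
    return [frozenset(e for i, e in enumerate(pairs) if (mask >> i) & 1)
            for mask in range(2 ** len(pairs))]

def mobius_parameters(p, n):
    """z(H) = P(H is a subgraph of G), summing p over the graphs that contain H."""
    graphs = all_graphs(n)
    return {h: sum(p[g] for g in graphs if h <= g) for h in graphs}

def nodes(h):
    """Set of nodes touched by the edge set h."""
    return {v for e in h for v in e}

def is_dissociated(z, n, tol=1e-12):
    """Check z(H1 u H2) = z(H1) * z(H2) for all non-empty H1, H2 on disjoint node sets."""
    graphs = [h for h in all_graphs(n) if h]
    for h1 in graphs:
        for h2 in graphs:
            if nodes(h1).isdisjoint(nodes(h2)) and abs(z[h1 | h2] - z[h1] * z[h2]) > tol:
                return False
    return True

def erdos_renyi(n, q):
    """Independent edges with probability q (illustrative model, an assumption of this sketch)."""
    n_pairs = n * (n - 1) // 2
    return {g: q ** len(g) * (1 - q) ** (n_pairs - len(g)) for g in all_graphs(n)}

n = 4
lo, hi = erdos_renyi(n, 0.2), erdos_renyi(n, 0.8)
mixture = {g: 0.5 * lo[g] + 0.5 * hi[g] for g in lo}
print(is_dissociated(mobius_parameters(erdos_renyi(n, 0.3), n), n))
print(is_dissociated(mobius_parameters(mixture, n), n))
```

The independent-edge model is dissociated, while the mixture is exchangeable but not dissociated: a single edge has probability 0.5, yet two disjoint edges appear together with probability 0.34 rather than 0.25.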






Bidirected Graphical Models for Binary Data

Graphical models with bidirected edges, where the nodes of the graph represent the variables and the lack of (bidirected) edges among nodes signifies marginal independence among the corresponding variables.

See Richardson (2003), Drton and Richardson (2008) and Roverato, Lupparelli and La Rocca (2013).

Global Markov property for bidirected (marginal) graphical models

$A \perp\!\!\!\perp B \mid C$ when every path between $A$ and $B$ has a node outside $A \cup B \cup C$. In particular, $C$ may be empty.
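The separation rule above says that an independence is implied exactly when A and B cannot be joined by a path that stays inside the union of A, B and C. A minimal sketch of that check follows; the four-node chain used in the example is a hypothetical graph chosen for illustration, and the function name is not from any library.

```python
def bidirected_independent(edges, a, b, c=()):
    """True if the bidirected global Markov property implies A indep. of B given C.

    The independence is implied when no path joins A and B inside the
    subgraph induced by A, B and C (A, B, C are assumed disjoint).
    """
    allowed = set(a) | set(b) | set(c)
    adj = {v: set() for v in allowed}
    for u, v in edges:
        if u in allowed and v in allowed:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from A, restricted to the allowed nodes.
    seen, stack = set(a), list(a)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen.isdisjoint(b)

chain = [(1, 2), (2, 3), (3, 4)]  # hypothetical bidirected chain 1 <-> 2 <-> 3 <-> 4
print(bidirected_independent(chain, [1], [4]))          # marginal independence
print(bidirected_independent(chain, [1], [4], [2, 3]))  # conditioning on everything in between
```

On this chain the marginal independence of nodes 1 and 4 is implied, but it is no longer implied after conditioning on nodes 2 and 3, which is the reverse of what the undirected Markov property gives for the same skeleton.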




Example (Drton and Richardson, 2008)

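[Figure: two graphs on the nodes X1, X2, X3, X4; an undirected graph (left) and a bidirected graph (right).]

In the undirected graph (left), the global Markov property expresses, e.g., that

$$X_1 \perp\!\!\!\perp X_4 \mid \{X_2, X_3\},$$

whereas in the bidirected graph (right) the global Markov property expresses, e.g., that

$$X_1 \perp\!\!\!\perp X_4, \quad \{X_1, X_2\} \perp\!\!\!\perp X_4 \quad \text{and} \quad X_1 \perp\!\!\!\perp X_4 \mid X_3.$$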




Bidirected Markov models arise, e.g., as marginals of directed Markov models with unobserved variables.

Example (by S. Lauritzen)

[Figure: directed graph with observed nodes X1, X2, X3, X4 and unobserved nodes U12, U23, U24.]

In the graph above, the marginal distribution of $(X_1, X_2, X_3, X_4)$ will be bidirected Markov w.r.t. the graph

[Figure: bidirected graph on the nodes X1, X2, X3, X4.]



The canonical model for exchangeable and extendable networks

Dissociated property and bidirected graphical models

A distribution on $L_n$ is dissociated if and only if it is Markov with respect to the bidirected line graph of $K_n$.

[Figure: bidirected line graph of K4, with nodes X12, X13, X14, X23, X24, X34.]

Contrast this with the Markov graphs of Frank and Strauss (1986), which are Markov w.r.t. the undirected line graph.
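The line graph of K_n has one node per possible edge of the network, and two such nodes are adjacent when the corresponding edges share an endpoint. The sketch below builds it and lists the non-adjacent pairs, which are the pairs of edge indicators that the bidirected model declares marginally independent. Function names are illustrative.

```python
from itertools import combinations

def line_graph_of_complete_graph(n):
    """Nodes are the edges (i, j) of K_n; two nodes are adjacent when they share an endpoint."""
    nodes = list(combinations(range(n), 2))
    edges = [(e, f) for e, f in combinations(nodes, 2) if set(e) & set(f)]
    return nodes, edges

def nonadjacent_pairs(n):
    """Pairs of edges of K_n with no common endpoint (non-adjacent in the line graph)."""
    nodes, _ = line_graph_of_complete_graph(n)
    return [(e, f) for e, f in combinations(nodes, 2) if not set(e) & set(f)]

nodes, edges = line_graph_of_complete_graph(4)
print(len(nodes), len(edges))
print(nonadjacent_pairs(4))
```

For n = 4 there are 6 nodes and 12 adjacencies, and the only non-adjacent pairs are the three pairs of disjoint edges, which matches the product form of the dissociated property on two disjoint edges.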



The benefits of the Möbius parametrization

The Möbius parameters are especially convenient because they

are marginalizable: for any $m > n$,

$$z_n^m(H) = z_n(H)$$

for any subgraph $H$ of $K_n$ without isolated nodes.

express the bidirected Markov property in a simple way:

(From Drton and Richardson, 2008)

A distribution $p_n \in P_n$ is Markov with respect to the bidirected line graph of $K_n$ if and only if, for any $H = H_1 \uplus H_2 \uplus \dots \uplus H_l \in L_k$ without isolated nodes,

$$z_n(H) = z_n(H_1) \times \cdots \times z_n(H_l).$$



The Möbius parametrization

Polynomial parametrization

If a probability $p_n$ in $P_n$ is extendable to some $p_\infty \in E_\infty$, then

$$p_n(G) = \sum_{U \in U_n \colon E(G) \subseteq E(U)} (-1)^{|E(U)| - |E(G)|} \, r(G, U) \prod_{C \in \mathcal{C}(U)} z_n(C), \quad G \in L_n,$$

where $\mathcal{C}(U)$ denotes the maximal connected components of $U$ and $r(G, U)$ is the number of graphs in $L_n$ that contain $G$ as a subgraph and are isomorphic to $U \in U_n$.

This defines a smooth parametrization, described by a smooth manifold inside $P_n$ specified by polynomial equations. Its dimension is the number of connected subgraphs of all unlabeled graphs on $n$ nodes.



The curved exponential family parametrization

Exponential parametrization


$$p_n(G; \nu) = \exp\Bigl\{ \sum_{U \in U_n} \nu_U \, s(U, G) - \psi(\nu) \Bigr\}, \quad G \in L_n, \ \nu \in V \subset \mathbb{R}^{|U_n| - 1},$$

where $s(U, G)$ is the number of non-empty subgraphs of $G$ isomorphic to $U \in U_n$ and $\psi$ is a normalizing constant.

Duality: the mean value parameters are (sums of) the Möbius parameters.

The natural parameters $\nu$ are not free to vary, as they need to enforce the dissociated property. These are defined implicitly!






Example

Suppose we observe the following graph G:

[Figure: the observed graph G on the nodes 1, 2, 3, 4.]

Under the assumed bidirected model, the likelihood under the Möbius parametrization is

p(G) = z⊵ − 2z + zK4 .

and under the curved exponential model is

P(G; ν) = exp{4ν− + 5ν∧ + ν∥ + ν△ + ν + 2ν⊓ + ν⊵ − ψ(ν)}.
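The coefficients in both likelihoods can be reproduced by counting. The drawing of G did not survive in this transcript, so the sketch below assumes that G is a triangle with one pendant edge, a choice that appears consistent with the coefficients 4, 5, 1, 1, 1, 2, 1 in the exponent; the node labels and the shape names are hypothetical. Shapes are identified by their degree sequence, a shortcut that is only valid for graphs on at most 4 nodes.

```python
from collections import Counter
from itertools import combinations

# Degree sequence (non-isolated nodes) -> shape name; unambiguous on at most 4 nodes.
NAMES = {
    (1, 1): "edge",
    (1, 1, 2): "2-path",
    (1, 1, 1, 1): "two disjoint edges",
    (2, 2, 2): "triangle",
    (1, 1, 1, 3): "3-star",
    (1, 1, 2, 2): "3-path",
    (1, 2, 2, 3): "triangle with pendant edge",
    (2, 2, 2, 2): "4-cycle",
    (2, 2, 3, 3): "K4 minus an edge",
    (3, 3, 3, 3): "K4",
}

def shape(edge_set):
    """Name of the isomorphism class of a non-empty edge set on at most 4 nodes."""
    deg = Counter(v for e in edge_set for v in e)
    return NAMES[tuple(sorted(deg.values()))]

def subgraph_counts(g):
    """s(U, G): number of non-empty edge subsets of G of each shape U."""
    counts = Counter()
    for r in range(1, len(g) + 1):
        for sub in combinations(sorted(g), r):
            counts[shape(sub)] += 1
    return counts

def supergraph_counts(g, n=4):
    """Number of graphs on nodes 1..n containing G, by shape (terms of the Möbius inversion)."""
    missing = [e for e in combinations(range(1, n + 1), 2) if e not in g]
    counts = Counter()
    for r in range(len(missing) + 1):
        for extra in combinations(missing, r):
            counts[shape(set(g) | set(extra))] += 1
    return counts

G = {(1, 2), (1, 3), (2, 3), (3, 4)}  # assumed: triangle 1-2-3 with pendant edge 3-4
print(dict(subgraph_counts(G)))
print(dict(supergraph_counts(G)))
```

Under this assumption the subgraph counts reproduce the coefficients of the exponential form, and the supergraphs of G are G itself once, K4 minus an edge twice and K4 once, which lines up with the coefficients 1, -2, 1 of the Möbius form after applying the alternating signs.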



Maximum Likelihood Estimation

Given an observation $G \in L_n$, the maximum likelihood estimator of $p_n$ is the dissociated point in $P_n$ with positive coordinates that maximizes the likelihood of $G$.

Example 1

For the previous network, the MLE is

ẑ− = 1/2, ẑ∧ = 5/16, ẑ∥ = 1/4, ẑ△ = 3/16, ẑ = 3/16, ẑ⊓ = 1/8, ẑ⊵ = 1/16.

This estimate represents a mixture of the uniform distribution on all networks isomorphic to G (there are 12) and the empty network, with weights 3/4 and 1/4, respectively.

The MLE does not exist! In fact, we conjecture it never exists.
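The values in Example 1 can be reproduced from the mixture description. The sketch assumes, as the subgraph counts of the example suggest, that G is a triangle with one pendant edge; it also reads the entry printed without a subscript symbol as the 3-star, which is an interpretation rather than something stated in the transcript. Since the empty network contains no edges, only the weight 3/4 on the copies of G enters the Möbius parameters.

```python
from fractions import Fraction
from itertools import permutations

def relabel(g, perm):
    """Apply a node permutation to a graph, keeping each edge as a sorted pair."""
    return frozenset(tuple(sorted((perm[i], perm[j]))) for i, j in g)

def orbit(g, n):
    """All labeled graphs on nodes 0..n-1 that are isomorphic to g."""
    return {relabel(g, perm) for perm in permutations(range(n))}

def mobius(h, support, weight):
    """z(H) when mass `weight` is spread uniformly over `support` and the rest sits on the empty graph."""
    hits = sum(1 for g in support if frozenset(h) <= g)
    return weight * Fraction(hits, len(support))

G = frozenset({(0, 1), (0, 2), (1, 2), (2, 3)})  # assumed shape of the observed graph
copies = orbit(G, 4)
w = Fraction(3, 4)
z = {
    "edge": mobius({(0, 1)}, copies, w),
    "2-path": mobius({(0, 1), (1, 2)}, copies, w),
    "two disjoint edges": mobius({(0, 1), (2, 3)}, copies, w),
    "triangle": mobius({(0, 1), (0, 2), (1, 2)}, copies, w),
    "3-star": mobius({(0, 1), (0, 2), (0, 3)}, copies, w),
    "3-path": mobius({(0, 1), (1, 2), (2, 3)}, copies, w),
    "G itself": mobius(G, copies, w),
}
print(len(copies))
for name, value in z.items():
    print(name, value)
```

Under these assumptions the computed values are 1/2, 5/16, 1/4, 3/16, 3/16, 1/8 and 1/16, and the parameter of two disjoint edges equals the square of the edge parameter, as the dissociated property requires. Any supergraph of G with five or six edges gets parameter zero, so this point does not have positive coordinates.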








Example 2

When the observed graph G is

[Figure: the observed graph G on the nodes 1, 2, 3, 4.]

the likelihood function is maximized for any value of λ satisfying 0 ≤ λ ≤ 1/16 with

ẑ− = 1/2, ẑ∧ = 3/16, ẑ∥ = 1/4, ẑ△ = 1/16 − λ, ẑ = λ, ẑ⊓ = 1/16,

and all other z's equal to zero. This corresponds to a random network that has probability 3/4 of being isomorphic to G (12 cases), while the remaining probability mass of 1/4 is distributed arbitrarily between a triangle plus an isolated point (4 cases) and a 3-star (4 cases).

The MLE does not exist and is not unique!



An open problem

Does a dissociated exchangeable distribution on $L_n$ always extend to some $p_\infty \in E_\infty$?

No! Example 1 shows this is not the case. So the dissociated property is only necessary for extendability.

Open problem

Let $EP_n \subset P_n$ be the set of exchangeable and extendable distributions on $L_n$ and $DP_n \subset P_n$ the set of distributions that are exchangeable and dissociated. Then

$$EP_n \subset DP_n.$$

What does $DP_n \setminus EP_n$ look like?






More open problems...

What are the algebraic and geometric properties of the proposed bidirected model for networks?

How do we carry out maximum likelihood estimation in this curved exponential family setting?

