Antonietta Mira joint with JP Onnela, S. Peluso, P. Muliere, A. Caimo Inter Disciplinary Institute of Data Science, IDIDS Università della Svizzera italiana, USi, Lugano, Switzerland and Dipartimento di Scienza ed Alta Tecnologia, DISAT Università dell’Insubria, Como, Italy BoB, December 8, 2015 Antonietta Mira / IDIDS / USI Inference in Statistical and Mechanistic Network Models 1 / 54


Antonietta Mira

joint with JP Onnela, S. Peluso, P. Muliere, A. Caimo

Inter Disciplinary Institute of Data Science, IDIDS
Università della Svizzera italiana, USI, Lugano, Switzerland

and

Dipartimento di Scienza ed Alta Tecnologia, DISAT

Università dell’Insubria, Como, Italy

BoB, December 8, 2015

Antonietta Mira / IDIDS / USI Inference in Statistical and Mechanistic Network Models 1 / 54


Summary

Inference in Statistical and Mechanistic Network Models

• Inference for Mechanistic Network Models (JP Onnela): NO LHD (likelihood)

• Inference for Statistical Network Models (S. Peluso + P. Muliere): AVAILABLE LHD

• Inference for Intractable Network Models via MCMC (A. Caimo): INTRACTABLE LHD


Motivation

• Systems of scientific and societal interest have large numbers of interacting components

• Representation as networks: node = component, edge = interaction

• E.g.: friendship/advisory network, citation network, webpage link network

• Network models used to study:

  • Social and contact networks: spread of pathogens, behaviors and information
  • Biological networks: gene regulation, signal transduction, protein interaction


Network Models

• Distinction between models of two things:

  • Models of network structure (e.g., Erdos-Rényi)
  • Models of dynamical processes on networks (e.g., SI model)

• Why care about network structure?

  • Interplay between network structure and the behavior of dynamical processes on networks (e.g., hubs in epidemics)

  • Models of network structure are useful for understanding how networked structures arise

• Models are often idealized and not intended to be realistic


Network Models

Distinction between two types of models of network structure:

• Statistical models (e.g., ERGM, Goyal-Blitzstein-DeGruttola): DATA DRIVEN

  • Pros: inference on model parameters; hypothesis testing; model fitting
  • Cons: scalability; hard to incorporate domain knowledge

• Mechanistic models (e.g., Price model & Barabási-Albert model): KNOWLEDGE DRIVEN
  Assume that the microscopic mechanisms that govern network formation and evolution are known, and ask what happens if we apply these mechanisms repeatedly

  • Pros: easy to incorporate domain knowledge; scalability
  • Cons: no inferential tools; no model comparison


Big Data

• Technology generates new types of data and new modeling challenges

Figure: Snowball sample of a mobile phone communication network of 7 million nodes and 23 million ties. Adapted from: Onnela et al. PNAS 104, 7332 (2007).


Mechanistic Model of Weighted Social and Contact Networks

• Simple mechanistic model of social / contact networks (weighted)

From the perspective of time expenditure:

  • Spend time with existing friends (a) / new friends (b, c)
  • Two mechanisms for forming ties (b, c)
  • Tie strength matters: choice of interaction partners & reinforcement of ties

• Some mechanistic models are analytically tractable, but simulation enables consideration of a much broader class of models

• One strength of a mechanistic model is that you can make causal inferences


Approximate Bayesian Computation (ABC)

Starting point is Bayes’ theorem:

$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}$

• p(θ|D) = posterior

• p(D|θ) = likelihood

• p(θ) = prior

• p(D) = evidence (aka marginal LHD, prior predictive probability of the data)

Evaluation of the LHD may be computationally expensive or infeasible, ruling out both likelihood-based and posterior-based inference

Approximate Bayesian Computation (ABC) avoids direct evaluation of the LHD and approximates it by generating synthetic data (synthetic observations) by direct simulation from the model

Sunnaker M et al. Approximate Bayesian computation. PLoS Comput Biol 9.1 (2013): e1002803.


Approximate Bayesian Computation (ABC)

• ABC rejection sampler is the simplest form of ABC

ABC rejection sampler

• Sample parameter θ∗ from the prior p(θ)

• Simulate dataset D∗ under the given model specified by θ∗

• Accept θ∗ if ρ(D∗, D) ≤ ε

• Distance measure ρ(D∗, D) determines the level of discrepancy between the simulated data D∗ and the observed data D

• The accepted θ∗ are approximately distributed according to the desired posterior and, crucially, obtained without the need of explicitly evaluating the LHD
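
The three steps of the rejection sampler can be sketched in a few lines of Python. This is only an illustration on a toy problem, not the code used for the talk: the coin-flip simulator, the uniform prior, the absolute-difference distance and all function names are assumptions introduced here.

```python
import random


def abc_rejection(observed, simulate, distance, sample_prior, eps, n_draws, rng):
    """Keep the prior draws whose simulated data lie within eps of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)                    # sample theta* from the prior
        synthetic = simulate(theta, rng)             # simulate D* under theta*
        if distance(synthetic, observed) <= eps:     # accept if rho(D*, D) <= eps
            accepted.append(theta)
    return accepted


def simulate_heads(theta, rng, n_flips=50):
    """Toy model (assumption): number of heads in n_flips tosses of a theta-coin."""
    return sum(rng.random() < theta for _ in range(n_flips))


rng = random.Random(0)
posterior_draws = abc_rejection(
    observed=15,                                     # hypothetical data: 15 heads out of 50
    simulate=simulate_heads,
    distance=lambda a, b: abs(a - b),
    sample_prior=lambda r: r.random(),               # uniform prior on [0, 1]
    eps=1,
    n_draws=20000,
    rng=rng,
)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
print(len(posterior_draws), round(posterior_mean, 3))
```

Lowering `eps` tightens the approximation at the cost of a lower acceptance rate, which is the trade-off discussed on the following slides.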


Approximate Bayesian Computation (ABC)

• It may be infeasible to compute the distance ρ(D∗, D) for high-dimensional data

• Lower dimensional summary statistic S(D) can be used to capture the relevant information in D

• Comparison is now carried out between S(D∗) and S(D) such that the proposed θ∗ is accepted if ρ(S(D∗), S(D)) ≤ ε

• If the summary statistic S is sufficient with respect to the model parameter θ, then S contains all information in D about θ (by definition of sufficiency), and using a summary statistic in place of the full dataset does not introduce any error

• For most models it may be impossible to find sufficient statistics S, in which case application relevant summary statistics need to be used

• Use of non-sufficient summary statistics renders the approach only approximate

• Accuracy can be investigated for models with known LHD


ABC for Mechanistic Network Models

• ABC + mechanistic network models = generic + sound inferential framework

ABC rejection sampler for mechanistic network models

• Observe an empirical graph G

• Set up mechanistic network model M

• Sample parameter θ∗ from the prior p(θ)

• Simulate graph G∗ from the mechanistic network model M using parameter θ∗

• Accept θ∗ if ρ(S(G∗), S(G)) ≤ ε using application relevant summaries S

• Some simple network summaries: degree sequence, k-stars, subgraph counts, centrality measures (betweenness, eigenvector, random walk, etc.), etc.

• Can use KNN to identify points in the space of summary statistics close to S(G)

• Future research: try SMC-ABC by Drovandi + Pettitt with ADS by Friel + Caimo


Exact ML inference for Erdos-Rényi (ER)

For the Erdos-Rényi model the LHD is available:

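$X_{ij}$ = RV coding the state of the dyad $(i, j)$, with observed value $A_{ij}$
$p = P(X_{ij} = 1)$

$N$ = number of nodes
$L$ = number of connected dyads (suff. stat.)

$$L(G \mid p) = \prod_{i<j} P(X_{ij} = A_{ij}) = \prod_{i<j} p^{A_{ij}} (1-p)^{1-A_{ij}} = p^{L} (1-p)^{N(N-1)/2 - L}$$

MLE: $\hat{p} = L / [N(N-1)/2]$ = proportion of connected dyads to all dyads

$SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$, with $n = N(N-1)/2$ the number of dyads

95% CI = $\hat{p} \pm 2\,SE(\hat{p})$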
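
As a quick numerical illustration of these formulas, the sketch below simulates the edge count of one undirected ER graph and evaluates the MLE, its standard error and the approximate 95% interval. The function names and the random seed are assumptions made for this example; n is taken to be the number of dyads N(N−1)/2.

```python
import math
import random


def simulate_er_edge_count(n_nodes, p, rng):
    """Edge count L of one undirected ER graph: each dyad is connected with probability p."""
    n_dyads = n_nodes * (n_nodes - 1) // 2
    return sum(rng.random() < p for _ in range(n_dyads))


def er_mle(n_nodes, n_edges):
    """Return (p_hat, SE(p_hat), approximate 95% CI) for the ER model."""
    n_dyads = n_nodes * (n_nodes - 1) // 2
    p_hat = n_edges / n_dyads                          # proportion of connected dyads
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_dyads)    # binomial standard error
    return p_hat, se, (p_hat - 2.0 * se, p_hat + 2.0 * se)


rng = random.Random(1)
edges = simulate_er_edge_count(100, 0.05, rng)
p_hat, se, ci = er_mle(100, edges)
print(f"L = {edges}, p_hat = {p_hat:.4f}, SE = {se:.4f}, CI = [{ci[0]:.4f}, {ci[1]:.4f}]")
```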


Approximate ML inference for ER

Simulate the observed ER graph $G_0$ from a model with N = 100 and p = 0.05

Find the p that maximizes an approximate LHD obtained by simulating graphs from the model

Set up a grid of parameter values: 0, 0.005, 0.01, . . . , 1

For each grid point $p_i$, $i = 1, 2, \ldots, n$, simulate data from the model S = 1000 times

For each $p_i$, this gives rise to

• a sequence of graphs $G_i^1, \ldots, G_i^S$,
• a sequence of corresponding summary statistics $S(G_i^1), \ldots, S(G_i^S)$,
• a sequence of distances $d_i^1 = \rho(S(G_i^1), S(G_0)), \ldots, d_i^S = \rho(S(G_i^S), S(G_0))$

Pool the distances d for all values of i and s, resulting in a collection of nS values in total

The smaller the distance, the closer the graph $G_i^s$ is to the observed graph $G_0$ in the sense of the chosen summary statistic

Choose a cutoff value $d^*$ = 10th percentile of the pooled distances

For each grid point $p_i$ count the number of graphs $G_i^s$ for which the corresponding distance $d_i^s < d^*$

Obtain an unnormalized step function approximation $f_i = \frac{1}{S} \sum_{s=1}^{S} \mathbf{1}\{d_i^s < d^*\}$ to the LHD

$p_{AML}$ = the value $p_i$ for which $f_i$ obtains its maximum ≈ 0.049

CI are obtained as in the ML case but using $p_{AML}$ in place of the exact estimate $\hat{p}$
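
The grid procedure can be sketched as follows. To keep the pure-Python run time short, this illustration uses a smaller graph (N = 30), a grid restricted to [0, 0.2] and 100 simulations per grid point, all of which are assumptions of the example and not the settings of the slide; the summary statistic is taken to be the edge count and the distance the absolute difference.

```python
import random


def er_edge_count(n_nodes, p, rng):
    """Summary statistic S(G): edge count of one simulated undirected ER graph."""
    n_dyads = n_nodes * (n_nodes - 1) // 2
    return sum(rng.random() < p for _ in range(n_dyads))


def approximate_ml(observed_edges, n_nodes, grid, n_sims, rng, quantile=0.10):
    """Return (p_AML, f), where f[i] is the step-function approximation to the LHD at grid[i]."""
    # distances[i][s] = |S(G_i^s) - S(G_0)|
    distances = [
        [abs(er_edge_count(n_nodes, p, rng) - observed_edges) for _ in range(n_sims)]
        for p in grid
    ]
    pooled = sorted(d for row in distances for d in row)
    cutoff = pooled[int(quantile * len(pooled))]               # d* = pooled 10th percentile
    f = [sum(d < cutoff for d in row) / n_sims for row in distances]
    best = max(range(len(grid)), key=lambda i: f[i])           # grid point maximizing f_i
    return grid[best], f


rng = random.Random(2)
N_NODES = 30
grid = [0.005 * i for i in range(41)]                          # 0, 0.005, ..., 0.2
observed = er_edge_count(N_NODES, 0.05, rng)                   # stands in for S(G_0)
p_aml, f = approximate_ml(observed, N_NODES, grid, n_sims=100, rng=rng)
print(observed, p_aml)
```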


Figure: unnormalized P(D | theta) plotted against theta (theta from 0.02 to 0.08).

true p = 0.05 - estimated $p_{AML}$ ≈ 0.049

The solid (smooth) line is the exact LHD
We set the threshold at the 10th percentile


Exact Bayesian inference for ER

Beta prior p(θ) = Beta(α = 5, β = 50)

Beta posterior p(θ|D) = Beta(α + L, β + N(N − 1)/2 − L)

Bayesian Estimator = posterior mean
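
For reference, the conjugate update and the resulting posterior mean can be written out as a short derivation. The shorthand n = N(N−1)/2 for the number of dyads is introduced here for compactness.

```latex
\begin{align*}
p(\theta \mid D) &\propto \theta^{L}(1-\theta)^{n-L} \cdot \theta^{\alpha-1}(1-\theta)^{\beta-1}
  = \theta^{\alpha+L-1}(1-\theta)^{\beta+n-L-1}, \qquad n = N(N-1)/2, \\
E[\theta \mid D] &= \frac{\alpha + L}{\alpha + \beta + n}.
\end{align*}
```

With α = 5, β = 50 and N = 100 the denominator is 5 + 50 + 4950 = 5005, so the posterior mean is (5 + L)/5005.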


Approximate Bayesian inference for Erdos-Rényi

Simulate the observed ER graph G0 from a model with N = 100 and p = 0.05

Number of edges = summary statistic = S(G)

We performed 100,000 draws from the prior of which only 162 were accepted.

High rejection rate is due to requiring S(Gi) = Li = L0 = S(G0)

The mean and median of the prior were 0.091 and 0.086

the 95% credible interval (CI) is [0.031, 0.178]

The mean and median of the posterior were 0.050 and 0.050

the 95% credible interval (CI) is [0.043, 0.055]
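
A scaled-down sketch of this experiment is given below: draws from the Beta(5, 50) prior are accepted only when the simulated edge count matches the observed one exactly. The smaller graph (N = 30), the 10,000 prior draws and the function names are assumptions chosen so that the pure-Python example runs quickly; they are not the settings reported above.

```python
import random


def er_edge_count(n_nodes, p, rng):
    """Summary statistic S(G): edge count of one simulated undirected ER graph."""
    n_dyads = n_nodes * (n_nodes - 1) // 2
    return sum(rng.random() < p for _ in range(n_dyads))


def abc_er_posterior(observed_edges, n_nodes, n_draws, rng, alpha=5.0, beta=50.0):
    """ABC rejection with exact matching: accept p* only if S(G*) == S(G_0)."""
    accepted = []
    for _ in range(n_draws):
        p_star = rng.betavariate(alpha, beta)                   # draw from the Beta prior
        if er_edge_count(n_nodes, p_star, rng) == observed_edges:
            accepted.append(p_star)
    return accepted


rng = random.Random(3)
N_NODES = 30
observed = er_edge_count(N_NODES, 0.05, rng)                    # stands in for S(G_0)
draws = abc_er_posterior(observed, N_NODES, n_draws=10000, rng=rng)
posterior_mean = sum(draws) / len(draws) if draws else float("nan")
print(len(draws), round(posterior_mean, 3))
```

Because the edge count is sufficient for p in the ER model, exact matching on it targets the true posterior; the price is the high rejection rate noted above.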


Approximate Bayesian inference for Erdos-Rényi

Degree distribution = summary statistic = S(G)

KS = distance measure

We performed 100,000 draws from the prior of which only 761 were accepted.

Higher acceptance rate

The mean and median of the prior are 0.09 and 0.09

the 95% CI is [0.03, 0.18]

The mean and median of the posterior are 0.05, 0.05

the 95% CI is [0.04, 0.06]
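
One possible way to compute a KS-type distance between two degree distributions is sketched below. It evaluates the largest gap between the two empirical CDFs; the function names are assumptions, and the statistic is used purely as a distance here, not as a formal test (degrees within a graph are not independent).

```python
import random
from bisect import bisect_right


def er_degree_sequence(n_nodes, p, rng):
    """Degree sequence of one simulated undirected ER graph."""
    degrees = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < p:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def ks_distance(sample_a, sample_b):
    """Two-sample KS statistic: maximum absolute difference between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    support = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in support
    )


rng = random.Random(4)
observed = er_degree_sequence(100, 0.05, rng)
near = er_degree_sequence(100, 0.05, rng)      # same parameter as the observed graph
far = er_degree_sequence(100, 0.20, rng)       # much denser graph
print(ks_distance(observed, near), ks_distance(observed, far))
```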


Inference for Erdos-Rényi

Figure: P(theta) and P(theta | D) plotted against theta (theta from 0 to 0.30).

TRUE: density functions for the prior (solid green) and the posterior (solid red)
Here the prior distribution is the beta distribution B(α = 5, β = 50).
True value of the parameter is θ = p = 0.05 = black vertical line
ESTIMATED: prior p(θ) (green) and posterior p(θ|D) (red)


Approximate Bayesian inference for Easley-Kleinberg (EK)

The EK model is a simple model of directed networks

Similar to the earlier models of Price and Barabási-Albert

Unlike the ER model, the EK model is a growing network model that grows by the addition of new vertices to the graph

New vertices attach themselves to existing vertices using a simple attachment rule

The concept of preferential attachment refers to the idea that in a model of network growth, each new node is not equally likely to attach itself to any of the existing nodes but, instead, has a preference for some nodes over others based on their degree (and possibly other quantities).

More specifically, linear preferential attachment specifies that an incoming node will attach to an existing vertex $v_i$ with probability proportional to its degree: $\Pi_i \propto k_i$

More complicated functions of degree $k_i$ are certainly possible, and in general one can incorporate nodal attributes, or covariates, in the expression for the attachment function

RESULTS: Similar performance to the Erdos-Rényi model
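
The linear attachment rule can be illustrated with a short simulation. This is a sketch of linear preferential attachment only, not of the full EK model: the two-node seed graph, the single edge added per new node and the use of total degree are assumptions of the example.

```python
import random


def grow_preferential_attachment(n_nodes, rng):
    """Grow a network in which each new node attaches to one existing node
    chosen with probability proportional to its current degree."""
    degrees = [1, 1]            # seed graph (assumption): nodes 0 and 1 joined by one edge
    endpoints = [0, 1]          # node i appears degrees[i] times in this list
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)      # uniform over endpoints == proportional to degree
        degrees[target] += 1
        degrees.append(1)
        endpoints.extend([target, new_node])
    return degrees


degrees = grow_preferential_attachment(500, random.Random(5))
print(max(degrees), sorted(degrees)[len(degrees) // 2])   # largest hub vs. median degree
```

Sampling uniformly from the list of edge endpoints is a common trick: a node of degree k appears k times, so no explicit normalisation of the attachment probabilities is needed.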


Evolving Gene Networks

• We apply the same inferential framework to real biological networks

• Gene duplication is one of the primary forces behind the evolution of genomes

• Gene duplication is a random event at the molecular level

• Usually the duplicated version of a gene is under less selective pressure than its parent, and is free to mutate rapidly and could potentially take on a novel function

• Two copies of a gene created by a duplication event are known as paralogs

• Most paralogous pairs code for proteins that differ in structure and function

• Modeling assumption: duplication of a gene can be modeled by duplication of the corresponding node on a network (entails making an identical copy of the original node and all of its connections)


Evolving Gene Networks

• Duplication acts in concert with other processes, such as degenerative mutation, that can be incorporated in the network model

• Both copies of the duplicated gene are subject to degenerative mutations and lose some functions, but jointly they retain the full set of functions present in the ancestral gene

Duplication-Mutation-Complementation (DMC) model of network growth

• Duplication: A node is selected at random and duplicated; an interaction of a protein with its own copy may be added (dimerization)

• Divergence & Complementation: For each common neighbor of the chosen node and its duplicate, one of the two edges is chosen at random and removed with given probability

A Vazquez, A Flammini, A Maritan, A Vespignani. Modeling of protein interaction networks. ComPlexUs 1:38-44 (2003).
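
A minimal simulation of the two DMC steps might look as follows. The parameter names `q_mod` (edge-removal probability per common neighbor) and `q_con` (probability of linking a node to its own copy), the two-node seed graph and the adjacency-set representation are all assumptions of this sketch.

```python
import random


def grow_dmc(n_nodes, q_mod, q_con, rng):
    """Grow an undirected graph by duplication, divergence and complementation."""
    adj = {0: {1}, 1: {0}}                       # seed graph (assumption): a single edge
    while len(adj) < n_nodes:
        v = rng.randrange(len(adj))              # pick an existing node uniformly at random
        u = len(adj)                             # label of the duplicate
        adj[u] = set(adj[v])                     # copy all of v's connections
        for w in adj[u]:
            adj[w].add(u)
        for w in sorted(adj[v]):                 # every neighbor of v is now a common neighbor
            if rng.random() < q_mod:
                loser = rng.choice((v, u))       # one of the two edges is removed at random
                adj[loser].discard(w)
                adj[w].discard(loser)
        if rng.random() < q_con:                 # optional interaction with own copy
            adj[v].add(u)
            adj[u].add(v)
    return adj


graph = grow_dmc(50, q_mod=0.4, q_con=0.1, rng=random.Random(6))
n_edges = sum(len(neighbors) for neighbors in graph.values()) // 2
print(len(graph), n_edges)
```

Because only one edge of each pair can be removed, every original neighbor stays connected to at least one of the two copies, which is the complementation property described above.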


Inference for DMC

Figure: Prior and posterior draws for the DMC (duplication - mutation - complementation) model for modeling protein-protein interaction networks. True values (solid lines) and posterior means (dashed lines) are shown for each parameter.


Hypothesis Testing

$H_0: \theta > \theta^*$ vs $H_1: \theta \le \theta^*$ for some arbitrary $\theta^*$

Bayesians compute $P(H_0 \mid D) = \int_{\theta^*}^{\infty} p(\theta \mid D)\, d\theta$

The integral can be estimated by summing over a finite set of samples $\theta_t$ from the posterior, resulting in the estimator $\hat{P}(H_0 \mid D) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}\{\theta_t > \theta^*\}$

For the ER model with N = 1000 and p = 0.25 test $H_0: p > 0.3$ vs $H_1: p \le 0.3$

10,000 draws from the prior, of which 759 were accepted

The 95% posterior CI was [0.194, 0.302]

And $P(H_0 \mid G_0) = P(p > 0.3 \mid G_0) \approx 0.032$

The posterior odds are defined as

$$\frac{P(H_0 \mid D)}{P(H_1 \mid D)} = \frac{P(H_0 \mid G_0)}{P(H_1 \mid G_0)} = \frac{P(H_0 \mid G_0)}{1 - P(H_0 \mid G_0)} \approx \frac{0.032}{0.968} \approx 0.033, \qquad (1)$$

suggesting that $H_1$ is 0.968/0.032 = 30.25 times, i.e. about 30 times, more likely than $H_0$

We can confidently reject the null hypothesis
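
The estimator and the odds calculation are simple enough to check directly. In the sketch below the function names and the four-element sample are hypothetical; only the value 0.032 is taken from the slide.

```python
def posterior_prob_greater(samples, threshold):
    """Monte Carlo estimate of P(theta > threshold | D) from posterior samples."""
    return sum(theta > threshold for theta in samples) / len(samples)


def posterior_odds(prob_h0):
    """Posterior odds of H0 against H1 when P(H1 | D) = 1 - P(H0 | D)."""
    return prob_h0 / (1.0 - prob_h0)


toy_samples = [0.10, 0.20, 0.35, 0.40]            # hypothetical posterior draws
toy_prob = posterior_prob_greater(toy_samples, 0.3)

odds = posterior_odds(0.032)                      # value reported on the slide
print(round(odds, 3), round(1.0 / odds, 2))
```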


Model Comparison

• Standard Bayesian approach to model comparison involves Bayes factors and posterior probabilities of model index

• Posterior probabilities are poorly estimated by ABC (theory and simulation)

• Alternative approach: select the most likely model using a classifier and compute ABC approximation of the posterior predictive error rate


Model Comparison

ABC for model comparison (Part I)

• Observe an empirical graph G

• Set up multiple mechanistic network models M1 and M2

• Draw model index from the model prior: τ1 = P (M = 1) = P (M = 2) = τ2

• Draw parameter θ∗ from the prior p(θ|M)

• Simulate graph G∗ from the given mechanistic network model using parameter θ∗

• Accept θ∗ if ρ(S(G∗), S(G)) ≤ ε using any summaries S

ABC for model comparison (Part II)

• Draw from the ABC approximation of the joint posterior p(θ,M|D)

• Generate n independent pseudo-data sets for each such draw (ABC approximation of the posterior predictive distribution)

• Compute posterior error rate using the random forest classifier (i.e., how frequently it returns the true model index)

• Use a SUPER LEARNER approach


Conclusion

• Two distinct paradigms to the study of networked systems:

• Statistical modeling of networks (statistics)

• Network science (physics and applied mathematics)

• Statistical network science, a mixture of these two paradigms, appears potentially promising in the era of big data

• Scientific fields remain intellectually vibrant by considering multiple approaches (not a one-size-fits-all approach)


PART II - From Mechanistic Networks to Nonparametric Network Models

• We propose a unified (statistical) modelling framework that theoretically justifies the main empirical regularities of the international trade network

• Each country is associated to a Polya urn whose composition controls the propensity to trade with other countries

• The urn composition is updated through the walk of the Reinforced Urn Process of Muliere et al. (2000)

• Different assumptions on reinforcement parameters account for empirical regularities: degree distributions and strength distributions skewed to the right, negative assortativity, path-shortening and global sparsity

• Analytical derivation of the likelihood


Five constitutive properties

(1) In line with the preferential attachment scheme, the popularity of each vertex (country) is positively related to its in-strength (total imports)

(2) Barabasi and Albert (1999): single Polya urn attached to the network ⇒ now each country has an associated urn

(3) A Reinforced Urn Process moves from urn to urn (country to country)

  • it reinforces the vertices with balls corresponding to the edges traversed
  • therefore reinforcing the bilateral trade relationships
  ⇒ local preferential attachment: a country is more or less attractive, depending on its trading partners


Five constitutive properties

(4) Power law trade flow distribution

  • conditional export flow (weight) follows a Yule-Simon distribution
  • with decay rate in the right tail inversely proportional to other weights starting from the same vertex
  ⇒ competition among edges: export flows initiated by the same country compete with each other

(5) Barabasi and Albert (1999): generative network model
  ⇒ specification of microdynamic laws, but no focus on the macroanalysis of the network
  Now: likelihood-based inference + specification of the mechanism driving the RUP


Reinforced Urn Model Development

Directed weighted network: $G = (V, E, w^0)$

• V and E: respectively, the set of vertices and edges
• $w^0: E \to \mathbb{R}$: initial weights associated to each edge

$\{X\} \in RUP(V, U, q)$: Reinforced Urn Process as introduced in Muliere et al. (2000), a random walk on Polya urns

• $V = \{1, 2, \ldots, n\}$ is the vertex set
• $U = \{U_1, \ldots, U_n\}$ is a collection of Polya urns
• $U_i = \{w^0_{ij},\ j = 1, \ldots, n\}$ is the urn associated to vertex i
• $w^0_{ij}$: initial number of balls of colour j in $U_i$
• $q: V \times V \to V$: law of motion among the vertices: if $X_t = v_1$, a ball is extracted from $U_{v_1}$ and is replaced in $U_{v_1}$ with s additional balls of the same colour; if the extracted ball is of colour $v_2$, then $X_{t+1} = q(v_1, v_2)$
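
A small simulation may help to fix ideas. The sketch below assumes the simplest law of motion, q(v1, v2) = v2 (the walk moves to the vertex whose colour was drawn), together with a hypothetical three-vertex network with unit initial weights and no self-loops; none of these choices come from the slides.

```python
import random


def simulate_rup(initial_weights, n_steps, s, start, rng):
    """Simulate a reinforced urn process; return the updated urns and the
    number of times each directed edge (i, j) was traversed."""
    n = len(initial_weights)
    weights = [list(row) for row in initial_weights]       # urn compositions w_ij
    counts = [[0] * n for _ in range(n)]                    # traversal counts m_ij
    current = start
    for _ in range(n_steps):
        # extract a ball from the current urn with probability proportional to w_ij
        colour = rng.choices(range(n), weights=weights[current])[0]
        weights[current][colour] += s                       # add s balls of the same colour
        counts[current][colour] += 1
        current = colour                                    # assumed law of motion q(v1, v2) = v2
    return weights, counts


w0 = [[0.0, 1.0, 1.0],
      [1.0, 0.0, 1.0],
      [1.0, 1.0, 0.0]]
weights, counts = simulate_rup(w0, n_steps=1000, s=1.0, start=0, rng=random.Random(7))
print(counts)
```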


Reinforced Urn Model Development

After m steps: updated edge weights $W^m = (w^m_{ij})$

$m_{ij}$: number of times we observe $(X_t, X_{t+1}) = (i, j)$ for $t = 1, \ldots, m-1$
⇔ number of times the directed edge (i, j) is traversed

$m_i = \sum_j m_{ij}$: number of times vertex i is traversed by the RUP

$$P(W^m = (w^m_{ij})) = \prod_{i=1}^{n} \frac{B(\{w^m_{ij}\}_{j=1}^{n})}{B(\{w^0_{ij}\}_{j=1}^{n})}$$

$B(\{\cdot\}_{j=1}^{n})$: n-variate Beta function.
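
The product of Beta-function ratios is conveniently evaluated on the log scale. The sketch below uses the standard identity B(a_1, ..., a_n) = ∏ Γ(a_j) / Γ(Σ a_j) and assumes strictly positive weights; the function names are introduced here. As a sanity check, for a single urn with unit reinforcement the ratio reproduces the familiar Polya-urn sequence probabilities.

```python
import math


def log_multivariate_beta(values):
    """log B(a_1, ..., a_n) = sum_j log Gamma(a_j) - log Gamma(sum_j a_j)."""
    return sum(math.lgamma(a) for a in values) - math.lgamma(sum(values))


def log_rup_probability(initial_weights, final_weights):
    """Log of the product over urns of B(final row) / B(initial row)."""
    return sum(
        log_multivariate_beta(w_m) - log_multivariate_beta(w_0)
        for w_0, w_m in zip(initial_weights, final_weights)
    )


# One urn starting at (1, 1): drawing colour 1 once has probability 1/2,
# drawing it twice in a row has probability 1/2 * 2/3 = 1/3.
one_draw = math.exp(log_rup_probability([[1.0, 1.0]], [[2.0, 1.0]]))
two_draws = math.exp(log_rup_probability([[1.0, 1.0]], [[3.0, 1.0]]))
print(one_draw, two_draws)
```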


International Trade Network Analysis

• Vertex degree (left) and strength (right) versus the corresponding upper tail probability for year 2000

• Straight lines suggest a right tail behaviour well approximated by the Pareto law

Figure: log upper tail probability versus log degree (left panel) and versus log strength (right panel).


International Trade Network Analysis

• Posterior summaries of the reinforcement parameter s, for different prior distributions

• Prior sensitivity study: differences in posterior densities are negligible

Prior        Mean    95% interval      Variance  Skewness  Kurtosis
U([0, 200])  4.8485  [2.1295, 5.8976]  1.0031    -0.8468   2.6729
Exp(0.01)    4.8484  [2.1294, 5.8976]  1.0032    -0.8466   2.6725
Exp(0.05)    4.8480  [2.1293, 5.8976]  1.0035    -0.8461   2.6708
Exp(0.5)     4.8435  [2.1283, 5.8970]  1.0074    -0.8397   2.6516
Exp(1)       4.8384  [2.1271, 5.8964]  1.0116    -0.8327   2.6305
Exp(2)       4.8283  [2.1248, 5.8952]  1.0200    -0.8184   2.5886


International Trade Network Analysis

• Estimated international trade network country-specific reinforcements s

• Great majority of countries with reinforcement between 1 and 5

• Exceptions: USA (92), United Kingdom (45) and Argentina (42)

Figure: estimated reinforcement by country (axes: estimated reinforcement, Countries).


International Trade Network Analysis

• Distribution of the differences between the actual and forecasted number of partnerships

• Peak around 0: those countries that do not change their set of trading partners

• A cluster of countries, mostly from Africa and Central America, considerably increase their exports

Figure: histogram titled "Number of importers" (x-axis: differences, from −100 to 100; y-axis: Frequency).


International Trade Network Analysis

• Average forecast, out of 10000 predicted networks, of the network representing the import-export relations among the Group of Eight countries

• Edge widths reported proportional to the corresponding weights

Figure: Forecast G8 graph (vertices: USA, JPN, GFR, FRN, UKG, ITA, CAN, RUS).


Extensions

$s_{ij}$: reinforcement associated to balls in urn i that link country i with country j

$$s_{ij} = \alpha_i + \beta_{ij} \sum_k (w_{jk} + w_{kj}) + \gamma_{ij} \sum_k (w_{ik} + w_{ki})(w_{jk} + w_{kj}).$$

• negative value of $\beta_{ij}$ ⇒ negative assortativity: countries with few connections tend to link to highly-connected hubs

• positive $\gamma_{ij}$ ⇒ path-shortening: tendency to observe closed triangles, favoring partnerships with countries having common partners

• edges with weight lower than ξ are removed, controlling for the global sparsity of the network
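
To make the indices concrete, the expression for the reinforcement can be evaluated on a small weight matrix. The function name, the scalar arguments for one (i, j) pair and the 3 × 3 matrix below are hypothetical.

```python
def reinforcement(w, i, j, alpha_i, beta_ij, gamma_ij):
    """s_ij = alpha_i + beta_ij * sum_k (w_jk + w_kj)
                      + gamma_ij * sum_k (w_ik + w_ki) * (w_jk + w_kj)."""
    n = len(w)
    strength_j = sum(w[j][k] + w[k][j] for k in range(n))       # total trade of country j
    shared = sum(
        (w[i][k] + w[k][i]) * (w[j][k] + w[k][j]) for k in range(n)
    )                                                           # trade through common partners
    return alpha_i + beta_ij * strength_j + gamma_ij * shared


w = [[0, 1, 2],
     [3, 0, 4],
     [5, 6, 0]]
# For (i, j) = (0, 1): strength term = 14 and common-partner term = 70,
# so s_01 = 1 - 0.1 * 14 + 0.01 * 70.
s_01 = reinforcement(w, 0, 1, alpha_i=1.0, beta_ij=-0.1, gamma_ij=0.01)
print(s_01)
```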


PART III - MCMC inference for ERGM: Outline

• Exponential random graph models - ERGMs

• Bayesian exponential random graph models - BERGM

• Computational approaches for BERGMs

• Example


Exponential Random Graph Models - ERGMs

• The relational structure of an observed network y can be explained by the relative prevalence of a set of overlapping sub-graph configurations s(y) called network statistics

• The likelihood of an ERGM represents the probability distribution of y given a parameter θ:

$p(y \mid \theta) = \exp\{\theta^t s(y) - \gamma(\theta)\}$

• From a computational point of view we have an intractable likelihood problem


Computational approaches

Let’s define:

• q_θ(y) = exp{θ^t s(y)}: unnormalised likelihood

• z(θ) = exp{γ(θ)}: normalising constant

Metropolis-Hastings

0 CURRENT POSITION = θ

1 PROPOSE AN UPDATE θ′ ∼ h(·|θ)

2 ACCEPT MOVE FROM θ TO θ′ WITH PROBABILITY

$$ 1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h(\theta \mid \theta')}{h(\theta' \mid \theta)} \times \frac{z(\theta)}{z(\theta')} $$
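
The factor z(θ)/z(θ′) enters because the posterior involves the normalised likelihood. A short derivation from the definitions above, added for clarity (the symbol 𝒴 for the set of all graphs on n nodes is notation introduced here, not taken from the slides):

```latex
p(\theta \mid y) \;\propto\; p(y \mid \theta)\, p(\theta)
  \;=\; \frac{q_{\theta}(y)}{z(\theta)}\, p(\theta),
\qquad
z(\theta) \;=\; \sum_{y \in \mathcal{Y}} \exp\{\theta^{t} s(y)\}
```

The Metropolis-Hastings ratio of p(θ′|y) h(θ|θ′) to p(θ|y) h(θ′|θ) therefore contains z(θ)/z(θ′), and each z requires a sum over all possible graphs.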

Computational approaches

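Approximate exchange algorithm (AEA) (Murray et al., 2006; Caimo and Friel, 2011)

0 CURRENT POSITION = θ

1 PROPOSE AN UPDATE OF (θ′, y′)

(i) Draw θ′ ∼ h(·|θ)
(ii) Draw y′ ∼ p(·|θ′) via MCMC

2 EXCHANGE MOVE FROM θ TO θ′ WITH PROBABILITY

$$ 1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h(\theta \mid \theta')}{h(\theta' \mid \theta)} \, \frac{q_{\theta}(y')}{q_{\theta'}(y')} \times \underbrace{\frac{z(\theta)\, z(\theta')}{z(\theta)\, z(\theta')}}_{1} $$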
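
The exchange move can be sketched on a toy case. The code below assumes an edges-only ERGM, for which the auxiliary network y′ can be drawn exactly (each dyad is an independent Bernoulli tie), so it is the exact exchange algorithm rather than the approximate one; for a general ERGM step 1(ii) would be replaced by an MCMC simulation of y′. The function names, the N(0, 100) prior, the symmetric Gaussian proposal and the tuning values are illustrative assumptions.

```python
import math
import random


def log_q(theta, s):
    """Log unnormalised likelihood theta * s(y), with s(y) the edge count."""
    return theta * s


def log_prior(theta, var=100.0):
    """Log density of a N(0, var) prior, up to an additive constant."""
    return -0.5 * theta * theta / var


def draw_edge_count(theta, n_dyads, rng):
    """Step 1(ii): draw y' from p(.|theta') and return s(y') (exact for edges-only)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(1 for _ in range(n_dyads) if rng.random() < p)


def exchange_sampler(s_obs, n_dyads, n_iter=5000, step=0.5, seed=1):
    rng = random.Random(seed)
    theta, chain = 0.0, []
    for _ in range(n_iter):
        theta_new = theta + rng.gauss(0.0, step)          # step 1(i), symmetric h
        s_aux = draw_edge_count(theta_new, n_dyads, rng)  # step 1(ii)
        # Step 2: the normalising constants cancel, only q and the prior remain.
        log_ratio = (log_q(theta_new, s_obs) - log_q(theta, s_obs)
                     + log_prior(theta_new) - log_prior(theta)
                     + log_q(theta, s_aux) - log_q(theta_new, s_aux))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = theta_new
        chain.append(theta)
    return chain


# Toy data: 15 observed ties among the 45 dyads of a 10-node network.
chain = exchange_sampler(s_obs=15, n_dyads=45)
posterior_mean = sum(chain[1000:]) / len(chain[1000:])
```

Because the proposal is symmetric, the h ratio drops out of `log_ratio`; the posterior mean should sit near the log-odds of the observed density.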

Computational approaches

Computational challenges

• The posterior distribution p(θ|y) is difficult to sample from, because the posterior density of the ERGM parameters is typically very thin and the parameters are highly correlated

Computational approaches

Improving chain mixing and convergence I (Caimo and Friel, 2011)

• Parallel adaptive direction sampling (ADS) for update of θ′ at step 1(i)

• Tie/No Tie (TNT) sampling to update y at step 1(ii)

Improving chain mixing and convergence II (Caimo and Mira, 2014)

• Adaptive strategies for update of θ′ at step 1(i)

• Approximate exchange algorithm with delayed rejection

Computational approaches

Adaptive strategies for update of θ′

(Roberts and Rosenthal, 2007; Haario et al., 2001)

• vertical adaptation: all particles at the current time for all chains

• horizontal adaptation: all past particles along the same chain

• rectangular adaptation: particles from all chains and all past simulations
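
The three adaptation schemes differ only in which particles feed the proposal's scale. A minimal sketch, assuming scalar particles stored as `history[t][c]` (iteration t, chain c) and assuming that "past" includes the current iteration; the function names and the `scale` argument are hypothetical:

```python
from statistics import variance


def adaptation_pool(history, chain, mode):
    """Select the particles used to tune the proposal for one chain."""
    if mode == "vertical":      # all chains, current iteration only
        return list(history[-1])
    if mode == "horizontal":    # one chain, all iterations so far
        return [particles[chain] for particles in history]
    if mode == "rectangular":   # all chains, all iterations so far
        return [x for particles in history for x in particles]
    raise ValueError(f"unknown mode: {mode}")


def proposal_variance(history, chain, mode, scale=1.0):
    """Scaled sample variance of the pool, usable as a random-walk proposal variance."""
    return scale * variance(adaptation_pool(history, chain, mode))


# Two iterations of three chains (made-up values).
history = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
vertical = adaptation_pool(history, 1, "vertical")
horizontal = adaptation_pool(history, 1, "horizontal")
rectangular = adaptation_pool(history, 1, "rectangular")
```

In the multivariate case the same pools would feed an empirical covariance matrix instead of a scalar variance.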

Computational approaches

Adaptive exchange algorithm with delayed rejection

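• First stage move:

$$ \alpha_1(\theta, \theta') = 1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h_1(\theta \mid \theta')}{h_1(\theta' \mid \theta)} \, \frac{q_{\theta}(y')}{q_{\theta'}(y')} $$

• If θ′ is rejected, try a second stage move to θ′′ with probability

$$ \alpha_2(\theta, \theta', \theta'') = 1 \wedge \frac{q_{\theta}(y'') \, p(\theta'') \, h_1(\theta' \mid \theta'') \, [1 - \alpha_1(\theta'', \theta')] \, h_2(\theta \mid \theta'', \theta') \, q_{\theta''}(y)}{q_{\theta}(y) \, p(\theta) \, h_1(\theta' \mid \theta) \, [1 - \alpha_1(\theta, \theta')] \, h_2(\theta'' \mid \theta, \theta') \, q_{\theta''}(y'')} $$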

Computational approaches

• A hierarchy of proposal distributions can be exploited

• Tennis-service strategy: “first bold” h1(·) proposal versus “second timid” h2(·)
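
The "first bold, second timid" idea can be sketched with plain delayed-rejection Metropolis-Hastings on a tractable target. This is not the exchange version from the previous slide: there are no auxiliary networks, the target is a standard normal, and both proposals are symmetric Gaussians centred at the current state, so the h2 terms cancel. The names and step sizes are illustrative assumptions.

```python
import math
import random


def log_target(x):
    """Log density of a standard normal target, up to a constant."""
    return -0.5 * x * x


def log_normal_pdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def alpha1(log_pi_from, log_pi_to):
    """First-stage acceptance probability for a symmetric proposal."""
    return math.exp(min(0.0, log_pi_to - log_pi_from))


def dr_step(x, rng, bold_sd=5.0, timid_sd=0.5):
    # Stage 1: bold proposal.
    y1 = x + rng.gauss(0.0, bold_sd)
    a1 = alpha1(log_target(x), log_target(y1))
    if rng.random() < a1:
        return y1
    # Stage 2: timid proposal, tried only after the bold one was rejected.
    y2 = x + rng.gauss(0.0, timid_sd)
    a1_rev = alpha1(log_target(y2), log_target(y1))
    if a1_rev >= 1.0:
        return x  # the numerator of alpha2 is zero
    log_num = (log_target(y2) + log_normal_pdf(y1, y2, bold_sd)
               + math.log(1.0 - a1_rev))
    log_den = (log_target(x) + log_normal_pdf(y1, x, bold_sd)
               + math.log(1.0 - a1))
    if rng.random() < math.exp(min(0.0, log_num - log_den)):
        return y2
    return x


def dr_chain(n_iter=20000, x0=0.0, seed=42):
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_iter):
        x = dr_step(x, rng)
        chain.append(x)
    return chain


chain = dr_chain()
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
```

The bold step explores widely when it is accepted, and the timid step keeps the chain moving when it is not, at the cost of one extra target evaluation per rejection.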

Example

[Figure: Zachary karate club network graph, nodes labelled 1 to 34]

Zachary Karate Club Network: friendship relations between 34 members of a karate club at a US university in the 1970s

Example

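ERGM:

$$ p(y \mid \theta) \propto \exp\{\theta_1 s_1(y) + \theta_2 s_2(y, \phi_v) + \theta_3 s_3(y, \phi_u)\} $$

where:

$$ s_1(y) = \sum_{i<j} y_{ij} $$

number of edges

$$ s_2(y, \phi_v) = e^{\phi_v} \sum_{i=1}^{n-2} \left\{ 1 - \left(1 - e^{-\phi_v}\right)^{i} \right\} EP_i(y) $$

geometrically weighted edgewise shared partners (gwesp)

$$ s_3(y, \phi_u) = e^{\phi_u} \sum_{i=1}^{n-1} \left\{ 1 - \left(1 - e^{-\phi_u}\right)^{i} \right\} D_i(y) $$

geometrically weighted degrees (gwdegree)

D_i(y) = degree distribution: number of nodes with degree i
EP_i(y) = edgewise shared partner distribution: number of unordered pairs of connected nodes having exactly i common neighbours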
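
The three statistics can be computed directly from an edge list. A minimal sketch, assuming an undirected graph with nodes numbered from 0 and φ_u = φ_v = log(2) as defaults; the function name `network_statistics` and its signature are hypothetical:

```python
import math
from collections import Counter


def network_statistics(n, edges, phi_v=math.log(2), phi_u=math.log(2)):
    """Return (edges, gwesp, gwdegree) for an undirected graph on nodes 0..n-1."""
    nbrs = {i: set() for i in range(n)}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)

    n_edges = sum(len(v) for v in nbrs.values()) // 2
    # D_i(y): number of nodes with degree i.
    degree_counts = Counter(len(v) for v in nbrs.values())
    # EP_i(y): number of connected pairs with exactly i common neighbours.
    esp_counts = Counter(len(nbrs[a] & nbrs[b])
                         for a in range(n) for b in nbrs[a] if a < b)

    def geometrically_weighted(counts, phi, max_i):
        decay = 1.0 - math.exp(-phi)
        return math.exp(phi) * sum((1.0 - decay ** i) * counts.get(i, 0)
                                   for i in range(1, max_i + 1))

    gwesp = geometrically_weighted(esp_counts, phi_v, n - 2)
    gwdegree = geometrically_weighted(degree_counts, phi_u, n - 1)
    return n_edges, gwesp, gwdegree


triangle = network_statistics(3, [(0, 1), (1, 2), (0, 2)])
path = network_statistics(3, [(0, 1), (1, 2)])
```

In the triangle every edge has one shared partner and every node has degree 2; in the path no edge has a shared partner, so its gwesp term is zero.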

Example

Stat Comput

Table 3 Florentine marriage network: effective sample size (ESS) and performance for each algorithm for model 7 based on 100 simulations

                        ADS-AEA        AAEA-1        AAEA-2        AAEA-3
ESS                     755            753           896           833
Performance (per sec)   33             27            38            28

                        ADS-AEA + DR   AAEA-1 + DR   AAEA-2 + DR   AAEA-3 + DR
ESS                     771            1,478         1,385         1,201
Performance (per sec)   33             33            41            34

Table 4 Florentine marriage network: posterior correlation matrix between the parameters in the distribution for model 7

         θ(1)     θ(2)     θ(3)
θ(1)     1.00     −0.94    −0.80
θ(2)     –        1.00     −0.94
θ(3)     –        –        1.00

...tions between 34 members of a karate club at a US university in the 1970s.

We propose to estimate the following 3-dimensional model using the network statistics proposed by Snijders et al. (2006):

$$ q(y \mid \theta) = \exp\left\{ \theta^{(1)} s_1(y) + \theta^{(2)} v(y, \phi_v) + \theta^{(3)} u(y, \phi_u) \right\} \qquad (14) $$

where

$$ s_1(y) = \sum_{i<j} y_{ij} $$

number of edges

$$ v(y, \phi_v) = e^{\phi_v} \sum_{i=1}^{n-2} \left\{ 1 - \left(1 - e^{-\phi_v}\right)^{i} \right\} EP_i(y) $$

geometrically weighted edgewise shared partners (GWESP)

$$ u(y, \phi_u) = e^{\phi_u} \sum_{i=1}^{n-1} \left\{ 1 - \left(1 - e^{-\phi_u}\right)^{i} \right\} D_i(y) $$

geometrically weighted degrees (GWD)

where EP_i(y) and D_i(y) are the edgewise shared partners and degree distributions respectively. We set φ_u = φ_v = log(2) so that the model is a non-curved ERGM (Hunter and Handcock 2006). The prior setting is the same as the one in Sect. 3.3: p(θ) = N(0, 100 I_3). The tuning parameters for the ADS proposal are: γ = 0.9 and ϵ = N(0, 0.0025 I_d) so that the overall acceptance rate is around 21 %. The auxiliary chain consists of 100 iterations and a total number of 24,000 main iterations is used. The number of chains used in the various strategies is the same as in the previous example in Sect. 8.2.

Fig. 5 Zachary karate club network graph

Table 5 Zachary karate club network: posterior parameter estimates for model 14

                 θ(1) (edges)   θ(2) (gwesp)   θ(3) (gwdegree)
ADS-AEA
  Post. mean     −3.51          0.74           1.18
  Post. sd       0.62           0.21           1.12
AAEA-2+DR (horizontal adaptation + DR)
  Post. mean     −3.44          0.72           1.01
  Post. sd       0.59           0.21           1.07

Table 5 displays the posterior parameter estimates obtained using the ADS-AEA and AAEA-2 + DR. In this example, as happened in the teenage friendship network above, the AAEA-3 outperforms the AAEA-2 by about 40 % in terms of variance reduction but not in terms of performance. For this reason AAEA-2 is still to be preferred (Fig. 6).

In Fig. 7 it can be seen that the autocorrelations of the parameters for the AAEA-2 approach decay quicker than the autocorrelations given by the other methods as shown in Fig. 6. The AAEA-2 outperforms the ADS-AEA by about 12 % in terms of performance, whereas the AAEA-2 + DR makes a further improvement of about 20 % with respect to the AAEA-2 (see Table 6).

As in the Florentine marriage network example, we can observe (Table 7) that there is a strong negative posterior correlation between parameters θ(1) and θ(2) and between θ(1) and θ(3).

Generally a strong correlation between parameters in the posterior distribution hampers the behaviour of vanilla MCMC schemes. In fact high posterior correlation can slow down the motion of the chain towards the equilibrium distribution. It is in this case that the adaptive approximate exchange algorithm with delayed rejection (AAEA-2 + DR) gives the best performance compared to the adaptive direction sampling approximate exchange algorithm.

Estimated posterior means and standard deviations

Example

[Figure: MCMC output for Model: y ~ edges + gwesp(log(2), fixed = TRUE) + gwdegree(log(2), fixed = TRUE). For each of θ1 (edges), θ2 (gwesp.fixed.0.693147180559945) and θ3 (gwdegree), the panels show the estimated posterior density, the trace against Iterations, and the Autocorrelation against Lag (0 to 50).]

Posterior density estimates for AEA (left) and AAEA+DR (right)

Example

• Effective sample size (ESS): AAEA+DR 70% larger than AEA

• Performance (= ESS/CPU time): AAEA+DR 40% better than AEA
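
Both measures can be estimated from a single chain. The sketch below uses one common ESS estimator, N divided by one plus twice the sum of autocorrelations, truncated at the first non-positive lag; the truncation rule and the function names are assumptions, and the slides do not say which estimator produced the reported figures.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of the chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    c0 = sum((x - mean) ** 2 for x in chain) / n
    ck = sum((chain[t] - mean) * (chain[t + lag] - mean)
             for t in range(n - lag)) / n
    return ck / c0


def effective_sample_size(chain):
    """N / (1 + 2 * sum of autocorrelations), stopping at the first non-positive lag."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(chain, lag)
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def performance(ess, cpu_seconds):
    """Performance as defined on the slide: ESS per unit of CPU time."""
    return ess / cpu_seconds


alternating = [0, 1] * 50          # negatively correlated: no ESS penalty
sticky = [0] * 50 + [1] * 50       # strongly positively correlated
ess_alternating = effective_sample_size(alternating)
ess_sticky = effective_sample_size(sticky)
```

A chain that moves slowly, like `sticky`, yields far fewer effective draws than its length suggests, which is what the ESS comparison between AEA and AAEA+DR measures.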

Example

[Excerpt from "Bergm: Bayesian Exponential Random Graphs in R"]

[Figure 9: Bayesian goodness-of-fit diagnostics. Three panels: proportion of nodes by degree (0 to 18), proportion of dyads by minimum geodesic distance (1 to 9, NR), and proportion of edges by edge-wise shared partners (0 to 14).]

[Figure 10: 50 girls from the Teenage Friends and Lifestyle Study dataset. Three network plots with legends SPORT (non regular, regular), SMOKE (non, occasional, regular) and DRUGS (non - once or twice a year, once a month - once a week).]

Bayesian goodness-of-fit diagnostics: the observed network y is compared with a set of networks simulated from independent realisations of p(θ|y) in terms of high-level network statistics
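
The comparison described above can be sketched for one high-level statistic, the degree distribution. This is a toy illustration under strong assumptions: an edges-only model stands in for the ERGM simulator, the "posterior draws" are placeholder numbers rather than real MCMC output, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def degree_proportions(n, edges):
    """Proportion of nodes having each degree 0..n-1."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree[i] for i in range(n))
    return [counts[d] / n for d in range(n)]


def simulate_graph(n, theta, rng):
    """Stand-in simulator: edges-only model, independent ties with log-odds theta."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return [(a, b) for a in range(n) for b in range(a + 1, n) if rng.random() < p]


def gof_degree(n, observed_edges, theta_draws, seed=0):
    """For each degree: (observed proportion, lower band, upper band) over simulations."""
    rng = random.Random(seed)
    observed = degree_proportions(n, observed_edges)
    simulated = [degree_proportions(n, simulate_graph(n, th, rng))
                 for th in theta_draws]
    bands = []
    for d in range(n):
        values = sorted(sim[d] for sim in simulated)
        lower = values[int(0.025 * (len(values) - 1))]
        upper = values[int(0.975 * (len(values) - 1))]
        bands.append((observed[d], lower, upper))
    return bands


observed_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # toy 5-cycle
theta_draws = [-0.5 + 0.01 * k for k in range(100)]          # placeholder draws
bands = gof_degree(5, observed_edges, theta_draws)
```

An observed proportion falling outside its simulated band flags a feature of the network that the fitted model does not reproduce.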

Conclusions

• NO LHD

• EXACT NON-PARAMETRIC LHD

• APPROXIMATE LHD
