A central limit theorem for an omnibus embedding …dml.cs.byu.edu/icdm17ws/Keith.pdfA central limit...

A central limit theorem for an omnibus embedding of random dot product graphs Keith Levin 1 with Avanti Athreya 2 , Minh Tang 2 , Vince Lyzinski 3 and Carey E. Priebe 2 1 University of Michigan, 2 Johns Hopkins University, 3 University of Massachusetts Amherst November 18, 2017

A central limit theorem for an omnibus embedding of random dot product graphs

Keith Levin¹

with Avanti Athreya², Minh Tang², Vince Lyzinski³ and Carey E. Priebe²

¹University of Michigan, ²Johns Hopkins University, ³University of Massachusetts Amherst

November 18, 2017

Classical two-sample hypothesis testing

Well-studied in statistics (indeed, the only thing we teach undergrads?)

K. Levin (U.Michigan,JHU,UMass) A CLT for omnibus embeddings November 18, 2017 2 / 20

Graph Hypothesis Testing

Q: how to tell if two (or more) graphs are from the same distribution?

Random Dot Product Graph (RDPG; Young and Scheinerman, 2007)

Extends stochastic block model (SBM)

Vertices assigned latent positions

drawn i.i.d. from $d$-dimensional distribution $F$

$F$ constrained so that $0 \le x^T y \le 1$ whenever $x, y \in \operatorname{supp} F$

Denote $i$-th latent position by $X_i \in \mathbb{R}^d$

Edges $\{i, j\}$ present or absent independently with probability $X_i^T X_j$.

Collect latent positions in rows of $X \in \mathbb{R}^{n \times d}$.

Warning: Non-identifiability

Model specified only up to orthogonal rotation of latent positions.
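A minimal sketch of drawing one graph from this model may help make the definition concrete. The function name `sample_rdpg`, the two-point distribution used for $F$, and the choice to exclude self-loops are all assumptions made for illustration; none of them come from the slides.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Sample a symmetric adjacency matrix with edge probabilities X[i] @ X[j]."""
    P = X @ X.T
    if P.min() < 0 or P.max() > 1:
        raise ValueError("latent positions must give probabilities in [0, 1]")
    n = X.shape[0]
    # Draw each edge {i, j} with i < j independently, then mirror it.
    # Assumption: no self-loops, so the diagonal stays zero.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
# Hypothetical F: a mixture of two point masses in R^2 (a two-block SBM).
centers = np.array([[0.8, 0.1], [0.1, 0.6]])
X = centers[rng.integers(0, 2, size=200)]
A = sample_rdpg(X, rng)
```

Because $F$ here is supported on two points, the sampled graph is a two-block SBM, which illustrates the sense in which the RDPG extends that model.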

Estimating latent positions: adjacency spectral embedding (Sussman et al., 2012)

Definition (Adjacency Spectral Embedding (ASE))

Given adjacency matrix $A$ with eigendecomposition $A = U S U^T$, embed the vertices of $A$ into $\mathbb{R}^d$ as the rows of $\hat{X} = U_d S_d^{1/2} \in \mathbb{R}^{n \times d}$, where $U_d$ denotes the first $d$ columns of $U$ and $S_d$ denotes the truncation of $S$ to the top $d$ eigenvalues.

Under RDPG, $\exists W$: $\max_{1 \le i \le n} \| \hat{X}_i - W X_i \| = O_P(n^{-1/2} \log n)$.

Lyzinski et al. (2014): ASE yields a.a.s. perfect recovery of block memberships in SBM
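The ASE definition above translates almost directly into a few lines of NumPy. This is a sketch under stated assumptions: the name `ase` is illustrative, "top" is taken to mean algebraically largest eigenvalues, and negative eigenvalues are clipped to zero before taking the square root.

```python
import numpy as np

def ase(A, d):
    """Adjacency spectral embedding: rows of U_d S_d^{1/2}."""
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]     # indices of the d largest eigenvalues
    # Assumption: clip at zero so the square root is defined
    # even if one of the top d eigenvalues is negative.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Noise-free check: embedding P = X X^T recovers X up to an orthogonal matrix,
# so the Gram matrix of the embedding reproduces P.
X = np.array([[0.8, 0.1], [0.1, 0.6], [0.5, 0.5], [0.3, 0.2]])
P = X @ X.T
X_hat = ase(P, 2)
gram_matches = np.allclose(X_hat @ X_hat.T, P)
```

The check compares Gram matrices rather than `X_hat` and `X` directly because the embedding is only determined up to the orthogonal matrix $W$.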

RDPG: what do we mean by same distribution?

Option 1: Test if latent positions are drawn from same distribution.

$G_1$ positions drawn i.i.d. $F_1$, $G_2$ positions drawn i.i.d. $F_2$

Test if $F_1 = F_2$

“Nonparametric” testing

Tang, Athreya, Sussman, Lyzinski and Priebe (2017)

Estimate latent positions of $G_1$ and $G_2$ via ASE, apply maximum mean discrepancy (Gretton et al., 2012) to ASE estimates.
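As a rough illustration of the second step, here is a sketch of the biased (V-statistic) estimate of squared maximum mean discrepancy between two point clouds. The Gaussian kernel, the bandwidth value, and the function names are assumptions; the sketch also ignores the rotational non-identifiability of the embeddings and the calibration of a rejection threshold, both of which a real test has to handle.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2_biased(x, y, bandwidth=0.5):
    """Biased estimate of squared MMD between samples x and y."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))   # stand-in for ASE estimates from G_1
y = x + 2.0                    # stand-in for estimates from a shifted F_2
same = mmd2_biased(x, x)       # zero up to rounding
shifted = mmd2_biased(x, y)    # clearly positive
```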

RDPG: what do we mean by same distribution?

Option 2: Test if latent positions are the same

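$G_1$ latent positions $X \in \mathbb{R}^{n \times d}$, $G_2$ latent positions $Y \in \mathbb{R}^{n \times d}$

Test if $X = YW$ for some unitary $W$.

“Semiparametric” testing

Tang, Athreya, Sussman, Lyzinski and Priebe (2015)

Embed both graphs via ASE, align estimated positions via Procrustes analysis (Gower, 1975). Reject $H_0$ if alignment is poor, i.e., if $T_{\mathrm{Proc}} = \min_{W \in \mathcal{U}_d} \| \hat{X} - \hat{Y} W \|_F$ is large, where $\mathcal{U}_d$ is the set of $d \times d$ unitary matrices.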
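The Procrustes statistic has a closed form via the singular value decomposition, sketched below. The function name `procrustes_stat` is illustrative; the code assumes real-valued embeddings, so "unitary" means orthogonal here.

```python
import numpy as np

def procrustes_stat(X_hat, Y_hat):
    """Return min over orthogonal W of ||X_hat - Y_hat W||_F, and the minimizer."""
    # If Y_hat^T X_hat = U S V^T, the minimizing W is U V^T.
    U, _, Vt = np.linalg.svd(Y_hat.T @ X_hat)
    W = U @ Vt
    return np.linalg.norm(X_hat - Y_hat @ W), W

# Check: if Y is an exact rotation of X, the statistic is zero
# and the recovered W undoes the rotation.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = X @ Q.T                      # so that X = Y Q
t_proc, W = procrustes_stat(X, Y)
```

With noisy embeddings the fitted `W` is itself a random quantity, which is one way to see where the extra variance discussed on the next slide comes from.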

Challenges in semiparametric graph testing

Problem 1: Procrustes alignment introduces variance

More variance ⇒ less power.

Problem 2: How to generalize to multiple-graph hypothesis testing?

Ultimately, we want something like ANOVA for graphs.

Goal: develop a technique that...

1. Avoids Procrustes alignment

2. Generalizes naturally to 3 or more graphs

Omnibus matrix: motivation

Definition (Omnibus matrix)

Let graphs $G_1$ and $G_2$ be $d$-dimensional RDPGs with adjacency matrices $A^{(1)}$ and $A^{(2)}$. We construct an omnibus matrix for the graphs as

$$M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}$$

Note: generalizes naturally to $m$ graphs, with $(i, j)$-block $(A^{(i)} + A^{(j)})/2$.
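The $m$-graph construction in the note can be sketched with NumPy broadcasting. The function name `omnibus_matrix` and the tiny example matrices are illustrative assumptions.

```python
import numpy as np

def omnibus_matrix(graphs):
    """Build the mn-by-mn matrix whose (i, j) block is (A_i + A_j) / 2."""
    A = np.stack(graphs)                          # shape (m, n, n)
    m, n, _ = A.shape
    blocks = (A[:, None] + A[None, :]) / 2.0      # shape (m, m, n, n)
    # Reorder axes to (block row, row in block, block col, col in block)
    # so the reshape lays the blocks out in a grid.
    return blocks.transpose(0, 2, 1, 3).reshape(m * n, m * n)

A1 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0]])
A2 = np.array([[0.0, 0.0, 1.0],
               [0.0, 0.0, 1.0],
               [1.0, 1.0, 0.0]])
M = omnibus_matrix([A1, A2])
```

The diagonal blocks reduce to the individual adjacency matrices because $(A^{(i)} + A^{(i)})/2 = A^{(i)}$.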

Omnibus embedding

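Reminder

$$M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}$$

Under $H_0$, we have $\mathbb{E}A^{(1)} = \mathbb{E}A^{(2)} = XX^T = P = U_P S_P U_P^T$

$S_P \in \mathbb{R}^{d \times d}$ diagonal, $U_P \in \mathbb{R}^{n \times d}$ orthonormal columns

$$\mathbb{E}M = \tilde{P} = \begin{bmatrix} P & P \\ P & P \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P \begin{bmatrix} U_P^T & U_P^T \end{bmatrix} = \begin{bmatrix} X \\ X \end{bmatrix} \begin{bmatrix} X^T & X^T \end{bmatrix} = U_{\tilde{P}} S_{\tilde{P}} U_{\tilde{P}}^T.$$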
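The factorization of $\mathbb{E}M$ can be checked numerically. One detail worth spelling out: the stacked matrix of two copies of $U_P$ has columns of norm $\sqrt{2}$, so the orthonormal eigenvectors of $\mathbb{E}M$ are the stacked copies divided by $\sqrt{2}$, and its nonzero eigenvalues are twice those of $P$. The latent positions below are hypothetical values chosen only so that all inner products lie in $[0, 1]$.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 2
X = rng.uniform(0.1, 0.6, size=(6, d))      # hypothetical latent positions
P = X @ X.T
P_tilde = np.block([[P, P], [P, P]])        # E M under H_0

Z = np.vstack([X, X])                       # stacked true positions
top_P = np.sort(np.linalg.eigvalsh(P))[::-1][:d]
top_P_tilde = np.sort(np.linalg.eigvalsh(P_tilde))[::-1][:d]

print(np.allclose(P_tilde, Z @ Z.T))        # E M equals Z Z^T
print(np.allclose(top_P_tilde, 2 * top_P))  # eigenvalues double
print(np.linalg.matrix_rank(P_tilde))       # rank stays d
```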

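Key point

Applying ASE to $M$, we get a $2n$-by-$d$ matrix,

$$\hat{Z} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix},$$

where $\hat{X}, \hat{Y} \in \mathbb{R}^{n \times d}$ provide estimates of the latent positions of $G_1$, $G_2$ in the same $d$-dimensional space without an additional alignment step. Natural test statistic given by $T_{\mathrm{Omni}} = \| \hat{X} - \hat{Y} \|_F$.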
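Putting the pieces together, the omnibus statistic for two graphs can be sketched as follows. The helper names `ase` and `omni_test_statistic` are illustrative, and the embedding uses the algebraically largest eigenvalues with clipping at zero, which is an assumption rather than something the slides specify.

```python
import numpy as np

def ase(A, d):
    """Rows of U_d S_d^{1/2} for the d largest eigenvalues of A."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:d]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def omni_test_statistic(A1, A2, d):
    """Embed the omnibus matrix and return ||X_hat - Y_hat||_F with both estimates."""
    n = A1.shape[0]
    avg = (A1 + A2) / 2.0
    M = np.block([[A1, avg], [avg, A2]])
    Z_hat = ase(M, d)
    X_hat, Y_hat = Z_hat[:n], Z_hat[n:]   # first n rows: G_1, last n rows: G_2
    return np.linalg.norm(X_hat - Y_hat), X_hat, Y_hat

# Noise-free check: if both "graphs" equal P = X X^T, the two halves coincide.
X = np.array([[0.8, 0.1], [0.1, 0.6], [0.5, 0.5], [0.3, 0.2]])
P = X @ X.T
t_omni, X_hat, Y_hat = omni_test_statistic(P, P, d=2)
```

In practice a critical value for the statistic still has to be chosen, for example by simulation under the null; that step is not shown here.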

Main results: Notational preliminaries

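In what follows, we assume the null hypothesis

So $G_1$ and $G_2$ have shared latent positions $X \in \mathbb{R}^{n \times d}$.

$\mathbb{E}A^{(1)} = \mathbb{E}A^{(2)} = P = U_P S_P U_P^T = XX^T \in \mathbb{R}^{n \times n}$

We denote the “true latent positions” of $M$ by

$$Z = \begin{bmatrix} X \\ X \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P^{1/2} = U_{\tilde{P}} S_{\tilde{P}}^{1/2} \in \mathbb{R}^{2n \times d}$$

and their estimates by

$$\hat{Z} = U_M S_M^{1/2} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix} \in \mathbb{R}^{2n \times d}$$

where $S_M \in \mathbb{R}^{d \times d}$ is the diagonal matrix of the top $d$ eigenvalues of $M$, with the corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{2n \times d}$.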

Main results: Concentration inequality

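Lemma (Uniform concentration of estimates)

Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$ and let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix with top eigenvalues collected in diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. There exists a constant $C > 0$ such that with high probability, there exists an orthogonal matrix $W \in \mathbb{R}^{d \times d}$ such that

$$\max_{1 \le h \le mn} \left\| \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W \right)_{h,\cdot} \right\| \le \frac{C m^{1/2} \log mn}{\sqrt{n}}.$$

Here $U_{\tilde{P}} S_{\tilde{P}}^{1/2} \in \mathbb{R}^{mn \times d}$ denotes the true latent positions of $M$, i.e., $m$ stacked copies of $X$.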

Main results: CLT

Theorem (CLT: informally)

Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, whose rows are drawn i.i.d. from $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix with top eigenvalues collected in diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$, so that row $h$ corresponds to vertex $i$ of graph $s$. Then the error between the $h$-th position estimate and the (properly rotated) true $h$-th position is asymptotically a continuous mixture of normals, with mixing determined by $F$.

$$n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \to \int N(0, \Sigma(y)) \, dF(y).$$

Main results: CLT

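Theorem (CLT: More formally)

Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, whose rows are drawn i.i.d. from $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix with top eigenvalues collected in diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Let $\Phi(x, \Sigma)$ denote the cdf of a multivariate Gaussian with mean 0 and covariance matrix $\Sigma$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$. There exists a sequence of $d$-by-$d$ orthogonal matrices $(W_n)_{n=1}^\infty$ such that for all $x \in \mathbb{R}^d$,

$$\lim_{n \to \infty} \Pr\left[ n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \le x \right] = \int \Phi(x, \Sigma(y)) \, dF(y),$$

where $\Sigma(y) = (m + 3) \Delta^{-1} \tilde{\Sigma}(y) \Delta^{-1} / (4m)$ and

$$\Delta = \mathbb{E}_F\left[ X_1 X_1^T \right], \qquad \tilde{\Sigma}(y) = \mathbb{E}_F\left[ \left( y^T X_1 - (y^T X_1)^2 \right) X_1 X_1^T \right].$$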
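To get a feel for the limiting covariance, the expectations defining $\Delta$ and the inner covariance term can be replaced by sample averages over draws from $F$. This is a plug-in sketch: the function name `omni_limit_covariance` and the point-mass example are assumptions made for illustration.

```python
import numpy as np

def omni_limit_covariance(y, samples, m):
    """Plug-in estimate of (m + 3) Delta^{-1} Sigma_tilde(y) Delta^{-1} / (4 m)."""
    X = np.asarray(samples, dtype=float)     # rows are draws X_1 ~ F
    y = np.asarray(y, dtype=float)
    delta = X.T @ X / X.shape[0]             # estimates E[X_1 X_1^T]
    p = X @ y                                # y^T X_1 for each draw
    # Estimates E[(y^T X_1 - (y^T X_1)^2) X_1 X_1^T].
    sigma_tilde = (X * (p - p ** 2)[:, None]).T @ X / X.shape[0]
    delta_inv = np.linalg.inv(delta)
    return (m + 3) / (4.0 * m) * delta_inv @ sigma_tilde @ delta_inv

# Hypothetical F: a point mass at 0.5 in one dimension.
cov_m1 = omni_limit_covariance([0.5], [[0.5]], m=1)
cov_m2 = omni_limit_covariance([0.5], [[0.5]], m=2)
```

The factor $(m + 3)/(4m)$ equals 1 when $m = 1$ and decreases toward $1/4$ as $m$ grows, so in this example the two-graph covariance is $5/8$ of the single-graph one.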

Experiments: hypothesis testing

[Figure with three panels (a), (b), (c). x-axis: Number of vertices (log scale); y-axis: Empirical Power; legend (Method): Omnibus, Procrustes.]

Figure: Power of the Procrustes-based (blue) and omnibus-based (green) tests to detect when the two graphs being tested differ in (a) one, (b) five, and (c) ten of their latent positions. Each point is the proportion of 1000 trials for which the given technique correctly rejected the null hypothesis. Error bars denote two standard errors of this empirical mean.

Experiments: estimating latent positions

[Figure. x-axis: Number of vertices (log scale); y-axis: Mean Squared Error (log scale); legend (Method): Abar, ASE1, OMNI, OMNIbar, PROCbar.]

Figure: Mean squared error (MSE) in recovery of latent positions (up to rotation) in a 2-graph RDPG model as a function of the number of vertices for different estimation procedures.

Future Work

Develop graph analogues of ANOVA and other multiple hypothesis testing procedures

Improve techniques for choosing critical value in omnibus test

Improve understanding of power under $H_A$

Thanks!

Full paper: https://arxiv.org/abs/1705.09355
