A Theoretical Justification of Link Prediction Heuristics

1

A Theoretical Justification of Link Prediction Heuristics

Deepayan Chakrabarti ([email protected])

Purnamrita Sarkar Andrew Moore

Link Prediction

Which pair of nodes {i,j} should be connected?

Alice

Bob

Charlie

Goal: Recommend a movie

2

Link Prediction

Which pair of nodes {i,j} should be connected?

Goal: Suggest friends

3

Link Prediction Heuristics

Predict link between nodes Connected by the shortest path With the most common neighbors (length 2 paths) More weight to low-degree common nbrs

(Adamic/Adar)

3 followers

1000

followers

Prolific common friends

Less evidence

Less prolific

Much more evidence

Alice

Bob

Charlie

Link Prediction Heuristics

Predict link between nodes Connected by the shortest path With the most common neighbors (length 2 paths) More weight to low-degree common nbrs (Adamic/Adar) With more short paths (e.g. length 3 paths )

exponentially decaying weights to longer paths (Katz measure)

…

Previous Empirical Studies*

Random Shortest Path

Common Neighbors

Adamic/Adar Ensemble of short paths

Link

pre

dict

ion

accu

racy

*

*Liben-Nowell & Kleinberg, 2003; Brand, 2005; Sarkar & Moore, 2007

How do we justify these observations?

Especially if the graph is sparse

6

Link Prediction – Generative Model

Unit volume universe

Model:1. Nodes are uniformly distributed points in a latent space

2. This space has a distance metric

3. Points close to each other are likely to be connected in the graph

Logistic distance function (Raftery+/2002)

7

8

1

½

Higher probability of linking

radius r

α determines the steepness

Link prediction ≈ find nearest neighbor who is not currently linked to the node.

Equivalent to inferring distances in the latent space

Link Prediction – Generative Model

Model:1. Nodes are uniformly distributed points in a latent space

2. This space has a distance metric

3. Points close to each other are likely to be connected in the graph



Common Neighbors


Link

pre

dict

ion

accu

racy

*



9

Common Neighbors

Pr2(i,j) = Pr(common neighbor|dij)

jkikijjkikjkik2 dd)d|d,d()d|~Pr()d|~Pr(j)(i,Pr Pkjki

Product of two logistic probabilities, integrated over a volume determined by dij

i j

10

As α∞ Logistic Step function

Much easier to analyze!

Common Neighbors

11

Everyone has same radius r

i j

)dr,A(r,j)(i,Pr ij2

# common nbrs gives a bound

on distance

DD

rr/2

ij

/1

ij

V(r)

εη/N12d

V(r)

εη/N12

21εN

η)dr,A(r,ε

N

ηP

η=Number of common

neighbors

V(r)=volume of radius r in

D dims

Unit volume universe

Common Neighbors

OPT = node closest to i MAX = node with max common neighbors with i

Theorem:

w.h.p

Link prediction by common neighbors is asymptotically optimal

dOPT ≤ dMAX ≤ dOPT + 2[ε/V(1)]1/D

12

Common Neighbors: Distinct Radii Node k has radius rk .

ik if dik ≤ rk (Directed graph) rk captures popularity of node k

13

Type 2: i k j

rk rk

A(rk , rk ,dij)

i jk

Type 1: i k j

rirj

A(ri , rj ,dij)

i jk

i

rk

k

j

m

Type 2 common neighbors

i j

kη1 ~ Bin[N1 , A(r1, r1,

dij)]

η2 ~ Bin[N2 , A(r2, r2, dij)]

Example graph:

N1 nodes of radius r1 and N2 nodes of radius r2

r1 << r2

Pick d* to maximize Pr[η1 , η2 | dij]

w(r1) E[η1|d*] + w(r2) E[η2|d*] = w(r1)η1 + w(r2) η2

Weighted common neighbors

Inversely related to d*

Common Neighbors: Distinct Radii Node k has radius rk .

ik if dik ≤ rk (Directed graph) rk captures popularity of node k

“Weighted” common neighbors: Predict (i,j) pairs with highest Σ w(r)η(r)

Weight for nodes of radius r

# common neighbors of radius r

i

rk

k

j

m

15

Type 2 common neighbors

r is close to max radius

D1

deg

const

r

constw(r)

Real world graphs generally fall in this range

i

rk

k

j

Presence of common neighbor is very informative

Absence is very informative

Adamic/Adar

1/r

16



Common Neighbors


Link

pre

dict

ion

accu

racy

*



17

l hop Paths

Common neighbors = 2 hop paths

Analysis of longer paths: two components1. Bounding E(ηl | dij). [ηl = # l hop paths]

Bounds Prl (i,j) by using triangle inequality on a series of common neighbor probabilities.

2. ηl ≈ E(ηl | dij)

Triangulation

l hop Paths


Analysis of longer paths: two components1. Bounding E(ηl | dij). [ηl = # l hop paths]

Bounds Prl (i,j) by using triangle inequality on a series of common neighbor probabilities.

2. ηl ≈ E(ηl | dij) Bounded dependence of ηl on position of each node

Can use McDiarmid’s inequality to bound

|ηl - E(ηl|dij)|

ℓ-hop Paths


For longer paths:

Bounds are weaker For ℓ’ ≥ ℓ we need ηℓ’ >> ηℓ to obtain similar bounds

justifies the exponentially decaying weight given to longer paths by the Katz measure

δN,,ηg-11)r(rdij

20

Summary

Three key ingredients

1. Closer points are likelier to be linked. Small World Model- Watts, Strogatz, 1998, Kleinberg 2001

2. Triangle inequality holds necessary to extend to ℓ-hop paths

3. Points are spread uniformly at random Otherwise properties will depend on location as well as distance

21

Summary


Common Neighbors


Link

pre

dict

ion

accu

racy

*


The number of paths matters, not the

length

For large dense graphs, common neighbors are

enough

Differentiating between different degrees is

important

In sparse graphs, paths of length 3 or

more help in prediction.

22

Sweep Estimators

Qr = Fraction of nodes with radius ≤ r which are

common neighbors

TR = Fraction of nodes with radius ≥ R which

are common neighbors

Number of common

neighbors of a given radius r

Large Qr small dijSmall TR large dij

A Theoretical Justification of Link Prediction Heuristics

Documents

Transcript of A Theoretical Justification of Link Prediction Heuristics