Link Prediction - ETH Zürich · PDF...

Lloyd Sanders, Olivia Woolley, Dirk Helbing
Computational Social Science (COSS)

Link Prediction


Overview

- What is link prediction?
- What are some examples of link prediction?
- A how-to
  - The setup
  - Local methods
  - Global methods
- Challenges of link prediction
- References

Statement of the Problem

Abstractly: given a snapshot of a network, can one predict the links most likely to form next?

Example network: Zachary Karate Club

http://ifisc.uib-csic.es/~jramasco/ComplexNets.html


Link Prediction in Industry

Proposing dates

Proposing Friendships

Proposing items to purchase


Link Prediction in Science

Figure: artist's rendition of the human metabolic network (source: Wikipedia)

Investigating links in biological networks experimentally is costly and time-consuming.


Link Prediction

RESEARCH ARTICLE

Link Prediction in Criminal Networks: A Tool for Criminal Intelligence Analysis

Giulia Berlusconi¹, Francesco Calderoni¹*, Nicola Parolini², Marco Verani², Carlo Piccardi³*

1 Università Cattolica del Sacro Cuore and Transcrime, Milano, Italy, 2 MOX, Department of Mathematics, Politecnico di Milano, Milano, Italy, 3 Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy

* [email protected] (FC); [email protected] (CP)

Abstract

The problem of link prediction has recently received increasing attention from scholars in network science. In social network analysis, one of its aims is to recover missing links, namely connections among actors which are likely to exist but have not been reported because data are incomplete or subject to various types of uncertainty. In the field of criminal investigations, problems of incomplete information are encountered almost by definition, given the obvious anti-detection strategies set up by criminals and the limited investigative resources. In this paper, we work on a specific dataset obtained from a real investigation, and we propose a strategy to identify missing links in a criminal network on the basis of the topological analysis of the links classified as marginal, i.e. removed during the investigation procedure. The main assumption is that missing links should have opposite features with respect to marginal ones. Measures of node similarity turn out to provide the best characterization in this sense. The inspection of the judicial source documents confirms that the predicted links, in most instances, do relate actors with large likelihood of co-participation in illicit activities.

Introduction

Criminal intelligence analysis aims at supporting investigations, e.g. by producing link charts to identify and target key actors. Law enforcement agencies increasingly use Social Network Analysis (SNA) for criminal intelligence, analyzing the relations among individuals based on information on activities, events, and places derived from various investigative activities [1–3]. SNA provides added value compared to more traditional approaches like link analysis, by enabling in-depth assessment of the internal structure of criminal groups and by providing strategic and tactical advantages. For instance, SNA can inform law enforcement officers in the identification of aliases during large investigations and in the collection of evidence for prosecution [2]. Furthermore, the network analysis of criminal groups under investigation may help identify effective strategies to achieve network destabilization or disruption [3, 4].

PLOS ONE | DOI:10.1371/journal.pone.0154244 | April 22, 2016

OPEN ACCESS

Citation: Berlusconi G, Calderoni F, Parolini N, Verani M, Piccardi C (2016) Link Prediction in Criminal Networks: A Tool for Criminal Intelligence Analysis. PLoS ONE 11(4): e0154244. doi:10.1371/journal.pone.0154244

Editor: Daniele Marinazzo, Universiteit Gent, BELGIUM

Received: November 3, 2015

Accepted: April 11, 2016

Published: April 22, 2016

Copyright: © 2016 Berlusconi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All network data are available at Figshare (https://dx.doi.org/10.6084/m9.figshare.3156067).

Funding: The Polisocial Award program is the only funding source and it supports the authors NP and MV. No other fund was available. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

Possible missing links predicted between suspects in organized crime

Link Prediction

Link prediction is used to predict possible future links in a network (e.g., Facebook). It can also be used to predict links that are missing because of incomplete data (e.g., food webs).

Statement of the Problem

Abstractly: given a snapshot of a network, can one predict the links most likely to form next?

Graph:

$$G = (V, E)$$

Snapshots of the graph at times $t$ and $t'$, over the same node set:

$$G_t = (V_t, E_t), \qquad G_{t'} = (V_t, E_{t'})$$

Edge, defined by two nodes:

$$e = (x, y)$$

Similarity score of two nodes:

$$s_{xy}$$

How To: The Setup

- Take a given graph.
- Split the data into a training set and a validation set.
- Choose a link prediction algorithm.
- Run the algorithm on the training set, and test it on the validation set.
- Check the accuracy.
- Compare with other link prediction algorithms.
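The splitting step above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `split_edges`, the `probe_fraction` parameter, and the toy edge list are all assumptions made for the example.

```python
import random

def split_edges(edges, probe_fraction=0.1, seed=0):
    """Randomly split an undirected edge list into training and probe sets."""
    rng = random.Random(seed)
    # Store each undirected edge as a sorted tuple so (x, y) and (y, x) coincide.
    shuffled = sorted(tuple(sorted(e)) for e in edges)
    rng.shuffle(shuffled)
    n_probe = max(1, int(round(probe_fraction * len(shuffled))))
    probe = set(shuffled[:n_probe])
    train = set(shuffled[n_probe:])
    return train, probe

# Hypothetical toy graph, used only for illustration.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
         (4, 5), (5, 6), (4, 6), (1, 4), (3, 5)]
train, probe = split_edges(edges, probe_fraction=0.2, seed=42)
print(len(train), len(probe))
```

A prediction algorithm would then see only `train`, and its output would be scored against `probe`.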

How To: The Setup

Split the edges into a training set $E^T$ and a test (probe) set $E^P$:

$$E = E^T \cup E^P$$

The set of all possible edges on the graph, $U$, has size

$$|U| = \frac{|V|(|V| - 1)}{2}$$

How To: The Setup (quick note)

Our algorithms consider only edges between nodes that are present in both the training set and the probe set. This common node set is usually taken to be the giant component of the graph.

This assumes the graph is static: no new nodes enter the system.

By definition, the algorithm will not predict edges for nodes outside the giant component. More about this later.

Figure source: Networkx.github.io
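As a rough sketch of the restriction to the giant component, the following pure-Python breadth-first search finds the largest connected component and keeps only edges inside it. The helper name `giant_component` and the toy edge list are assumptions for illustration; a graph library would normally provide this.

```python
from collections import defaultdict, deque

def giant_component(edges):
    """Return the node set of the largest connected component (undirected graph)."""
    adj = defaultdict(set)
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in component:
                    component.add(nb)
                    queue.append(nb)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical graph: a connected cluster {1, 2, 3, 4} plus an isolated pair {7, 8}.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (7, 8)]
nodes = giant_component(edges)
kept = [(x, y) for x, y in edges if x in nodes and y in nodes]
print(sorted(nodes), kept)
```

In this toy case the pair (7, 8) is dropped, so no prediction would ever be made for nodes 7 or 8.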

How To: Output of the Algorithm

The link prediction algorithm outputs a list of candidate edges, ranked in descending order of how likely each is to appear.

The probe set lies within the set of edges not observed in training:

$$E^P \subseteq U - E^T$$

List $L$ output by the algorithm, with links ranked from most likely to least likely to form on the graph:

$$L : e_L \in U - E^T$$

Taking the first $n$ links from the list, where $n = |E^P|$ is the size of the probe set, and counting how many of them are in the probe set gives a simple measure of accuracy.
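This accuracy measure can be written down directly. The sketch below divides the size of the intersection by $n$ to give a fraction; the function name `precision_at_n` and the ranked list are hypothetical and chosen only to illustrate the calculation.

```python
def precision_at_n(ranked_links, probe, n):
    """Fraction of the top-n predicted links that appear in the probe set."""
    top = [tuple(sorted(e)) for e in ranked_links[:n]]
    probe = {tuple(sorted(e)) for e in probe}
    hits = sum(1 for e in top if e in probe)
    return hits / n

# Hypothetical ranking (best first) and probe set.
ranked = [(1, 4), (4, 5), (1, 3), (3, 4), (1, 2)]
probe = {(1, 3), (4, 5)}
print(precision_at_n(ranked, probe, n=len(probe)))
```

With these inputs only one of the top two predictions, (4, 5), is in the probe set.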

Local Methods: a basic example

Links are predicted based solely on a node's local contact structure. The main idea is triangle closing: two nodes that share a neighbor are likely to link to each other.

How To: Similarity Measures

Excerpt from L. Lü, T. Zhou / Physica A 390 (2011) 1150–1170, p. 1153:

$s_{45} = 0.6$. To calculate AUC, we need to compare the scores of a probe link and a nonexistent link. There are six pairs in total: $s_{13} > s_{12}$, $s_{13} < s_{14}$, $s_{13} = s_{34}$, $s_{45} > s_{12}$, $s_{45} = s_{14}$ and $s_{45} > s_{34}$. Hence, the AUC value equals $(3 \times 1 + 2 \times 0.5)/6 \approx 0.67$. For precision, if $L = 2$, the predicted links are (1, 4) and (4, 5). Clearly, the former is wrong while the latter is right, and thus the precision equals 0.5.
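The AUC calculation in this excerpt can be reproduced with a short sketch. Only $s_{45} = 0.6$ is given in the excerpt; the other score values below are assumptions chosen to satisfy the six stated comparisons, and the function name `auc` is likewise illustrative.

```python
def auc(probe_scores, nonexistent_scores):
    """AUC by exhaustive comparison: (wins + 0.5 * ties) / number of pairs."""
    higher = ties = 0
    for sp in probe_scores:
        for sn in nonexistent_scores:
            if sp > sn:
                higher += 1
            elif sp == sn:
                ties += 1
    n = len(probe_scores) * len(nonexistent_scores)
    return (higher + 0.5 * ties) / n

# Probe links (1,3), (4,5) and nonexistent links (1,2), (1,4), (3,4).
# Assumed scores consistent with the comparisons quoted in the excerpt.
probe_scores = {(1, 3): 0.5, (4, 5): 0.6}
nonexistent_scores = {(1, 2): 0.4, (1, 4): 0.6, (3, 4): 0.5}
value = auc(list(probe_scores.values()), list(nonexistent_scores.values()))
print(round(value, 2))
```

Three comparisons are wins and two are ties, which gives $(3 + 2 \times 0.5)/6$.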

3. Similarity-based algorithms

The simplest framework of link prediction methods is the similarity-based algorithm, where each pair of nodes, $x$ and $y$, is assigned a score $s_{xy}$, which is directly defined as the similarity (or called proximity in the literature) between $x$ and $y$. All non-observed links are ranked according to their scores, and the links connecting more similar nodes are supposed to be of higher existence likelihoods. In despite of its simplicity, the study on similarity-based algorithms is the mainstream issue. In fact, the definition of node similarity is a nontrivial challenge. Similarity index can be very simple or very complicated and it may work well for some networks while fails for some others. In addition, the similarities can be used in a more skilled way, such as being locally integrated under the collaborative filtering⁵ framework [34].

Node similarity can be defined by using the essential attributes of nodes: two nodes are considered to be similar if they have many common features [35]. However, the attributes of nodes are generally hidden, and thus we focus on another group of similarity indices, called structural similarity, which is based solely on the network structure. The structural similarity indices can be classified in various ways, such as local vs. global, parameter-free vs. parameter-dependent, node-dependent vs. path-dependent, and so on. The similarity indices can also be sophisticatedly classified as structural equivalence and regular equivalence. The former embodies a latent assumption that the link itself indicated a similarity between two endpoints (see, for example, the Leicht–Holme–Newman index [36] and transferring similarity [37]), while the latter assumes that two nodes are similar if their neighbors are similar. Readers are encouraged to see Ref. [38] for the mathematical definition of regular equivalence and Ref. [39] for a recent application on the prediction of protein functions.

Here we adopt the simplest method, where 20 similarity indices are classified into three categories: the former 10 are local indices, followed by 7 global indices, and the last 3 are quasi-local indices, which do not require global topological information but make use of more information than local indices.

3.1. Local similarity indices

(1) Common Neighbors (CN). For a node $x$, let $\Gamma(x)$ denote the set of neighbors of $x$. In common sense, two nodes, $x$ and $y$, are more likely to have a link if they have many common neighbors. The simplest measure of this neighborhood overlap is the directed count, namely

$$s^{\mathrm{CN}}_{xy} = |\Gamma(x) \cap \Gamma(y)|, \tag{2}$$

where $|Q|$ is the cardinality of the set $Q$. It is obvious that $s_{xy} = (A^2)_{xy}$, where $A$ is the adjacency matrix: $A_{xy} = 1$ if $x$ and $y$ are directly connected and $A_{xy} = 0$ otherwise. Note that, $(A^2)_{xy}$ is also the number of different paths with length 2 connecting $x$ and $y$. Newman [40] used this quantity in the study of collaboration networks, showing a positive correlation between the number of common neighbors and the probability that two scientists will collaborate in the future. Kossinets and Watts [14] analyzed a large-scale social network, suggesting that two students having many mutual friends are very probable to be friends in future. The following six indices are also based on the number of common neighbors, yet with different normalization methods.

(2) Salton Index [6]. It is defined as

$$s^{\mathrm{Salton}}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{\sqrt{k_x \times k_y}}, \tag{3}$$

where $k_x$ is the degree of node $x$. The Salton index is also called the cosine similarity in the literature.

(3) Jaccard Index [41]. This index was proposed by Jaccard over a hundred years ago, and is defined as

$$s^{\mathrm{Jaccard}}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}. \tag{4}$$

(4) Sørensen Index [42]. This index is used mainly for ecological community data, and is defined as

$$s^{\mathrm{Sørensen}}_{xy} = \frac{2|\Gamma(x) \cap \Gamma(y)|}{k_x + k_y}. \tag{5}$$

⁵ Collaborative filtering is the process of filtering for information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc. [33].

Common Neighbours

Counting number of paths of a certain length
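Indices (1) to (4) from the excerpt translate almost directly into set operations. The following is a minimal sketch under the assumption that the graph is given as an undirected edge list; the helper names `neighbors` and `local_scores` and the toy graph are hypothetical.

```python
import math

def neighbors(edges):
    """Build the neighbor sets Gamma(x) from an undirected edge list."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    return adj

def local_scores(adj, x, y):
    """Common Neighbors, Salton, Jaccard and Sorensen scores for the pair (x, y)."""
    common = len(adj[x] & adj[y])
    kx, ky = len(adj[x]), len(adj[y])
    return {
        "CN": common,
        "Salton": common / math.sqrt(kx * ky),
        "Jaccard": common / len(adj[x] | adj[y]),
        "Sorensen": 2 * common / (kx + ky),
    }

# Hypothetical toy graph; nodes 1 and 4 are not linked but share neighbors 2 and 3.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
adj = neighbors(edges)
scores = local_scores(adj, 1, 4)
print(scores)
```

All four indices share the same numerator and differ only in how they normalize by the degrees of the two endpoints.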

How To: Similarity Measures

Excerpt from L. Lü, T. Zhou / Physica A 390 (2011) 1150–1170, p. 1154:

(5) Hub Promoted Index (HPI) [43]. This index is proposed for quantifying the topological overlap of pairs of substrates in metabolic networks, and is defined as

$$s^{\mathrm{HPI}}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{\min\{k_x, k_y\}}. \tag{6}$$

Under this measurement, the links adjacent to hubs are likely to be assigned high scores since the denominator is determined by the lower degree only.

(6) Hub Depressed Index (HDI). Analogously to the above index, we also consider a measurement with the opposite effect on hubs, defined as

$$s^{\mathrm{HDI}}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{\max\{k_x, k_y\}}. \tag{7}$$

(7) Leicht–Holme–Newman Index (LHN1) [36]. This index assigns high similarity to node pairs that have many common neighbors compared not to the possible maximum, but to the expected number of such neighbors. It is defined as

$$s^{\mathrm{LHN1}}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{k_x \times k_y}, \tag{8}$$

where the denominator, $k_x \times k_y$, is proportional to the expected number of common neighbors of nodes $x$ and $y$ in the configuration model [44]. We use the abbreviation LHN1 to distinguish this index to another index (named as LHN2 index) also proposed by Leicht, Holme and Newman.

(8) Preferential Attachment Index (PA). The mechanism of preferential attachment can be used to generate evolving scale-free networks, where the probability that a new link is connected to the node $x$ is proportional to $k_x$ [45]. A similar mechanism can also lead to scale-free networks without growth [46], where at each time step, an old link is removed and a new link is generated. The probability that this new link will connect $x$ and $y$ is proportional to $k_x \times k_y$. Motivated by this mechanism, the corresponding similarity index can be defined as

$$s^{\mathrm{PA}}_{xy} = k_x \times k_y, \tag{9}$$

which has been widely used to quantify the functional significance of links subject to various network-based dynamics, such as percolation [47], synchronization [48] and transportation [49]. Note that, this index does not require the information of the neighborhood of each node, as a consequence, it has the least computational complexity.

(9) Adamic–Adar Index (AA) [50]. This index refines the simple counting of common neighbors by assigning the less-connected neighbors more weights, and is defined as

$$s^{\mathrm{AA}}_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k_z}. \tag{10}$$

(10) Resource Allocation Index (RA) [51]. This index is motivated by the resource allocation dynamics on complex networks [52]. Consider a pair of nodes, $x$ and $y$, which are not directly connected. The node $x$ can send some resource to $y$, with their common neighbors playing the role of transmitters. In the simplest case, we assume that each transmitter has a unit of resource, and will equally distribute it to all its neighbors. The similarity between $x$ and $y$ can be defined as the amount of resource $y$ received from $x$, which is

$$s^{\mathrm{RA}}_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z}. \tag{11}$$

Clearly, this measure is symmetric, namely $s_{xy} = s_{yx}$. Note that, although resulting from different motivations, the AA index and RA index have very similar form. Indeed, they both depress the contribution of the high-degree common neighbors. AA index takes the form $(\log k_z)^{-1}$ while RA index takes the form $k_z^{-1}$. The difference is insignificant when the degree, $k_z$, is small, while it is considerable when $k_z$ is large. In other words, RA index punishes the high-degree common neighbors more heavily than AA.

Liben-Nowell et al. [58] and Zhou et al. [51] systematically compared a number of local similarity indices on many real networks: the former [58] focuses on social collaboration networks and the latter [51] considers disparate networks including the protein–protein interaction network, electronic grid, Internet, US airport network, etc. According to extensive experimental results on real networks (see results in Table 1), the RA index performs best, while AA and CN indices have the second best overall performance among all the above-mentioned local indices.

The PA index has the worst overall performance, yet we are interested in it for it requires the least information. Note that, PA performs even worst than pure chance for the Internet at router level and the power grid. In these two networks, the nodes have well-defined positions and the links are physical lines. Actually, geography plays a significant role and links with very long geographical distances are rare. As local centers, the high-degree nodes have longer geographical distances to each other than average, and thus have a lower probability of directly connecting to each other, which leads to the bad


Jaccard Index

Resource Allocation

Many more similarity measures available
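Indices (5) to (10) can be sketched in the same style. The neighbor-set dictionary `adj` and the function name `degree_weighted_scores` are assumptions for illustration. Note that any common neighbor of two distinct nodes has degree at least 2, so the logarithm in the Adamic–Adar term is never zero.

```python
import math

def degree_weighted_scores(adj, x, y):
    """HPI, HDI, LHN1, PA, Adamic-Adar and Resource Allocation for the pair (x, y)."""
    common = adj[x] & adj[y]
    kx, ky = len(adj[x]), len(adj[y])
    return {
        "HPI": len(common) / min(kx, ky),
        "HDI": len(common) / max(kx, ky),
        "LHN1": len(common) / (kx * ky),
        "PA": kx * ky,
        # Each common neighbor z contributes less the higher its degree k_z.
        "AA": sum(1 / math.log(len(adj[z])) for z in common),
        "RA": sum(1 / len(adj[z]) for z in common),
    }

# Hypothetical neighbor sets; nodes 1 and 4 share the neighbors 2 and 3 (both of degree 3).
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3, 5}, 5: {4}}
scores = degree_weighted_scores(adj, 1, 4)
print(scores)
```

Comparing the `AA` and `RA` entries on graphs with large hubs shows the difference the excerpt describes: RA discounts high-degree common neighbors more strongly.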

Global Methods

- Given the adjacency matrix $A$, one can take the global structure of the graph into account when making predictions.
- These methods are often based on path-counting measures, such as the number of shortest paths between two nodes.
- The common-neighbors score is simply the corresponding entry of the squared adjacency matrix.

Common neighbors as a global method:

$$s^{\mathrm{CN}}_{xy} = (A^2)_{xy}$$

Try it for yourself to verify it!
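One way to try it is the small check below, which builds the adjacency matrix of a toy graph, squares it with a plain-Python matrix product, and compares every off-diagonal entry with the common-neighbor count. The graph and the helper `matmul` are assumptions made for this sketch.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

# Hypothetical toy graph.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
index = {v: i for i, v in enumerate(nodes)}

A = [[0] * len(nodes) for _ in nodes]
for x, y in edges:
    A[index[x]][index[y]] = A[index[y]][index[x]] = 1

A2 = matmul(A, A)
neigh = {v: {u for u in nodes if A[index[v]][index[u]]} for v in nodes}

# Every off-diagonal entry of A^2 equals the number of common neighbors.
for x in nodes:
    for y in nodes:
        if x != y:
            assert A2[index[x]][index[y]] == len(neigh[x] & neigh[y])
print(A2[index[1]][index[4]])
```

The diagonal entries of `A2` are the node degrees, which is why only off-diagonal entries are compared.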

Global Methods

What about paths of greater length? How can we include those? We can weight longer paths less:

$$s_{xy} = \exp(\alpha A)\big|_{xy} = \sum_{i=0}^{\infty} \frac{\alpha^i}{i!} A^i \Big|_{xy}$$

$$\exp(\alpha A) = I + \alpha A + \frac{\alpha^2 A^2}{2!} + \frac{\alpha^3 A^3}{3!} + \cdots$$

Global method: splitting the data further

$$E^T = E^{T'} \cup E^{P'}$$

Suppose we want to carry out the following mapping:

$$F(A^{T'}) \simeq A^{P'}$$

$$\min \| F(A^{T'}) - A^{P'} \|_{\mathrm{Frobenius}}$$
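The exponential kernel can be approximated by truncating its power series. The sketch below is a plain-Python illustration, not a production routine: the function name `exp_kernel`, the number of terms, the value of `alpha`, and the three-node path graph are all assumptions.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def exp_kernel(A, alpha, terms=20):
    """Truncated series sum_{i < terms} alpha^i A^i / i!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # i = 0: identity
    term = [row[:] for row in result]
    for i in range(1, terms):
        term = matmul(term, A)
        term = [[alpha * v / i for v in row] for row in term]  # now alpha^i A^i / i!
        for r in range(n):
            for c in range(n):
                result[r][c] += term[r][c]
    return result

# Hypothetical path graph 1 - 2 - 3; nodes 1 and 3 are joined only by paths of even length.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
alpha = 0.5
S = exp_kernel(A, alpha)
print(S[0][2])
```

For this particular graph the score for the pair (1, 3) has the closed form $(\cosh(\alpha\sqrt{2}) - 1)/2$, which gives a convenient check on the truncation.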

Link Prediction on Bipartite Graphs

Figure source: Wikipedia

Fig. 2. In this curve fitting plot of the Slovak Wikipedia, the hyperbolic sine is a good match, indicating that the hyperbolic sine pseudokernel performs well.

where coefficients are decreasing with path length. Keeping only the odd component, we arrive at the matrix hyperbolic sine [16].

$$\sinh(\alpha A) = \sum_{i=0}^{\infty} \frac{\alpha^{1+2i}}{(1+2i)!} A^{1+2i}$$

Figure 2 shows the hyperbolic sine applied to the (positive) spectrum of the bipartite Slovak Wikipedia user–article edit network.

The Odd von Neumann Pseudokernel. The von Neumann kernel for unipartite graphs is given by the following expression [13].

$$K_{\mathrm{NEU}}(A) = (I - \alpha A)^{-1} = \sum_{i=0}^{\infty} \alpha^i A^i$$

We call its odd component the odd von Neumann pseudokernel:

$$K^{\mathrm{odd}}_{\mathrm{NEU}}(A) = \alpha A (I - \alpha^2 A^2)^{-1} = \sum_{i=0}^{\infty} \alpha^{1+2i} A^{1+2i}$$

The hyperbolic sine and von Neumann pseudokernels are compared in Figure 3, based on the path weights they produce.

Consider an odd function for the graph kernel, for example

$$A^3$$

Or something more sophisticated:

$$\sinh(\alpha A) = \alpha A + \frac{(\alpha A)^3}{3!} + \frac{(\alpha A)^5}{5!} + \cdots$$

Global method: splitting the data further

$$E^T = E^{T'} \cup E^{P'}$$

Suppose we want to carry out the following mapping, where $A^{T'}$ and $A^{P'}$ are the adjacency matrices built from $E^{T'}$ and $E^{P'}$:

$$F(A^{T'}) \simeq A^{P'}$$

$$\min \| F(A^{T'}) - A^{P'} \|_{\mathrm{Frobenius}}$$

Eigenvalue decomposition:

$$A = U \Lambda U^{\mathrm{tr}}$$

Applying the decomposition to $A^{T'}$ gives $F(A^{T'}) = U F(\Lambda) U^{\mathrm{tr}}$, so the problem becomes

$$\min \| U F(\Lambda) U^{\mathrm{tr}} - A^{P'} \|_{\mathrm{Frobenius}}$$

Because $U$ is orthogonal and the Frobenius norm is invariant under orthogonal transformations, this is equivalent to

$$\min \| F(\Lambda) - U^{\mathrm{tr}} A^{P'} U \|_{\mathrm{Frobenius}}$$
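To see why an odd kernel suits bipartite graphs, the sketch below evaluates a truncated hyperbolic sine series on a tiny user-item graph. The function name `sinh_kernel`, the node labels, the value of `alpha`, and the graph itself are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def sinh_kernel(A, alpha, terms=15):
    """Truncated series sum_{i < terms} alpha^(2i+1) A^(2i+1) / (2i+1)!."""
    n = len(A)
    A2 = matmul(A, A)
    term = [[alpha * v for v in row] for row in A]  # i = 0 term: alpha * A
    result = [row[:] for row in term]
    for i in range(1, terms):
        # Move from the (2i-1)-th to the (2i+1)-th power and update the factorial.
        term = matmul(term, A2)
        factor = alpha ** 2 / ((2 * i) * (2 * i + 1))
        term = [[factor * v for v in row] for row in term]
        for r in range(n):
            for c in range(n):
                result[r][c] += term[r][c]
    return result

# Hypothetical bipartite graph. Node order: users u1, u2, then items a, b.
# Edges: u1-a, u1-b, u2-a. The candidate link is u2-b.
A = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
alpha = 0.5
S = sinh_kernel(A, alpha)
common = matmul(A, A)[1][3]  # common-neighbor count for (u2, b)
print(common, S[1][3], S[0][1])
```

In a bipartite graph a user and an item can only be joined by paths of odd length, so the common-neighbor count for (u2, b) is zero while the odd kernel assigns it a positive score, and user-user pairs such as (u1, u2) receive no score at all.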

Key Tip

Your choice of link prediction algorithm makes an implicit assumption about the graph kernel, that is, the mechanism by which the graph grows. Different graph types (social, academic, user-item) grow under different mechanisms. Therefore, different link prediction algorithms will work better on different graphs.

Challenges for Link Prediction

- Cold start problem
- Temporal networks
  - Links can be created and destroyed over time
  - New nodes come in and out of the system
- Link prediction with links of a different nature, e.g., 'negative' links
- Eliminating statistical bias in partitioning the data: k-fold cross-validation
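The last point, k-fold cross-validation over edges, can be sketched as follows. Each fold serves once as the probe set while the remaining edges form the training set. The generator name `k_fold_edge_splits` and the toy edge list are hypothetical.

```python
import random

def k_fold_edge_splits(edges, k=5, seed=0):
    """Yield (train, probe) pairs so that every edge is in the probe set exactly once."""
    rng = random.Random(seed)
    shuffled = sorted(tuple(sorted(e)) for e in edges)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k roughly equal slices
    for i in range(k):
        probe = set(folds[i])
        train = set(shuffled) - probe
        yield train, probe

# Hypothetical toy graph, used only for illustration.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
         (4, 5), (5, 6), (4, 6), (1, 4), (3, 5)]
splits = list(k_fold_edge_splits(edges, k=5, seed=1))
for train, probe in splits:
    print(len(train), len(probe))
```

Averaging an accuracy measure over the k splits reduces the dependence of the result on any single random partition.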

The Cold Start Problem

Propose links based on node characteristics.


The Cold Start Problem – new nodes

?

References

- The Link Prediction Problem for Social Networks; Liben-Nowell & Kleinberg
- Link prediction in complex networks: A survey; Lü and Zhou; Physica A; 2011
- Spectral evolution in dynamic networks; Kunegis et al.; Knowl. Inf. Syst.; 2013
- Online Dating Recommender Systems: The split-complex number approach; Kunegis et al.; ACM; 2012
- Mean Average Precision
  - https://en.wikipedia.org/wiki/Precision_and_recall
  - https://en.wikipedia.org/wiki/Information_retrieval
  - Victor Lavrenko: youtube.com/watch?v=pM6DJ0ZZee0