Fast pair-wise and node-wise algorithms for commute times and Katz scores
-
Upload
david-gleich -
Category
Education
-
view
1.426 -
download
0
description
Transcript of Fast pair-wise and node-wise algorithms for commute times and Katz scores
FAST MATRIX COMPUTATIONS FOR
PAIRWISE AND COLUMN-WISE KATZ
SCORES AND COMMUTE TIMES
David F. Gleich
Purdue University
Emory University
Math/CS Seminar
January 20, 2012
With Pooya Esfandiar, Francesco Bonchi, Chen Greif,
Laks V. S. Lakshmanan, and Byung-Won On
David F. Gleich (Purdue) Emory Math/CS Seminar 1 / 47
MAIN RESULTS
A β adjacency matrix
L β Laplacian matrix
Katz score :
β
Commute time:
β
For Commute
Compute one β fast
Estimate β
David F. Gleich (Purdue) Emory Math/CS Seminar
For Katz Compute one β β β fast Compute top β β β fast
2 of 47
OUTLINE
Why study these measures?
Katz Rank and Commute Time
How else do people compute them?
Quadrature rules for pairwise scores
Sparse linear systems solves for top-k
As many results as we have time forβ¦
David F. Gleich (Purdue) Emory Math/CS Seminar 3 of 47
WHY? LINK PREDICTION
David F. Gleich (Purdue) Emory Math/CS Seminar
Liben-Nowell and Kleinberg 2003, 2006 found that path based link prediction was more efficient
Neighborhood based
Path based
4 of 47
NOTE
All graphs are undirected
All graphs are connected
David F. Gleich (Purdue) Emory Math/CS Seminar 5 of 47
LEO KATZ
David F. Gleich (Purdue) Emory Math/CS Seminar 6 of 47
NOT QUITE, WIKIPEDIA
β β : adjacency, β β β β β β β β β β : random walk
PageRank β β β β β β β β β β β β β
Katz β β β β β β β β β β β β β
These are equivalent if β β has constant degree
David F. Gleich (Purdue) Emory Math/CS Seminar 7 of 47
WHAT KATZ ACTUALLY SAID
Leo Katz 1953, A New Status Index Derived from Sociometric Analysis, Psychometria 18(1):39-43
βwe assume that each link independently has the
same probability of being effectiveβ β¦
βwe conceive a constant β β , depending
on the group and the context of the particular
investigation, which has the force of a probability
of effectiveness of a single link. A k-step chain
then, has probability β β β of being effective.β
βWe wish to find the column sums of the matrixβ
David F. Gleich (Purdue) Emory Math/CS Seminar 8 of 47
A MODERN TAKE
The Katz score (node-based) is
β β β β β β β β β β
The Katz score (edge-based) is
β β β β β β β β β β
David F. Gleich (Purdue) Emory Math/CS Seminar 9 of 47
RETURNING TO THE MATRIX
β β β β β β β β β β β
β β β β β β β β β β β β β β β β β β β β β β β β β β β
Carl Neumann β β β β β β β β
β β β β β β β β β β β β β β β β β β David F. Gleich (Purdue) Emory Math/CS Seminar 10 of 47
Carl Neumann
Iβve heard the Neumann series called the βvon Neumannβ
series more than Iβd like! In fact, the von Neumann kernel
of a graph should be named the βNeumannβ kernel!
David F. Gleich (Purdue) Emory Math/CS Seminar
Wikipedia page
11 / 47
PROPERTIES OF KATZβS MATRIX
β is symmetric
β exists when β
β is spd when β
Note that β 1/max-degree suffices
David F. Gleich (Purdue) Emory Math/CS Seminar 12 of 47
COMMUTE TIME
David F. Gleich (Purdue) Emory Math/CS Seminar
Picture taken from Google images, seems to be
Bay Bridge Traffic by Jim M. Goldstein.
13 / 47
COMMUTE TIME
Consider a uniform random walk on a graph β β β β β β β β β β
β β β β β β β β β β β β β β β β β β β β β β β β β β β β
β β β β β β β β β β β β β β β β β β β β β β β β
David F. Gleich (Purdue) Emory Math/CS Seminar
Also called the hitting
time from node i to j, or
the first transition time
14 of 47
SKIPPING DETAILS
β : graph Laplacian
β
β is the only null-vector
David F. Gleich (Purdue) Emory Math/CS Seminar 15 of 47
WHAT DO OTHER PEOPLE DO?
1) Just work with the linear algebra formulations
2) For Katz, Truncate the Neumann series as a few (3-5) terms, or MMQ (Benzi & Boito)
3) Use low-rank approximations from EVD(A) or EVD(L)
4) For commute, use Johnson-Lindenstrauss inspired random sampling
5) Approximately decompose into smaller problems
David F. Gleich (Purdue) Emory Math/CS Seminar
Liben-Nowell and Kleinberg CIKM2003, Acar et al. ICDM2009,
Spielman and Srivastava STOC2008, Sarkar and Moore UAI2007, Wang et al. ICDM2007 16 of 47
THE PROBLEM
All of these techniques are
preprocessing based because
most peopleβs goal is to compute
all the scores.
We want to avoid
preprocessing the graph.
David F. Gleich (Purdue) Emory Math/CS Seminar
There are a few caveats here! i.e. one could solve the system instead of looking for the matrix inverse
17 of 47
WHY NO PREPROCESSING?
The graph is constantly changing
as I rate new movies.
David F. Gleich (Purdue) Emory Math/CS Seminar 18 of 47
WHY NO PREPROCESSING?
David F. Gleich (Purdue) Emory Math/CS Seminar
Top-k predicted βlinksβ
are movies to watch!
Pairwise scores give
user similarity
19 of 47
PAIR-WISE ALGORITHMS
David F. Gleich (Purdue) Emory Math/CS Seminar 20 / 47
PAIRWISE ALGORITHMS
Katz β
Commute β
David F. Gleich (Purdue) Emory Math/CS Seminar
Golub and Meurant
to the rescue!
21 of 47
MMQ - THE BIG IDEA
Quadratic form β β
Weighted sum β β
Stieltjes integral β β
Quadrature approximation β β
Matrix equation β David F. Gleich (Purdue) Emory Math/CS Seminar
Think β
A is s.p.d. use EVD
βA tautologyβ
Lanczos
22 of 47
LANCZOS
β , k-steps of the Lanczos method produce
β and β
David F. Gleich (Purdue) Emory Math/CS Seminar
β β β β β β β β β β =
β β β β β β β
23 of 47
PRACTICAL LANCZOS
Only need to store the last 2 vectors in β
Updating requires O(matvec) work
β is not orthogonal
David F. Gleich (Purdue) Emory Math/CS Seminar 24 of 47
MMQ PROCEDURE
Goal β
Given β
1. Run k-steps of Lanczos on β starting with β
2. Compute β , β with an additional eigenvalue at β ,
set β 3. Compute β , β with an additional eigenvalue at β , set
β 4. Output β as lower and upper bounds on β
David F. Gleich (Purdue) Emory Math/CS Seminar
Correspond to a Gauss-Radau rule, with
u as a prescribed node
Correspond to a Gauss-Radau rule, with
l as a prescribed node
25 of 47
PRACTICAL MMQ
Increase k to become more accurate
Bad eigenvalue bounds yield worse results
β and β are easy to compute
β not required, we can iteratively
update itβs LU factorization
David F. Gleich (Purdue) Emory Math/CS Seminar 26 of 47
PRACTICAL MMQ
David F. Gleich (Purdue) Emory Math/CS Seminar 27 of 47
ONE LAST STEP FOR KATZ
Katz β
β
β
β
David F. Gleich (Purdue) Emory Math/CS Seminar 28 of 47
COLUMN-WISE
ALGORITHMS
David F. Gleich (Purdue) Emory Math/CS Seminar 29 / 47
COLUMN-WISE COMMUTE-TIME
β
β requires entire diagonal of β
β
β
David F. Gleich (Purdue) Emory Math/CS Seminar
Paige and Saunders, 1975
β β β β β β β β β β
β β β β β β β
Each vector β is computed by the a Lanczos based CG algorithm.
30 of 47
β = β β
β
COLUMN-WISE COMMUTE TIME
David F. Gleich (Purdue) Emory Math/CS Seminar
β following Hein et al. 2010 is a MUCH better and faster approximation.
31
KATZ SCORES ARE LOCALIZED
David F. Gleich (Purdue) Emory Math/CS Seminar
Up to 50 neighbors is 99.65% of the total mass
32 of 47
PARTICIPATION RATIOS
David F. Gleich (Purdue) Emory Math/CS Seminar
Participation
Ratios
βeffective non-zerosβ in a vector
β
33 of 47
TOP-K ALGORITHM FOR KATZ
Approximate β
β
where β is sparse
Keep β sparse too
Ideally, donβt βtouchβ all of β
David F. Gleich (Purdue) Emory Math/CS Seminar 34 of 47
INSPIRATION - PAGERANK
Approximate β
β
where β is sparse
Keep β sparse too? YES!
Ideally, donβt βtouchβ all of β ? YES!
David F. Gleich (Purdue) Emory Math/CS Seminar
McSherry WWW2005, Berkhin 2007, Anderson et al. FOCS2008 β Thanks to Reid Anderson for telling me McSherry did this too.
35 of 47
THE ALGORITHM - MCSHERRY
For β
Start with the Richardson iteration
β
Rewrite
β
Richardson converges if β
David F. Gleich (Purdue) Emory Math/CS Seminar 36 of 47
THE ALGORITHM
Note β is sparse.
If β , then β is sparse.
Idea
only add one component of β to β
David F. Gleich (Purdue) Emory Math/CS Seminar 37 of 47
THE ALGORITHM
For β
Init: β
β
β
How to pick β ?
David F. Gleich (Purdue) Emory Math/CS Seminar 38 of 47
THE ALGORITHM FOR KATZ
For β
Init: β
β
β
Pick β as max β David F. Gleich (Purdue) Emory Math/CS Seminar
Storing the non-zeros of the residual in a heap makes picking the max log(n) time. See Anderson et al. FOCS2008 for more
39 of 47
CONVERGENCE?
If β 1/max-degree then β is sub-stochastic and the PageRank based proof applies because the matrix is diagonally dominant
For β , then for symmetric β , this algorithm is the Gauss-Southwell procedure and it still converges.
David F. Gleich (Purdue) Emory Math/CS Seminar
40 of 47
NONSYMMETRIC ALGORITHM?
Was this algorithm known in the non-symmetric case?
YES! Itβs just Gauss-Seidel/Gauss-Southwell, but with a column vs. row orientation.
Called dynamic residual to AMGers
David F. Gleich (Purdue) Emory Math/CS Seminar 41 of 47
RESULTS β DATA, PARAMETERS
All unweighted, connected graphs
Easy β β β β β β β β β β β β β β β β β β
Hard β β β β β β β β β β β β β β
David F. Gleich (Purdue) Emory Math/CS Seminar 42 of 47
KATZ BOUND CONVERGENCE
David F. Gleich (Purdue) Emory Math/CS Seminar 43 of 47
COMMUTE BOUND CONVERG.
David F. Gleich (Purdue) Emory Math/CS Seminar 44 of 47
KATZ SET CONVERGENCE
David F. Gleich (Purdue) Emory Math/CS Seminar
For arXiv graph.
45 of 47
TIMING
David F. Gleich (Purdue) Emory Math/CS Seminar 46 of 47
CONCLUSIONS
These algorithms are faster than many alternatives.
For pairwise commute, stopping criteria are simpler with bounds
For top-k problems, we often need less than 1 matvec for good enough results
David F. Gleich (Purdue) Emory Math/CS Seminar 47 of 47
By AngryDogDesign on DeviantArt
Paper at WAW2010 and J. Internet Mathematics
Slides online
Code is online already
www.cs.purdue.edu/
homes/dgleich/codes/fast-katz-2011
David F. Gleich (Purdue) Emory Math/CS Seminar 48