Transcript of “Big Learning with Graph Computation” by Joseph Gonzalez
Big Learning with Graph Computation
Joseph Gonzalez (jegonzal@cs.cmu.edu)
Download the talk: http://tinyurl.com/7tojdmw
http://www.cs.cmu.edu/~jegonzal/talks/biglearning_with_graphs.pptx
Big Data Already Happened
• 48 hours of video uploaded every minute
• 750 million Facebook users
• 6 billion Flickr photos
• 1 billion tweets per week
How do we understand and use Big Data?
Big Learning
Big Learning Today: Regression
Philosophy of Big Data and Simple Models
• Pros:
  – Easy to understand/predictable
  – Easy to train in parallel
  – Supports feature engineering
  – Versatile: classification, ranking, density estimation
“Invariably, simple models and a lot of data trump more elaborate models based on less data.”
Alon Halevy, Peter Norvig, and Fernando Pereira, Google
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/35179.pdf
Why not build elaborate models with lots of data?
• Difficult
• Computationally intensive
Big Learning Today: Simple Models
• Pros:
  – Easy to understand/predictable
  – Easy to train in parallel
  – Supports feature engineering
  – Versatile: classification, ranking, density estimation
• Cons:
  – Favors bias in the presence of Big Data
  – Strong independence assumptions
[Figure: social network connecting Shopper 1 (interested in cameras) and Shopper 2 (interested in cooking)]
Big Data exposes the opportunity for structured machine learning.
Examples
Label Propagation
• Social Arithmetic (predict my interests from my profile and my friends’ interests):
  I Like: 50% × (what I list on my profile) + 40% × (Sue Ann’s likes) + 10% × (Carlos’s likes)
  – My profile: 50% cameras, 50% biking
  – Sue Ann: 80% cameras, 20% biking
  – Carlos: 30% cameras, 70% biking
  Result: I like 60% cameras, 40% biking
• Recurrence Algorithm:
  Likes[i] = Σ_{j ∈ Friends[i]} W_ij × Likes[j]
  – iterate until convergence
• Parallelism:
  – Compute all Likes[i] in parallel
http://www.cs.cmu.edu/~zhuxj/pub/CMU-CALD-02-107.pdf
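The social-arithmetic step above is just a weighted average of neighboring label distributions. A minimal sketch in Python (the function name and dictionary representation are illustrative, not from the talk):

```python
def label_prop_update(weights, dists):
    """One Likes[i] update: the weighted average of the label
    distributions of vertex i's neighbors (including its own profile)."""
    out = {}
    for w, dist in zip(weights, dists):
        for label, p in dist.items():
            out[label] = out.get(label, 0.0) + w * p
    return out


# The slide's example: my profile (weight 0.5), Sue Ann (0.4), Carlos (0.1).
me = {"cameras": 0.5, "biking": 0.5}
sue_ann = {"cameras": 0.8, "biking": 0.2}
carlos = {"cameras": 0.3, "biking": 0.7}
result = label_prop_update([0.5, 0.4, 0.1], [me, sue_ann, carlos])
# result: 60% cameras, 40% biking, matching the slide's arithmetic
```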
PageRank (Centrality Measures)
• Iterate:
  R[i] = α + (1 − α) Σ_{j ∈ InLinks(i)} R[j] / L[j]
• Where:
  – α is the random reset probability
  – L[j] is the number of links on page j
[Figure: example web graph with six pages]
http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf
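The iteration can be sketched in a few lines (illustrative only; the systems-level implementations appear later in the talk). This uses the slide's unnormalized form R[i] = α + (1 − α) Σ R[j]/L[j], the same recurrence as the Giraph code shown later:

```python
def pagerank(out_links, alpha=0.15, iters=100):
    """Jacobi iteration of R[i] = alpha + (1 - alpha) * sum_j R[j] / L[j],
    where j ranges over pages linking to i and L[j] is j's out-degree.
    out_links: {page: [pages it links to]} (every page must have links)."""
    rank = {p: 1.0 for p in out_links}
    for _ in range(iters):
        nxt = {p: alpha for p in out_links}
        for j, links in out_links.items():
            share = rank[j] / len(links)   # R[j] / L[j]
            for i in links:
                nxt[i] += (1 - alpha) * share
        rank = nxt
    return rank
```

On a two-page cycle the fixed point is R = α + (1 − α)R, i.e. R = 1 for both pages.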
Matrix Factorization: Alternating Least Squares (ALS)
[Figure: the Netflix Users × Movies ratings matrix (entries r11, r12, r23, r24) ≈ User Factors (U) × Movie Factors (M); a bipartite graph connects users u1, u2 to movies m1, m2, m3 through the observed ratings]
Update function computes, for user i with rated movies j:
  u_i = argmin_u Σ_{j ∈ N(i)} (r_ij − uᵀ m_j)²
(and symmetrically for each movie factor m_j)
http://dl.acm.org/citation.cfm?id=1424269
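For rank-1 factors the ALS update has a one-line closed form, which makes the alternating structure easy to see (a deliberately simplified sketch, not the talk's implementation; real ALS solves a small k×k linear system per vertex):

```python
def als_user_update(ratings, movie_factor):
    """Rank-1 ALS update: the closed-form least-squares solution for one
    user's factor u_i, holding the movie factors m_j fixed.
    ratings: {movie_id: r_ij};  movie_factor: {movie_id: m_j}.
    Minimizes sum_j (r_ij - u * m_j)^2, giving u = (sum r*m) / (sum m^2)."""
    num = sum(r * movie_factor[m] for m, r in ratings.items())
    den = sum(movie_factor[m] ** 2 for m in ratings)
    return num / den
```

The symmetric movie update swaps the roles of users and movies; alternating the two until the ratings residual stops improving is the whole algorithm.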
Other Examples
• Statistical Inference in Relational Models
  – Belief Propagation
  – Gibbs Sampling
• Network Analysis
  – Centrality Measures
  – Triangle Counting
• Natural Language Processing
  – CoEM
  – Topic Modeling
Graph-Parallel Algorithms
• Dependency graph: my interests depend on my friends’ interests
• Iterative computation via local updates
What is the right tool for Graph-Parallel ML?
• Data-Parallel (Map-Reduce): cross validation, feature extraction, computing sufficient statistics
• Graph-Parallel (Map-Reduce?): belief propagation, label propagation, kernel methods, deep belief networks, neural networks, tensor factorization, PageRank, Lasso
Why not use Map-Reduce for Graph-Parallel algorithms?
Data Dependencies are Difficult
• Difficult to express dependent data in MR (Map-Reduce assumes independent data records)
  – Substantial data transformations
  – User-managed graph structure
  – Costly data replication
Iterative Computation is Difficult
• The system is not optimized for iteration:
[Figure: each Map-Reduce iteration reloads and re-partitions the data across CPUs, paying a startup penalty and a disk penalty at every step]
Map-Reduce for Data-Parallel ML
• Excellent for large data-parallel tasks!
• Data-Parallel (Map-Reduce): cross validation, feature extraction, computing sufficient statistics
• Graph-Parallel (Map-Reduce? MPI/Pthreads?): belief propagation, SVM, kernel methods, deep belief networks, neural networks, tensor factorization, PageRank, Lasso
We could use…
Threads, Locks, & Messages: “low-level parallel primitives”
Threads, Locks, and Messages
• ML experts (typically graduate students) repeatedly solve the same parallel design challenges:
  – Implement and debug a complex parallel system
  – Tune for a specific parallel platform
  – Six months later the conference paper contains: “We implemented ______ in parallel.”
• The resulting code:
  – is difficult to maintain
  – is difficult to extend
  – couples the learning model to the parallel implementation
Addressing Graph-Parallel ML
• We need alternatives to Map-Reduce
• Data-Parallel (Map-Reduce): cross validation, feature extraction, computing sufficient statistics
• Graph-Parallel (MPI/Pthreads? Pregel (BSP)?): belief propagation, SVM, kernel methods, deep belief networks, neural networks, tensor factorization, PageRank, Lasso
Pregel: Bulk Synchronous Parallel
[Figure: supersteps alternate a compute phase and a communicate phase, separated by a global barrier]
http://dl.acm.org/citation.cfm?id=1807184
Open Source Implementations
• Giraph: http://incubator.apache.org/giraph/
• Golden Orb: http://goldenorbos.org/
An asynchronous variant:
• GraphLab: http://graphlab.org/
PageRank in Giraph (Pregel)
http://incubator.apache.org/giraph/

public void compute(Iterator<DoubleWritable> msgIterator) {
    // Sum PageRank over incoming messages
    double sum = 0;
    while (msgIterator.hasNext())
        sum += msgIterator.next().get();
    DoubleWritable vertexValue = new DoubleWritable(0.15 + 0.85 * sum);
    setVertexValue(vertexValue);
    if (getSuperstep() < getConf().getInt(MAX_STEPS, -1)) {
        long edges = getOutEdgeMap().size();
        sendMsgToAllEdges(new DoubleWritable(getVertexValue().get() / edges));
    } else {
        voteToHalt();
    }
}
Tradeoffs of the BSP Model
• Pros:
  – Graph-parallel
  – Relatively easy to implement and reason about
  – Deterministic execution
Embarrassingly Parallel Phases
[Figure: within a superstep, the compute and communicate phases are each embarrassingly parallel; a barrier separates supersteps]
http://dl.acm.org/citation.cfm?id=1807184
Tradeoffs of the BSP Model
• Pros:
  – Graph-parallel
  – Relatively easy to build
  – Deterministic execution
• Cons:
  – Doesn’t exploit the graph structure
  – Can lead to inefficient systems
Curse of the Slow Job
[Figure: at every iteration, all CPUs must wait at the barrier for the slowest job before the next iteration can begin]
http://www.www2011india.com/proceeding/proceedings/p607.pdf
Tradeoffs of the BSP Model
• Pros:
  – Graph-parallel
  – Relatively easy to build
  – Deterministic execution
• Cons:
  – Doesn’t exploit the graph structure
  – Can lead to inefficient systems
  – Can lead to inefficient computation
Example: Loopy Belief Propagation (Loopy BP)
• Iteratively estimate the “beliefs” about vertices
  – Read in messages
  – Update marginal estimate (belief)
  – Send updated out messages
• Repeat for all variables until convergence
http://www.merl.com/papers/docs/TR2001-22.pdf
Bulk Synchronous Loopy BP
• Often considered embarrassingly parallel
  – Associate a processor with each vertex
  – Receive all messages
  – Update all beliefs
  – Send all messages
• Proposed by:
  – Brunton et al. CRV’06
  – Mendiburu et al. GECC’07
  – Kang et al. LDMTA’10
  – …
Sequential Computational Structure
Hidden Sequential Structure
• Running Time = (time for a single parallel iteration) × (number of iterations)
[Figure: evidence enters at both ends of a chain; information must propagate sequentially across the chain]
Optimal Sequential Algorithm
• Forward-Backward (p = 1): running time 2n
• Bulk Synchronous (p ≤ 2n): running time 2n²/p
• Optimal Parallel (p = 2): running time n
[Figure: the gap between the bulk synchronous and optimal running times]
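The gap on this slide can be made explicit. For a chain of n vertices with evidence at both ends (a sketch of the argument, consistent with the slide's numbers):

```latex
\begin{align*}
T_{\text{seq}} &= 2n
  && \text{(forward-backward: one sweep each way)} \\
T_{\text{BSP}}(p) &\approx n \cdot \frac{2n}{p} = \frac{2n^2}{p}
  && \text{(each superstep costs } 2n/p\text{; information needs } n \text{ supersteps to cross)} \\
T_{\text{BSP}}(p) &\le T_{\text{seq}} \iff p \ge n
\end{align*}
```

So bulk synchronous execution needs on the order of n processors just to match a single-processor forward-backward sweep, while an optimal parallel schedule reaches running time n with only p = 2.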
The Splash Operation
• Generalize the optimal chain algorithm to arbitrary cyclic graphs:
  1) Grow a BFS spanning tree with fixed size
  2) Forward pass computing all messages at each vertex
  3) Backward pass computing all messages at each vertex
http://www.select.cs.cmu.edu/publications/paperdir/aistats2009-gonzalez-low-guestrin.pdf
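Step 1 above can be sketched directly (an illustrative fragment with hypothetical names, not the paper's implementation):

```python
from collections import deque

def grow_splash(graph, root, max_size):
    """Step 1 of the Splash operation: grow a breadth-first spanning tree
    of bounded size around `root`. Steps 2 and 3 then sweep messages
    forward and backward over this tree, mimicking the optimal chain
    schedule locally.  graph: {vertex: [neighbors]}."""
    visited = {root}
    order, queue = [], deque([root])
    while queue and len(order) < max_size:
        v = queue.popleft()
        order.append(v)
        for u in graph[v]:
            if u not in visited:
                visited.add(u)
                queue.append(u)
    return order  # BFS order; the backward pass runs over reversed(order)
```

On the chain 1–2–3–4–5, a splash of size 3 rooted at vertex 3 covers [3, 2, 4].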
BSP is Provably Inefficient
• Limitations of the bulk synchronous model can lead to provably inefficient parallel algorithms
[Figure: runtime in seconds vs. number of CPUs (1 to 8); Bulk Synchronous (Pregel) stays far above Asynchronous Splash BP, leaving a large gap]
Tradeoffs of the BSP Model
• Pros:
  – Graph-parallel
  – Relatively easy to build
  – Deterministic execution
• Cons:
  – Doesn’t exploit the graph structure
  – Can lead to inefficient systems
  – Can lead to inefficient computation
  – Can lead to invalid computation
The Problem with Bulk Synchronous Gibbs Sampling
• Adjacent variables cannot be sampled simultaneously.
[Figure: two variables with strong positive correlation at t = 0; sequential execution preserves the strong positive correlation, while parallel (simultaneous) sampling of the adjacent variables produces a strong negative correlation at t = 1, 2, 3]
The Need for a New Abstraction
• If not Pregel, then what?
• Data-Parallel (Map-Reduce): cross validation, feature extraction, computing sufficient statistics
• Graph-Parallel (Pregel (Giraph)?): belief propagation, SVM, kernel methods, deep belief networks, neural networks, tensor factorization, PageRank, Lasso
GraphLab Addresses the Limitations of the BSP Model
• Use graph structure
  – Automatically manage the movement of data
• Focus on asynchrony
  – Computation runs as resources become available
  – Use the most recent information
• Support adaptive/intelligent scheduling
  – Focus computation where it is needed
• Preserve serializability
  – Provide the illusion of a sequential execution
  – Eliminate “race conditions”
The GraphLab Framework
• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model
Data Graph
A graph with arbitrary data (C++ objects) associated with each vertex and edge.
• Graph: social network
• Vertex data: user profile text, current interest estimates
• Edge data: similarity weights
Implementing the Data Graph
• All data and structure is stored in memory
  – Supports the fast random lookup needed for dynamic computation
• Multicore setting:
  – Challenge: fast lookup, low overhead
  – Solution: dense data structures
• Distributed setting:
  – Challenge: graph partitioning
  – Solutions: ParMETIS and random placement
New Perspective on Partitioning
• Natural graphs have poor edge separators
  – Classic graph partitioning tools (e.g., ParMETIS, Zoltan) fail
  – Cutting edges: CPU 1 and CPU 2 must synchronize many edges
• Natural graphs have good vertex separators
  – Splitting on a vertex: CPU 1 and CPU 2 must synchronize only a single vertex
![Page 50: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/50.jpg)
pagerank(i, scope){ // Get Neighborhood data (R[i], Wij, R[j]) scope;
// Update the vertex data
// Reschedule Neighbors if needed if R[i] changes then reschedule_neighbors_of(i); }
;][)1(][][
iNj
ji jRWiR
Update FunctionsAn update function is a user defined program which when applied to a vertex transforms the data in the scope of the vertex
![Page 51: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/51.jpg)
PageRank in GraphLab (V2)

struct pagerank : public iupdate_functor<graph, pagerank> {
    void operator()(icontext_type& context) {
        double sum = 0;
        foreach(edge_type edge, context.in_edges())
            sum += 1.0 / context.num_out_edges(edge.source())
                 * context.vertex_data(edge.source());
        double& rank = context.vertex_data();
        double old_rank = rank;
        rank = RESET_PROB + (1 - RESET_PROB) * sum;
        double residual = std::abs(rank - old_rank);
        if (residual > EPSILON)
            context.reschedule_out_neighbors(pagerank());
    }
};
![Page 52: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/52.jpg)
Converged Slowly ConvergingFocus Effort
Dynamic Computation
54
The Scheduler
• The scheduler determines the order in which vertices are updated.
[Figure: CPU 1 and CPU 2 pull vertices (e.g., a, b, e, f, h, i, j, c) from a shared scheduler queue over a graph a–k]
• The process repeats until the scheduler is empty.
Choosing a Schedule
GraphLab provides several different schedulers:
• Round Robin: vertices are updated in a fixed order
• FIFO: vertices are updated in the order they are added
• Priority: vertices are updated in priority order
The choice of schedule affects the correctness and parallel performance of the algorithm.
Obtain different algorithms by simply changing a flag!
  --scheduler=sweep   --scheduler=fifo   --scheduler=priority
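The scheduler loop described on these slides can be written compactly (a toy FIFO engine with illustrative names, not the GraphLab API):

```python
from collections import deque

def run_fifo(update, initial):
    """Toy FIFO scheduler: repeatedly pop a vertex, apply the user's
    update function to it, and enqueue any vertices the update asks to
    reschedule. Stops when the scheduler is empty."""
    queue = deque(initial)
    queued = set(initial)
    while queue:
        v = queue.popleft()
        queued.discard(v)
        for u in update(v):          # update returns vertices to reschedule
            if u not in queued:
                queued.add(u)
                queue.append(u)


# Usage: dynamic PageRank on a two-page cycle. Each update recomputes a
# vertex's rank and reschedules its neighbor only if the rank changed.
graph = {1: [2], 2: [1]}
rank = {1: 0.0, 2: 0.0}

def pagerank_update(v):
    new = 0.15 + 0.85 * rank[graph[v][0]]
    changed = abs(new - rank[v]) > 1e-9
    rank[v] = new
    return graph[v] if changed else []

run_fifo(pagerank_update, [1, 2])
# rank converges to 1.0 for both vertices, then the queue drains
```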
The GraphLab Framework
• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model
Ensuring Race-Free Execution
How much can computation overlap?
GraphLab Ensures Sequential Consistency
For each parallel execution, there exists a sequential execution of update functions which produces the same result.
[Figure: a parallel execution on CPU 1 and CPU 2 is equivalent to some sequential execution on a single CPU]
Consistency Rules
• Guaranteed sequential consistency for all update functions
[Figure: the scopes of concurrent updates must not conflict on shared data]
Full Consistency
Obtaining More Parallelism
Edge Consistency
[Figure: CPU 1 and CPU 2 can safely read shared data under the edge consistency model]
Consistency Through Scheduling
• Edge Consistency Model: two vertices can be updated simultaneously if they do not share an edge.
• Graph Coloring: two vertices can be assigned the same color if they do not share an edge.
[Figure: color classes executed synchronously in parallel as phases (Phase 1, Phase 2, Phase 3) separated by barriers]
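The coloring step above can be sketched with a greedy heuristic (illustrative only; the talk does not show GraphLab's coloring machinery):

```python
def greedy_color(graph):
    """Greedy graph coloring: each vertex takes the smallest color not
    used by an already-colored neighbor. Vertices of the same color share
    no edge, so each color class can be updated in parallel under the
    edge consistency model.  graph: {vertex: [neighbors]}."""
    color = {}
    for v in sorted(graph):
        used = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color
```

On the chain 1–2–3 this yields two color classes, {1, 3} and {2}, exactly the two parallel phases a chain needs.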
Consistency Through R/W Locks
• Read/write locks:
  – Full Consistency: write locks on the center vertex and all of its neighbors
  – Edge Consistency: a write lock on the center vertex, read locks on its neighbors
• Canonical lock ordering prevents deadlock
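Canonical lock ordering is the key detail: if every update acquires its scope's locks in a single global order, two overlapping updates can never deadlock. A sketch (plain mutexes stand in for the reader/writer locks on the slide; all names are illustrative):

```python
import threading

class ScopeLocks:
    """Acquire the locks for a vertex and its neighbors in canonical
    (sorted) order, so two overlapping scope acquisitions can never
    deadlock; release in the reverse order."""
    def __init__(self, vertices):
        self.locks = {v: threading.Lock() for v in vertices}

    def acquire(self, scope):
        for v in sorted(scope):
            self.locks[v].acquire()

    def release(self, scope):
        for v in sorted(scope, reverse=True):
            self.locks[v].release()


# Usage: two threads with overlapping scopes {1, 2} and {2, 3} both
# mutate the shared vertex 2; the locks serialize the conflicting updates.
locks = ScopeLocks([1, 2, 3])
data = {2: 0}

def worker(scope):
    for _ in range(1000):
        locks.acquire(scope)
        data[2] += 1
        locks.release(scope)

t1 = threading.Thread(target=worker, args=({1, 2},))
t2 = threading.Thread(target=worker, args=({2, 3},))
t1.start(); t2.start(); t1.join(); t2.join()
```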
The GraphLab Framework
• Graph-Based Data Representation
• Update Functions (User Computation)
• Scheduler
• Consistency Model
Algorithms implemented in GraphLab:
Belief Propagation, Splash Sampler, Dynamic Block Gibbs Sampling, PageRank, CoEM, Matrix Factorization, Alternating Least Squares, SVD, Bayesian Tensor Factorization, Lasso, SVM, Linear Solvers, K-Means, LDA, …many others…
Startups Using GraphLab
• Companies experimenting with (or downloading) GraphLab
• Academic projects exploring (or downloading) GraphLab
GraphLab vs. Pregel (BSP)
PageRank (25M vertices, 355M edges)
[Figure: histogram of the number of updates per vertex (vertex counts on a log scale); 51% of vertices were updated only once]
CoEM (Rosie Jones, 2005): Named Entity Recognition Task

Is "Dog" an animal? Is "Catalina" a place?

Noun phrases: the dog, Australia, Catalina Island
Contexts: <X> ran quickly, travelled to <X>, <X> is pleasant

Vertices: 2 Million, Edges: 200 Million

Hadoop: 95 Cores, 7.5 hrs
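The bipartite noun-phrase/context structure lends itself to a simple label-propagation sketch. The following toy Python (not the original CoEM implementation; the seed labels and belief values are illustrative) alternately averages beliefs across the bipartite graph:

```python
def coem_step(beliefs, nbrs):
    """One round of CoEM-style propagation: each vertex's belief that it
    denotes (say) a 'place' becomes the mean of its neighbors' beliefs."""
    return {v: sum(beliefs[u] for u in ns) / len(ns) for v, ns in nbrs.items()}

# Toy bipartite graph: noun phrases <-> the contexts they occur in.
edges = {
    "Australia": ["travelled to <X>", "<X> is pleasant"],
    "Catalina Island": ["travelled to <X>"],
    "the dog": ["<X> ran quickly"],
    "travelled to <X>": ["Australia", "Catalina Island"],
    "<X> is pleasant": ["Australia"],
    "<X> ran quickly": ["the dog"],
}
seeds = {"Australia": 1.0, "the dog": 0.0}  # known place / known non-place
b = {v: 0.5 for v in edges}
b.update(seeds)
for _ in range(20):
    b = coem_step(b, edges)
    b.update(seeds)  # labeled seeds stay clamped each round
# "Catalina Island" is pulled toward 1.0 via the shared "travelled to <X>" context
```

The real task runs this kind of propagation over 2 million vertices and 200 million edges, which is where the graph-parallel runtimes below come in.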
![Page 69: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/69.jpg)
CoEM (Rosie Jones, 2005)

[Plot: Speedup vs. Number of CPUs (1–16); GraphLab CoEM approaches the Optimal linear speedup]

Hadoop: 95 Cores, 7.5 hrs
GraphLab: 16 Cores, 30 min

15x Faster! 6x fewer CPUs!
![Page 70: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/70.jpg)
CoEM (Rosie Jones, 2005)

[Plot: Speedup vs. Number of CPUs (1–16); both the Small and Large problems approach the Optimal linear speedup]

Hadoop: 95 Cores, 7.5 hrs
GraphLab: 16 Cores, 30 min
GraphLab in the Cloud: 32 EC2 machines, 80 secs (0.3% of the Hadoop time)
![Page 71: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/71.jpg)
The Cost of the Wrong Abstraction (Log-Scale!)
![Page 72: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/72.jpg)
Tradeoffs of GraphLab
• Pros:
– Separates the algorithm from the movement of data
– Permits dynamic asynchronous scheduling
– More expressive consistency model
– Faster and more efficient runtime performance
• Cons:
– Non-deterministic execution
– Defining the "residual" can be tricky
– Substantially more complicated to implement
![Page 73: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/73.jpg)
Scalability and Fault-Tolerance in Graph Computation
![Page 74: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/74.jpg)
Scalability
• MapReduce: Data Parallel
– Map-heavy jobs scale VERY WELL
– Reduce places some pressure on the network
– Typically disk/computation bound
– Favors: Horizontal Scaling (i.e., big clusters)
• Pregel/GraphLab: Graph Parallel
– Iterative communication can be network intensive
– Network latency/throughput become the bottleneck
– Favors: Vertical Scaling (i.e., faster networks and stronger machines)
![Page 75: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/75.jpg)
Cost-Time Tradeoff
(video co-segmentation results)

More machines: higher cost, faster runtime
A few machines helps a lot
Diminishing returns
![Page 76: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/76.jpg)
Video Cosegmentation
Segments mean the same
Model: 10.5 million nodes, 31 million edges
Gaussian EM clustering + BP on 3D grid
Video version of [Batra]
![Page 77: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/77.jpg)
Video Coseg. Strong Scaling
[Plot: GraphLab closely tracks the Ideal strong-scaling curve]
![Page 78: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/78.jpg)
Video Coseg. Weak Scaling
[Plot: GraphLab closely tracks the Ideal weak-scaling curve]
![Page 79: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/79.jpg)
Fault Tolerance
![Page 80: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/80.jpg)
Rely on Checkpoints

Pregel (BSP): Compute → Communicate → Barrier → Checkpoint (Synchronous Checkpoint Construction)

GraphLab: Asynchronous Checkpoint Construction
![Page 81: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/81.jpg)
Checkpoint Interval

• Tradeoff:
– Short Ti: checkpoints become too costly
– Long Ti: failures become too costly

[Timeline diagram: a checkpoint of length Ts is taken every interval Ti; after a machine failure, all work since the last checkpoint must be re-computed]
![Page 82: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/82.jpg)
Optimal Checkpoint Intervals
• Construct a first-order approximation:

Ti ≈ √(2 · Tc · Tmtbf)

where Ti is the checkpoint interval, Tc the length of a checkpoint, and Tmtbf the mean time between failures.

• Example:
– 64 machines with a per-machine MTBF of 1 year
• Tmtbf = 1 year / 64 ≈ 130 hours
– Tc = 4 minutes
– Ti ≈ 4 hours

From: http://dl.acm.org/citation.cfm?id=361115
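Plugging the slide's numbers into this first-order approximation (from the cited CACM paper) can be checked in a few lines of Python; variable names here are just for illustration:

```python
import math

def optimal_checkpoint_interval(t_checkpoint, t_mtbf):
    """First-order approximation to the optimum checkpoint interval:
    Ti = sqrt(2 * Tc * Tmtbf). All times in the same unit (hours here)."""
    return math.sqrt(2 * t_checkpoint * t_mtbf)

t_mtbf_hours = 365 * 24 / 64   # 64 machines, per-machine MTBF of 1 year -> ~137 hrs
t_checkpoint_hours = 4 / 60    # a 4-minute checkpoint
ti = optimal_checkpoint_interval(t_checkpoint_hours, t_mtbf_hours)  # ~4.3 hours
```

The result lands at roughly 4.3 hours, matching the slide's "Ti ≈ 4 hours".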
![Page 83: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/83.jpg)
Open Challenges
![Page 84: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/84.jpg)
Dynamically Changing Graphs
• Example: Social Networks
– New users → New Vertices
– New friends → New Edges
• How do you adaptively maintain computation?
– Trigger computation with changes in the graph
– Update "interest estimates" only where needed
– Exploit asynchrony
– Preserve consistency
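One way to picture "trigger computation with changes in the graph" is a scheduler that enqueues only the vertices a mutation touches. A hypothetical sketch (the class and method names are illustrative, not a real API):

```python
import heapq

class AdaptiveScheduler:
    """Hypothetical sketch: a priority queue of 'stale' vertices,
    fed only by the parts of the graph that actually changed."""

    def __init__(self):
        self.pq = []
        self.queued = set()

    def schedule(self, v, priority=1.0):
        # heapq is a min-heap, so negate to pop high priorities first.
        if v not in self.queued:
            heapq.heappush(self.pq, (-priority, v))
            self.queued.add(v)

    def add_edge(self, graph, u, v):
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
        # Trigger recomputation only where the graph changed.
        self.schedule(u)
        self.schedule(v)

    def pop(self):
        _, v = heapq.heappop(self.pq)
        self.queued.discard(v)
        return v

g = {}
sched = AdaptiveScheduler()
sched.add_edge(g, "alice", "bob")  # a new friendship schedules exactly two vertices
```

The open questions above are about doing this at scale: propagating the recomputation asynchronously while still preserving consistency.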
![Page 85: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/85.jpg)
Graph Partitioning
• How can you quickly place a large data-graph in a distributed environment?
• Edge separators fail on large power-law graphs
– Social networks, recommender systems, NLP
• Constructing vertex separators at scale:
– No large-scale tools!
– How can you adapt the placement in changing graphs?
![Page 86: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/86.jpg)
Graph Simplification for Computation
• Can you construct a “sub-graph” that can be used as a proxy for graph computation?
• See Paper:– Filtering: a method for solving graph problems in
MapReduce.• http://research.google.com/pubs/pub37240.html
![Page 87: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/87.jpg)
Concluding BIG Ideas
• Modeling Trend: Independent Data → Dependent Data
– Extract more signal from noisy structured data
• Graphs model data dependencies
– Capture locality and communication patterns
• Data-Parallel tools are not well suited to Graph-Parallel problems
• Compared several Graph-Parallel tools:
– Pregel / BSP Models:
• Easy to build, deterministic
• Suffer from several key inefficiencies
– GraphLab:
• Fast, efficient, and expressive
• Introduces non-determinism
• Scaling and Fault Tolerance:
– Network bottlenecks and optimal checkpoint intervals
• Open Challenges: Enormous Industrial Interest
![Page 88: Big Learning with Graph Computation Joseph Gonzalez jegonzal@cs.cmu.edu Download the talk: //tinyurl.com/7tojdmw jegonzal/talks/biglearning_with_graphs.pptx.](https://reader035.fdocuments.net/reader035/viewer/2022062221/56649db95503460f94aa93c4/html5/thumbnails/88.jpg)