Is Your Graph Algorithm Eligible for Nondeterministic Execution?
Zhiyuan Shao, Lin Hou, Yan Ai, Yu Zhang and Hai Jin
Services Computing Technology and System Lab
Cluster and Grid Computing Lab
Huazhong University of Science and Technology
ICPP’15
Outline
• Motivation
• System model
• Algorithm Convergence
• Evaluation
• Conclusion
Motivation
• “Big data” era
  • Loosely coupled data
    • Key-value pairs
    • Hadoop, Spark, many others
  • Tightly coupled data
    • Graph data
    • Pregel, GraphLab, GraphChi, X-Stream, many others
• Graph computing
  • Execution model
    • Synchronous model (BSP)
    • Asynchronous model
  • Execution manner
    • Deterministic executions
    • Nondeterministic executions
Motivation (Cont’d)
• Deterministic execution
  • Widely and extensively studied
    • Architecture, OS, scheduling
    • Set/Chromatic scheduler (GraphLab), DIG (Galois), external deterministic (GraphChi)
  • Pros
    • A deterministic execution path (always) leads to deterministic results
  • Cons
    • High overhead introduced to order the tasks (consider a billion-node graph!)
• Nondeterministic execution
  • Poorly studied
  • Pros
    • High parallelism, high performance!
  • Cons
    • Need to prevent (at least) data races
    • Undocumented
Motivation (Cont’d)
• Example of two execution manners
Problem: High overhead for defining the execution sequence!
Question: What if all these tasks are executed nondeterministically?
A1: Obviously, the ordering overhead is avoided and parallelism improves!
A2: Data races on edges!
(Figure taken from the GraphLab paper)
But what if we eliminate the data-races?
Motivation (Cont’d)
• Objective of this research
  • Study the nondeterministic execution of graph algorithms
• Wait… why study that?
  • Graph algorithms are special cases of parallel computing!
    • Iterative computing
    • Associative law: a+(b+c) = (a+b)+c
    • Idempotent law: f(f(x)) = f(x)
  • Potential for higher performance!
• Questions:
  • Will an algorithm converge under nondeterministic execution?
  • Will the executions lead to deterministic results (i.e., be externally deterministic)?
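A small sketch of why the two laws matter for ordering. It assumes `min` as the combining operator (chosen here because minimal label propagation is one of the algorithms discussed in this talk); the function names and the sample values are hypothetical.

```python
from functools import reduce
from itertools import permutations


def combine(a, b):
    """Combine two incoming labels; min is associative (and commutative)."""
    return min(a, b)


def apply_label(current, incoming=2):
    """Idempotent update: applying it twice gives the same value as once."""
    return min(current, incoming)


# Hypothetical labels arriving at one vertex in an arbitrary order.
messages = [5, 2, 9, 4]

# Every arrival order reduces to the same value, so no ordering is needed.
results = {reduce(combine, order) for order in permutations(messages)}

# Re-applying the same update (e.g. a task that runs twice) changes nothing.
once = apply_label(7)
twice = apply_label(apply_label(7))
```

With an operator like this, neither the order in which updates arrive nor a repeated update changes the outcome, which is what makes relaxing the task order plausible.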
System model
• Shared-memory computer
  • # processors >= 1
  • Graphs loaded in memory
  • COTS components, nothing special for HW and OS
• Synchronous implementation of the asynchronous model
  • Computing is organized into multiple iterations
  • Barriers are enforced between two consecutive iterations
  • Updates are applied “immediately”
  • Examples: GraphChi, GRACE
• Vertex-centric computing
  • “Think Like A Vertex”
  • Data dependences happen on edges
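The iteration structure described above can be sketched as follows. This is a sequential illustration only, not GraphChi's or GRACE's actual API: `run_iterations`, `min_label`, and the path graph are hypothetical names and data, and the inner loop stands in for what would be a parallel loop in a real system.

```python
def run_iterations(values, neighbors, update, max_iters=100):
    """Run vertex updates in iterations until an iteration changes nothing.

    Updates are written straight into `values`, so vertices processed later
    in the same iteration already see them (the "immediate" part). The end of
    the inner loop plays the role of the barrier between iterations.
    """
    for iteration in range(1, max_iters + 1):
        changed = False
        for v in range(len(values)):          # a parallel loop in a real system
            new_value = update(values[v], [values[u] for u in neighbors[v]])
            if new_value != values[v]:
                values[v] = new_value         # visible immediately
                changed = True
        if not changed:                       # checked after the barrier
            return iteration
    return max_iters


def min_label(own, neighbor_values):
    """Vertex-centric update: adopt the smallest label seen nearby."""
    return min([own] + neighbor_values)


# Path graph 0 - 1 - 2 - 3, each vertex starting with its own id as label.
labels = [0, 1, 2, 3]
path_neighbors = [[1], [0, 2], [1, 3], [2]]
iterations = run_iterations(labels, path_neighbors, min_label)
```

In this run the label 0 reaches every vertex in the first iteration because each update is visible to the vertices processed after it; the second iteration only confirms that nothing changes.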
System model (Cont’d)
• Race-free
  • Method 1: Architecture support
  • Method 2: Compiler support
  • Method 3: Explicit lock/unlock
  • Convert data races to “conflicts”
• Scheduling
  • General methods
  • Example: static, dynamic or other methods in OpenMP
  • Assumption on scheduling
[Figure: vertices v and u with data Dv and Du, joined by an edge with data De; labeled add_schedule(u)]
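A minimal sketch of Method 3 (explicit lock/unlock), assuming one lock per edge and a minimum-label update. The `Edge` class, `update`, and `propagate_min_labels` are hypothetical names, not GraphChi code, and Python threads are used only to show the structure. Adding the neighbor to `next_schedule` stands in for `add_schedule(u)`. Note that holding the lock across the compare-and-write is stricter than merely making each access atomic.

```python
import threading


class Edge:
    """Edge data De shared by its two endpoints (hypothetical layout)."""

    def __init__(self, u, v):
        self.u, self.v = u, v
        self.value = float("inf")      # sentinel: nothing written yet
        self.lock = threading.Lock()


def update(vertex, labels, edges_of, next_schedule, schedule_lock):
    """Pull the smallest label from incident edges, then push it back out."""
    best = labels[vertex]
    for edge in edges_of[vertex]:
        with edge.lock:                # explicit lock around the read
            best = min(best, edge.value)
    labels[vertex] = best              # only this vertex's task writes here
    for edge in edges_of[vertex]:
        with edge.lock:                # explicit lock around compare-and-write
            if best < edge.value:
                edge.value = best
                other = edge.v if edge.u == vertex else edge.u
                with schedule_lock:
                    next_schedule.add(other)   # stands in for add_schedule(u)


def propagate_min_labels(num_vertices, edge_list, num_threads=4):
    labels = list(range(num_vertices))
    edges_of = [[] for _ in range(num_vertices)]
    for u, v in edge_list:
        edge = Edge(u, v)
        edges_of[u].append(edge)
        edges_of[v].append(edge)
    schedule = set(range(num_vertices))
    schedule_lock = threading.Lock()
    while schedule:
        work = sorted(schedule)
        next_schedule = set()

        def worker(chunk):
            for vertex in chunk:
                update(vertex, labels, edges_of, next_schedule, schedule_lock)

        threads = [threading.Thread(target=worker, args=(work[i::num_threads],))
                   for i in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()                   # barrier between iterations
        schedule = next_schedule
    return labels


component_labels = propagate_min_labels(7, [(0, 1), (1, 2), (2, 3), (4, 5)])
```

Threads may interleave differently on every run, but each edge is only ever touched under its lock, so the races become ordinary conflicts whose outcome depends on order rather than on torn reads or writes.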
Algorithm Convergence
• Methodology
  • Classify the “conflicts” on edges
  • Read-write conflicts
    • Case 1: Read-after-write → read new value → converge
    • Case 2: Write-after-read → read old value → converge?
  • Write-write conflicts
    • Case 1: (correct) write-after-(wrong) write → correct edge values → converge
    • Case 2: (wrong) write-after-(correct) write → corrupted edge values → converge?
Algorithm Convergence (Cont’d)
• Read-write conflict
  • Case 1: Read-after-write → converge
  • Case 2: Write-after-read → read old value → next iteration → converge
[Figure: for each case, vertices v and u (data Dv, Du) sharing edge data De]
Algorithm Convergence (Cont’d)
• Sufficient condition 1 for convergence
  • A chain-to-converge exists
• Deduction 1: If algorithm A on graph G converges under synchronous-model execution, A will converge under nondeterministic execution.
• Deduction 2: If algorithm A on graph G converges under a deterministic scheduler in the asynchronous model, A will converge under nondeterministic execution.
• Example algorithms that converge:
  • PageRank
  • Many other fixed-point iterative algorithms
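To make the PageRank example concrete, here is a small sequential simulation in which the order of vertex updates is shuffled in every iteration and each update is applied immediately. A random order is only a stand-in for nondeterministic scheduling; the function name, the damping factor 0.85, the tolerance, and the tiny graph are all assumptions for illustration, and the graph is chosen so that every vertex has at least one out-link.

```python
import random


def pagerank_random_order(out_links, seed, damping=0.85, eps=1e-10,
                          max_iters=1000):
    """PageRank where each iteration visits the vertices in a random order."""
    rng = random.Random(seed)
    n = len(out_links)
    in_links = [[] for _ in range(n)]
    for u, targets in enumerate(out_links):
        for v in targets:
            in_links[v].append(u)
    rank = [1.0 / n] * n
    order = list(range(n))
    for _ in range(max_iters):
        rng.shuffle(order)                     # a different order every time
        max_change = 0.0
        for v in order:
            incoming = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank = (1.0 - damping) / n + damping * incoming
            max_change = max(max_change, abs(new_rank - rank[v]))
            rank[v] = new_rank                 # applied immediately
        if max_change < eps:                   # converged within tolerance
            break
    return rank


# Hypothetical 4-vertex graph; out_links[u] lists the targets of u.
graph = [[1, 2], [2], [0], [0, 2]]
ranks_a = pagerank_random_order(graph, seed=1)
ranks_b = pagerank_random_order(graph, seed=2)
largest_gap = max(abs(a - b) for a, b in zip(ranks_a, ranks_b))
```

Both runs settle on the same fixed point up to the tolerance, although the values need not be bit-for-bit identical, since each run stops at a slightly different point inside the tolerance band.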
Algorithm Convergence (Cont’d)
• Write-write conflicts
  • Case 1: (correct) write-after-(wrong) write → converge
  • Case 2: (wrong) write-after-(correct) write → corrupted edge value
    • Next iteration: falsely converge
    • Correcting the edge value → next iteration: converge
[Figure: for each case, vertices v and u (data Dv, Du) writing edge data De]
Algorithm Convergence (Cont’d)
• Sufficient condition 2 for convergence
  • In order to correct the corrupted edge value:
    • Algorithm A on graph G converges under deterministic asynchronous-model execution.
    • Algorithm A satisfies the monotonicity property. (falsely converge)
• Algorithms that converge:
  • WCC (Weakly Connected Components) by MLP (Minimal Label Propagation)
  • BFS (Breadth First Search)
  • Many other graph traversal algorithms
• Algorithms that do not converge:
  • BP (Belief Propagation)
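The WCC case can be illustrated with a sequential simulation of minimal label propagation in which some edge writes are deliberately stale. This is a sketch under stated assumptions, not the authors' implementation: the stale write (the vertex's old label landing on the edge) is a hypothetical stand-in for a (wrong) write-after-(correct) write, and `wcc_min_label` and `stale_write_prob` are made-up names.

```python
import random


def wcc_min_label(num_vertices, edge_list, seed, stale_write_prob=0.3):
    """Minimal label propagation with randomly injected stale edge writes."""
    rng = random.Random(seed)
    label = list(range(num_vertices))
    # Edge data: the last label written by either endpoint. The initial
    # value is a sentinel larger than every vertex label.
    edge_value = [num_vertices] * len(edge_list)
    incident = [[] for _ in range(num_vertices)]
    for index, (u, v) in enumerate(edge_list):
        incident[u].append(index)
        incident[v].append(index)

    order = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        rng.shuffle(order)                      # arbitrary task order
        for v in order:
            old = label[v]
            new = min([old] + [edge_value[e] for e in incident[v]])
            if new != old:
                label[v] = new                  # labels only ever decrease
                changed = True
            for e in incident[v]:
                # Simulated corruption: sometimes the old label is written.
                stale = rng.random() < stale_write_prob
                written = old if stale else new
                if edge_value[e] != written:
                    edge_value[e] = written
                    changed = True
    return label


edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
labels_by_seed = [wcc_min_label(7, edges, seed) for seed in range(20)]
```

Because a vertex label never increases and each vertex rewrites its current label onto its edges whenever it runs, a corrupted edge value is overwritten in a later iteration, and every seed ends with each vertex holding the smallest id in its component.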
Evaluation
• Experiment setup
  • 2 × 2.6 GHz Intel Xeon E5-2670 processors (8 cores)
  • 64 GB of RAM
  • GCC version: 4.8.3
• Real-world graph data sets
  • web-BerkStan, web-Google, soc-LiveJournal1, cage15
• Platform
  • GraphChi (C++ version 0.2)
• Algorithms
  • PageRank, SSSP, WCC, BFS
Available at: https://github.com/mrshawcode/GraphChi_nondeter_algorithm
Evaluation (Cont’d)
• Using architecture support achieves the best performance (execution time is reduced by up to 70%)
• Using explicit locking/unlocking does not achieve the best performance, but still scales well and sometimes even outperforms deterministic execution.
Evaluation (Cont’d)
• How about the results produced by PageRank?
Measure the difference:
  Result1: {1, 2, 3, 5, 7}
  Result2: {1, 2, 3, 7, 5}
  Suffix:   0, 1, 2, 3, 4
  The difference degree is 3
• Results are not deterministic (i.e., not externally deterministic)
• With increased precision (smaller ε), variations in results move to less important pages
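Judging from the example on this slide, the difference degree appears to be the first position (suffix) at which two rankings disagree. That reading is an assumption, and `difference_degree` is a hypothetical helper name:

```python
def difference_degree(result1, result2):
    """Return the first index at which two rankings disagree.

    If one ranking is a prefix of the other (or they are identical), the
    length of the shorter one is returned.
    """
    for index, (a, b) in enumerate(zip(result1, result2)):
        if a != b:
            return index
    return min(len(result1), len(result2))


# The two rankings from the slide: they first differ at suffix 3.
degree = difference_degree([1, 2, 3, 5, 7], [1, 2, 3, 7, 5])
```

Under this reading, a larger difference degree means the two runs agree on more of the top-ranked pages, which fits the observation that a smaller ε pushes the variations toward less important pages.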
Conclusion
• Conclusion
  • Graph algorithms are special cases of parallel computing
    • They do not necessarily need high-overhead deterministic execution!
  • Most of the algorithms can be executed nondeterministically
    • Examples include PageRank, WCC, BFS and many others.
  • Not all nondeterministic executions produce deterministic results!
• Open problems
  • More discussion of sufficient conditions for algorithm convergence under nondeterministic execution
  • More discussion of the variations (nondeterminacy) in results produced by nondeterministic executions (e.g., PageRank)
  • Theoretical analysis of the speed of convergence
  • Extending the system model to pure asynchronous computing
Thank you!
Q&A