SYSC5906 - Directed Studies (Distributed Sparse Matrices) · 2017-02-01 · Givens Rotations [a b...
Transcript of SYSC5906 - Directed Studies (Distributed Sparse Matrices) · 2017-02-01 · Givens Rotations [a b...
SYSC5906 - Directed Studies
(Distributed Sparse Matrices)
A Report
Alistair Boyle 2010
http://creativecommons.org/licenses/by-nc-sa/3.0/
Title image: http://www.flickr.com/photos/8702301@N06/5006243147/
Overview
Storage
Solution
Ordering
Distribution
[http://www.cise.ufl.edu/research/sparse/matrices/Rothberg/gearbox.html, 107624 nodes, 3250488 edges, UF Sparse Matrix Collection]
Motivation
● Linear Equations● Exact solutions● Approximations
● Eigenvalues/vectors● Vibration● Harmonics
Numerical Simulations
[http://www.dfrc.nasa.gov/Gallery/Photo/X-43A/HTML/ED97-43968-1.html]
Sparse Storage
Triplets List-of-Lists CSR
Sparse StorageCompressed Sparse Row format
Sparse StorageFile Formats
● Matrix Market● Triplets format, ASCII
● Harwell-Boeing● Compressed Sparse Column (CSC) format storage● Assembled, elemental, real, complex, pattern matrices● Support for multiple right-hand sides, guesses, solutions
● Rutherford-Boeing● An updated version of the Harwell-Boeing format● Supplementary matrix information
– Orderings, estimates, partitions, Laplacian values, geometry, etc.
[The Rutherford-Boeing Sparse Matrix Collection, Duff, Grimes, Lewis, Rutherford Appleton Laboratory, 1997]
SolutionDecompositions
Involves converting matrices to triangular or diagonal form through matrix transformations
?
● If A is square:
● L is unit lower triangular, U is upper triangular
SolutionLU Decomposition
A x=b ⇒ x ?
A=LUL y=b ⇒ yU x= y ⇒ x
1. decomposition
2. forward substitution
3. backward substitution
(Crout or Doolittle algorithms)
[Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd ed., The John Hopkins University Press, 1989]
SolutionCholesky Decomposition
● If A is square, hermitian, positive definite:
● Special case of LU decomposition where
A x=b ⇒ x ?
A=L LH
L y=b ⇒ yLH x= y ⇒ x
1. decomposition
2. forward substitution
3. backward substitution
U=LH
(Choleksy-Crout/Banachiewicz algorithms)
[Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd ed., The John Hopkins University Press, 1989]
argmin∥A x−b∥22⇒ x?
A=Q R ⇒ QH A=QHQ R=R
∥QH A x−QH b∥22=∥Rx−c1
−c2∥2
2
... c=QH b , Rx=c1 ⇒ x
SolutionQR Decomposition
● If A is square, nonsingular:
● Q is orthogonal/unitary, R is upper triangular
1. decomposition (Gram-Schmidt, Givens, Householder algorithms)
(Least squares solution)
[Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd ed., The John Hopkins University Press, 1989]
QHQ= I ,Q−1=QH
SolutionOther Decompositions
Eigenproblems: Schur factorization
Singular Value Decompositions
Solution: QR factorizationGivens Rotations
● Rotation of a point about a line
Solution: QR factorizationGivens Rotations
[a b−b a ][ xy ]=[m0 ]m= x2 y2
a=x /rb= y /r
Gn=[1 0 0 0 00 a 0 b 00 0 1 0 00 −b 0 a 00 0 0 0 1
]Example
Only four entries in the transformed matrix are modified for a single Givens rotation. Result is one
entry in the transformed matrix becoming zero.
[Edward Anderson, Discontinuous Plane Rotations and the Symmetric Eigenvalue Problem, 2000, LAPACK Working Note 150,http://www.netlib.org/lapack/lawnspdf/lawn150.pdf ]
[Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd ed., The John Hopkins University Press, 1989]
Solution: QR factorizationHouseholder Reflections
● Reflection of points across a hyper-plane
Solution: QR factorizationHouseholder Reflections
A=[a11 a12 ...a21 a22 ...a31 a32 ...]
=∥x∥, x=[a11a21a31]T
e1=[100 ...]T
u=x− e1
v=u /∥u∥Q=I−2 v vT
A'=Q A=[ a12 ' ...0 a22 ...0 a32 ...]
Removes column at-a-time. Operates on submatrices as upper triangular matrix is produced.
[Gene H. Golub and Charles F. van Loan: Matrix Computations, 2nd ed., The John Hopkins University Press, 1989]
Fill-in
● As a matrix is decomposed, it may fill-in: previously zero entries become non-zero, making more work for the later stages.
[figure from: Minimum Degree Reordering Algorithms: A Tutorial, Stephen Ingram, [email protected]]
First 4 steps in Cholesky algorithmBlue: initially non-zero,
Red: fill-in
Fill-in
● Reordering the first row & column: massive improvement in required operations
● But... finding a “good” ordering is NP-hard
[figure from: Minimum Degree Reordering Algorithms: A Tutorial, Stephen Ingram, [email protected]]
Reordered, First 4 steps in Cholesky algorithmBlue: initially non-zero,
Red: fill-in (none)
OrderingMinimum Degree
[An Approximate Minimum Degree Ordering Algorithm,Patrick R. Amestoy, Timothy A. Davisy, Iain S. Duz, SIAM J. Matrix Analysis & Applic., Vol 17, no 4, pp. 886-905, Dec. 1996]
OrderingsNested Dissection
● Divide and Conquer● Recursively split based on mutual independence
[Direct methods for sparse linear systems, Davis, 2006, SIAM]
A=[a11 a12 0a21 a22 a23
0 a32 a33] A'=[
a11 0 a13 '0 a33 a23 'a31 ' a32 ' a33 '
]
Ordering
● MD, MMD, AMD● PORD
● UMFPACK
Hybrid● METIS● SCOTCH
Orderingparallel ordering
ParMETIS
pt-SCOTCH
Hybrid
minimum degree
nested-dissection
[ ParMETIS - http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview]
[ ptSCOTCH - http://www.labri.fr/perso/pelegrin/scotch/ ]
DistributionLeft-Looking
[An evaluation of Left-Looking, Right-Looking and Multifrontal Approaches to Sparse Cholesky Facotrizations on Heirarchial-Memory Machines, Rothberg, Gupta, Technical Report, 1991]
● Two operations– Divide column by sqrt of its diagonal– Add a multiple of one column to another
● Column-based● iterates on columns to the left of the current column
● Save updates until a column is completed
DistributionRight-Looking
[An evaluation of Left-Looking, Right-Looking and Multifrontal Approaches to Sparse Cholesky Facotrizations on Heirarchial-Memory Machines, Rothberg, Gupta, Technical Report, 1991]
● Two operations– Divide column by sqrt of its diagonal– Add a multiple of one column to another
● “Submatrix”-based● Iterates on columns to the right of the current
column
● Requires (inexpensive) search on destination's storage to find new non-zero locations to insert
DistributionMultifrontal (Right-Looking)
A'=[a11 0 a13 '0 a33 a23 'a31 ' a32 ' a33 '
]a11 a33
arem
Form a tree and solve independent portions.Updates to the matrix occur at the “front”.
Updates are kept on a stack – typically +25% space.Can have multiple independent fronts in parallel.
Can take advantage of “super-nodes”
Solution front
[Direct methods for sparse linear systems, Davis, 2006, SIAM]
Solving in Serial
● CHOLMOD● MA57● MA41● MA42● MA67● MA48● Oblio
● SPARSE● SPARSPAK● SPOOLES● SuperLLT● SuperLU● UMFPACK
[Direct Solvers for Sparse Matrices, X.Li March 2010]
symmetric
+ symmetric positive definite
+
+
+
non-symmetric
Legend
Solving in Parallelshared memory
● BCSLIB-EXT● Cholesky● DMF● MA41● MA49● PanelLLT● PARASPAR
[Direct Solvers for Sparse Matrices, X.Li March 2010]
● PARADISO● SPOOLES● SuiteSparseQR● SuperLU_MT● TAUCS● WSMP
symmetric
+ symmetric positive definite
non-symmetric
Legend
+
+ +
Solving in Paralleldistributed
● DMF● DSCPACK● MUMPS● PaStiX● PSPASES● SPOOLES● SuperLU_DIST● S+● WSMP
[Direct Solvers for Sparse Matrices, X.Li March 2010]
symmetric
+ symmetric positive definite
non-symmetric
Legend
+
+
+
+
[A fully asynchronous multifrontal solver using distributed dynamic scheduling, PR Amestoy, IS Duff, JY L'Excellent, J, SIAM Journal on Matrix, 2002]
+
Questions?
[http://www.flickr.com/photos/takomabibelot/4164289232/]