SDPA: Leading-edge Software for SDP
2008/10/14 @ Informs ’08
Tokyo Institute of Technology: Makoto Yamashita, Mituhiro Fukuda, Masakazu Kojima, Kazuhide Nakata
Chuo University: Katsuki Fujisawa
National Maritime Research Institute: Kazuhiro Kobayashi
RIKEN: Maho Nakata
SDPA (SemiDefinite Programming Algorithm) Project
Open-source software for solving semidefinite programming (SDP) problems
Since the first release in 1995, it has maintained high quality
In 2008, the latest version, SDPA 7, was released, and it has been updated continuously since
Several companion packages extend SDPA with further capabilities
SDPA Family
SDPA
SDPARA (Parallel with MPI)
SDPA-C (Matrix Completion)
SDPA-M (Matlab Interface)
SDPA-GMP (Multiple Precision)
SDPARA-C
Accessible through the SDPA Online Solver as a Web service
Outline of this talk
1. SDP and the improvements of SDPA 7
2. Parallel with MPI
3. Multiple Precision
4. Online Solver
5. Future Works
Standard form of SDP
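For reference, the standard form used in the SDPA user manual can be written as below. This is a sketch in that notation, assuming symmetric data matrices F_0, ..., F_m, a cost vector c in R^m, and the inner product U • V = sum over i, j of U_ij V_ij. The symbols c and F correspond to the arguments of the SDPA-M call shown later in the talk.

```latex
\begin{aligned}
\text{(P)}\quad & \min_{x \in \mathbb{R}^m} \ \sum_{i=1}^{m} c_i x_i
  \quad \text{s.t.}\quad X = \sum_{i=1}^{m} F_i x_i - F_0,\quad X \succeq O,\\
\text{(D)}\quad & \max_{Y} \ F_0 \bullet Y
  \quad \text{s.t.}\quad F_i \bullet Y = c_i \ (i = 1,\dots,m),\quad Y \succeq O.
\end{aligned}
```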
Applications of SDP
Control Theory: Lyapunov condition
Combinatorial Optimization: Max Cut, Theta function
Quantum Chemistry: Reduced Density Matrix
Primal-Dual Interior-Point Methods
Computation for Search Direction
Schur complement matrix ⇒ Cholesky Factorization
Exploitation of Sparsity in
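As a sketch of what this step computes, assuming the HRVW/KSH/M search direction and SDPA's notation (X and Y are the current primal and dual matrix variables, F_i the data matrices, m the number of constraints, and r a right-hand side vector built from the residuals), the linear system and the elements of the Schur complement matrix B can be written as:

```latex
B\,dx = r, \qquad
B_{ij} = \left( X^{-1} F_i Y \right) \bullet F_j \quad (1 \le i, j \le m), \qquad
B = L L^{\mathsf{T}} \ \text{(Cholesky factorization)}.
```

B is symmetric positive definite, which is why a Cholesky factorization applies. When most F_i are sparse, many of the inner products are cheap to evaluate, and B itself may be sparse.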
SDPA 7
SDPA 7 resolves bottlenecks of SDPA 6
Introduce sparse Cholesky factorization for the Schur complement matrix
Adopt new data structure
Reduce memory space for temporary variables
Introduce configure script for easier installation
Sparsity pattern of Schur complement matrix
Fully dense Schur complement matrix
Sparse Schur complement matrix
Minimum degree ordering to minimize the number of fill-ins
New Data Structure
For multiple diagonal structure
SDPA 6
stores nonzero elements of each block
stores all blocks
SDPA 7
stores nonzero elements of each block
stores only nonzero blocks
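The difference between the two layouts can be illustrated with a minimal Python sketch. The function names and the dictionary-based layout are hypothetical and chosen only for illustration; they are not SDPA's actual internal data structures.

```python
# Hypothetical illustration: names and layout are assumptions, not SDPA internals.

def nonzero_entries(block):
    """Map (row, col) -> value for the nonzero elements of one dense block."""
    return {(i, j): v
            for i, row in enumerate(block)
            for j, v in enumerate(row)
            if v != 0.0}

def store_all_blocks(blocks):
    """SDPA 6-like layout: one entry per block, even if the block is all zero."""
    return [nonzero_entries(blk) for blk in blocks]

def store_nonzero_blocks(blocks):
    """SDPA 7-like layout: keep only blocks with at least one nonzero element."""
    stored = {}
    for index, blk in enumerate(blocks):
        entries = nonzero_entries(blk)
        if entries:
            stored[index] = entries
    return stored

# A data matrix with 1000 diagonal blocks of size 2 x 2; only block 3 is nonzero.
blocks = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(1000)]
blocks[3] = [[2.0, 0.0], [0.0, 5.0]]

old_layout = store_all_blocks(blocks)
new_layout = store_nonzero_blocks(blocks)
print(len(old_layout), len(new_layout))
```

With many blocks and few nonzero ones, the second layout avoids the per-block overhead of the empty blocks, which is the effect the slide attributes to SDPA 7.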
r2S_broydenTri300.dat-s
                  SDPA 6     SDPA 7
Computing B       1009.0s    0.8s
Cholesky B        5179.5s    0.5s
Total CPU Time    6204.4s    2.8s
Input data        90MB       1MB
Matrix B          272MB      4MB
Dense Matrix      7MB        3MB
Total Memory      380MB      19MB
Improvements in SDPA 7: Sparse Schur, New Data Structure, Efficient Temporary
Xeon 2.80GHz, 2GB memory, Linux 2.4
Configure script
Easier installation
$ ./configure --with-blas="-lblas" --with-lapack="-llapack"
$ make
$ make install
We can link with an optimized BLAS, e.g., ATLAS, GotoBLAS, Intel MKL
Matlab Interface
SDPA-M is the Matlab interface
[mDIM,nBLOCK,bLOCKsTRUCT,c,F] = read_data('example1.dat-s');
[objVal,x,X,Y,INFO] = sdpam(mDIM,nBLOCK,bLOCKsTRUCT,c,F,OPTION);
SeDuMi Input interface
[At,b,c,K] = fromsdpa('example1.dat-s');
[x,y,info] = sedumiwrap(At,b,c,K,[],pars);
The current version supports only LP and SDP cones
Parameter control is based on SDPA
Extremely Large Problem and Bottlenecks
The largest size requires 8.6GB memory
We replace these two bottlenecks (evaluating the Schur complement matrix and its Cholesky factorization) by parallel computation
Opteron 246 (2.0GHz), 6GB memory
Exploitation of Sparsity in SDPA
We switch among the formulas F1, F2, and F3 row by row
We keep this scheme in the parallel computation
Row-wise distribution for evaluation of the Schur complement matrix
4 CPUs are available
Each CPU computes only its assigned rows
No communication between CPUs
Efficient memory management
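The row-wise scheme can be sketched in a few lines of Python. The cyclic assignment (row i goes to CPU i mod the number of CPUs) and the function name are assumptions made for illustration; the slide only states that each CPU computes its own assigned rows.

```python
def assigned_rows(m, num_cpus, rank):
    """Rows (0-based) of the m x m Schur complement matrix computed by CPU `rank`,
    assuming a cyclic assignment: row i belongs to CPU i % num_cpus."""
    return list(range(rank, m, num_cpus))

m, num_cpus = 10, 4
for rank in range(num_cpus):
    # Each CPU evaluates its rows independently; no row is shared between CPUs.
    print(rank, assigned_rows(m, num_cpus, rank))
```

Because every row has exactly one owner and a row needs no data produced by another CPU, the evaluation itself requires no communication, and each CPU only allocates memory for its own rows.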
Parallel Cholesky factorization
We adopt ScaLAPACK for the Cholesky factorization of the Schur complement matrix
We redistribute the matrix from the row-wise distribution to a two-dimensional block-cyclic distribution
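The target layout of the redistribution can be sketched as an index mapping. This is a minimal illustration of the usual two-dimensional block-cyclic rule, assuming square blocks, 0-based indices, and the first block placed on process (0, 0); the function name and the sizes are hypothetical and this is not SDPARA's code.

```python
def owner(i, j, block_size, proc_rows, proc_cols):
    """Return (process row, process column) owning element (i, j), 0-based."""
    return ((i // block_size) % proc_rows, (j // block_size) % proc_cols)

# 8 x 8 matrix, 2 x 2 blocks, 2 x 2 process grid (all values hypothetical).
n, block_size, proc_rows, proc_cols = 8, 2, 2, 2
for i in range(n):
    # Each printed pair of digits is the (row, column) coordinate of the owner.
    print(" ".join("%d%d" % owner(i, j, block_size, proc_rows, proc_cols)
                   for j in range(n)))
```

Unlike the row-wise layout, this mapping spreads both the rows and the columns of the matrix over the process grid, which keeps the processes evenly loaded as a dense Cholesky factorization advances through the matrix.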
Computation Time for NH3
[Figure: computation time in seconds (log scale) versus #processors (1, 4, 16, 64) for TOTAL, Comp B, and Chol B; data labels as extracted: 93883235, 1077460, 9150, 2973, 796, 201, 88, 239.4, 3.71]
TSUBAME @ Tokyo Tech: Opteron 880 (2.4GHz), 32GB memory/node
Scalability for LiF
[Figure: scalability (log scale) versus #processors (1, 4, 16, 64) for TOTAL, Comp B, and Chol B]
Total 28 times
Comp B 43 times
Chol B 46 times
Row-wise distribution for Comp B is very effective
TSUBAME @ Tokyo Tech: Opteron 880 (2.4GHz), 32GB memory/node
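To put the speedup factors in perspective, the short sketch below converts them to parallel efficiency (speedup divided by the number of processors). It assumes the reported factors are speedups on 64 processors relative to 1 processor, which matches the range of the plot but is not stated explicitly on the slide.

```python
def parallel_efficiency(speedup, num_procs):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / num_procs

# Speedup factors from the slide; 64 processors is an assumption (see above).
speedups = {"TOTAL": 28, "Comp B": 43, "Chol B": 46}
for name, s in speedups.items():
    print(f"{name}: {parallel_efficiency(s, 64):.0%}")
```

Under that assumption, the two parallelized components reach roughly 67% and 72% efficiency, while the total reaches about 44% because the remaining parts of the solver are not parallelized.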
Multiple Precision
SDPA uses 'double' precision
53 significant bits, about 16 decimal digits
SDPA 7 result (gpp124-1 from SDPLIB)
Objective Function (only 5 digits agree)
-7.3430761748645921e+00 (Primal)
-7.3430800814821620e+00 (Dual)
Feasibility
5.45696821063e-12 (Primal)
1.68252292320e-07 (Dual)
Some applications require more accuracy
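The numbers on this slide can be checked with a short sketch. It assumes the relative gap is defined as |p - d| / max((|p| + |d|) / 2, 1), which is the definition I understand SDPA to use; the function names are illustrative.

```python
import math

def decimal_digits(bits):
    """Approximate number of decimal digits carried by a `bits`-bit significand."""
    return bits * math.log10(2)

def relative_gap(primal, dual):
    """|p - d| / max((|p| + |d|) / 2, 1): the definition assumed in this sketch."""
    return abs(primal - dual) / max((abs(primal) + abs(dual)) / 2.0, 1.0)

primal = -7.3430761748645921e+00  # primal objective value from the slide
dual = -7.3430800814821620e+00    # dual objective value from the slide

print(decimal_digits(53))          # digits available in 'double' precision
print(relative_gap(primal, dual))  # gap between the two objective values
```

The computed gap is about 5.3e-07, consistent with the relative gap listed for SDPA (7.1.0) in the comparison with SDPA-GMP. Far fewer digits of the solution are reliable than the significand could hold.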
SDPA-GMP
GMP: GNU Multiple Precision Library
Arbitrary fixed precision
'double' precision is replaced by GMP
Ultra-high accuracy at the cost of long computation time
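The effect of replacing 'double' by an arbitrary fixed precision can be illustrated without GMP. The sketch below uses Python's standard `decimal` module as a stand-in (it is not GMP and not what SDPA-GMP uses), with the precision set to the decimal equivalent of the 384 bits used in the comparison that follows.

```python
import math
from decimal import Decimal, getcontext

bits = 384
# Decimal digits needed to cover a 384-bit significand (stand-in for GMP precision).
getcontext().prec = math.ceil(bits * math.log10(2))

tiny = "1e-30"
double_result = (1.0 + float(tiny)) - 1.0                    # tiny term is lost
decimal_result = (Decimal(1) + Decimal(tiny)) - Decimal(1)   # tiny term survives

print(getcontext().prec)
print(double_result)
print(decimal_result)
```

Quantities far below 1e-16 relative to the working values vanish in 'double' arithmetic but are retained at the higher precision, which is what lets a multiple-precision solver push gaps and feasibility errors many orders of magnitude lower.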
Ultra Accuracy of SDPA-GMP
Comparison of SDPA and SDPA-GMP (384bit)
gpp124-1 (SDPLIB)
SDPA-GMP (7.1.0)
Relative gap: 1.7163710368162993e-26
Objective Function: -7.3430762652465377e+00 (Primal), -7.3430762652465377e+00 (Dual)
Feasibility: 2.0710194844721e-57 (Primal), 1.2329417039702e-29 (Dual)
Computation time: 228.95 sec. (59 iterations)
SDPA (7.1.0)
Relative gap: 5.3201361904260111e-07
Objective Function: -7.3430761748645921e+00 (Primal), -7.3430800814821620e+00 (Dual)
Feasibility: 5.45696821063e-12 (Primal), 1.68252292320e-07 (Dual)
Computation time: 0.14 sec. (20 iterations)
SDPA Online Solver
SDPA Online Solver will offer SDPA/SDPARA/SDPARA-C via the Internet.
[Diagram: User and Interface connected via the Internet]
1. Input
2. Ninf-G
3. SDPARA on PC cluster
4. Solution
To use Online Solver
Users without a parallel environment can use SDPARA/SDPARA-C.
No charge.
Registration via the Internet is required so that passwords to protect users' data will be generated automatically.
Access the SDPA Project Home Page. [SDPA Online for your future.]
http://sdpa.indsys.chuo-u.ac.jp/sdpa/
Online Solver Interface
Online Solver Usage
Conclusion
The latest version 7 attains higher performance than version 6
Parallel Solver enables us to solve extremely large SDPs
Matrix Completion is useful for Structural Sparsity
SDPA-GMP generates ultra-high-accuracy solutions
Online Solver provides powerful computation resources via the Internet
Future works
Callable Library of SDPA 7
Automatic Selection from SDPA/SDPARA/SDPARA-C