Parallel Computation for SDPs


INFORMS 2011 @ Charlotte

Parallel Computation for SDPs
Focusing on the Sparsity of Schur Complement Matrices

Makoto Yamashita @ Tokyo Tech
Katsuki Fujisawa @ Chuo Univ
Mituhiro Fukuda @ Tokyo Tech
Kazuhide Nakata @ Tokyo Tech
Maho Nakata @ RIKEN

INFORMS Annual Meeting @ Charlotte, 2011/11/15 (2011/11/13 - 2011/11/16)

Key phrase

SDPARA: The fastest solver for large SDPs
available at http://sdpa.sf.net/

SemiDefinite Programming Algorithm paRAllel version

SDPA Online Solver

1. Log in to the online solver
2. Upload your problem
3. Push the "Execute" button
4. Receive the result via Web/Mail

http://sdpa.sf.net/ → Online Solver

Outline

1. SDP applications
2. Standard form and Primal-Dual Interior-Point Methods
3. Inside of SDPARA
4. Numerical Results
5. Conclusion

SDP Applications

1. Control theory

Against swing, we want to keep stability.
Stability Condition → Lyapunov Condition → SDP

SDP Applications

2. Quantum Chemistry

Ground state energy: locate electrons.
Schrödinger Equation → Reduced Density Matrix → SDP

SDP Applications

3. Sensor Network Localization

Distance Information → Sensor Locations
Example: Protein Structure

Standard form

The primal-dual pair is:

$$
\begin{aligned}
(\mathcal{P}):\quad & \min\; C \bullet X \\
& \text{s.t.}\; A_k \bullet X = b_k \;(k = 1, \ldots, m),\quad X \succeq O \\
(\mathcal{D}):\quad & \max\; \sum_{k=1}^{m} b_k z_k \\
& \text{s.t.}\; \sum_{k=1}^{m} A_k z_k + Y = C,\quad Y \succeq O
\end{aligned}
$$

The variables are $X, Y \in \mathcal{S}^n$ and $z \in \mathbb{R}^m$.

The inner product is $X \bullet Y = \sum_{i,j=1}^{n} X_{ij} Y_{ij}$.

The size is roughly determined by
- $m$: the number of equality constraints in $(\mathcal{P})$
- $n$: the size of the matrices $X$ and $Y$

Our target: $m \ge 30{,}000$.
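To make the notation concrete, here is a minimal numpy sketch (all data are random placeholders, not an actual SDP instance): it builds a positive semidefinite $X$, reads off $b_k = A_k \bullet X$ so that $X$ is primal feasible by construction, and checks that the inner product matches its trace form.

```python
import numpy as np

# Inner product X • Y = sum_{i,j} X_ij Y_ij = trace(X^T Y)
def inner(X, Y):
    return float(np.sum(X * Y))

rng = np.random.default_rng(0)
n, m = 4, 3

def rand_sym(n):
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

C = rand_sym(n)
A = [rand_sym(n) for _ in range(m)]

# Pick a feasible primal point X ⪰ O and read off b_k = A_k • X,
# so X satisfies the equality constraints of (P) by construction.
G = rng.standard_normal((n, n))
X = G @ G.T
b = [inner(A[k], X) for k in range(m)]

# Checks: inner product agrees with the trace form; X is primal feasible.
assert np.isclose(inner(X, C), np.trace(X.T @ C))
assert all(np.isclose(inner(A[k], X), b[k]) for k in range(m))
assert np.all(np.linalg.eigvalsh(X) >= -1e-9)
print("primal objective C • X =", inner(C, X))
```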
Primal-Dual Interior-Point Methods

The variables are $(X, Y, z) \in \mathcal{S}^n \times \mathcal{S}^n \times \mathbb{R}^m$. Starting from an interior point $(X^0, Y^0, z^0)$ of the feasible region, each iteration computes a search direction $(dX, dY, dz)$ toward a target point on the central path and moves to the next iterate $(X^1, Y^1, z^1)$, $(X^2, Y^2, z^2)$, ..., converging to the optimal solution $(X^*, Y^*, z^*)$.
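For completeness, the central path admits the standard characterization below (a textbook formulation stated in the notation of the standard form above; the slide itself only depicts the path):

```latex
% Points (X(\mu), Y(\mu), z(\mu)) on the central path satisfy the
% perturbed optimality conditions of (P) and (D):
\[
A_k \bullet X = b_k \;(k = 1, \ldots, m), \qquad
\sum_{k=1}^{m} A_k z_k + Y = C, \qquad
X Y = \mu I, \quad \mu > 0,
\]
% with X \succ O and Y \succ O.  As \mu \to 0 the path converges to the
% optimal (X^*, Y^*, z^*); each iteration takes a Newton-type step
% (dX, dY, dz) toward the path point with a reduced \mu.
```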
Schur Complement Matrix

Each iteration reduces the computation of the search direction to the Schur Complement Equation:

$$
B \, dz = r, \qquad \text{where } B_{ij} = A_i \bullet (X A_j Y^{-1}),
$$

and then recovers the other directions by

$$
dY = D - \sum_{j=1}^{m} dz_j A_j, \qquad
\widetilde{dX} = R Y^{-1} - X \, dY \, Y^{-1}, \qquad
dX = (\widetilde{dX} + \widetilde{dX}^{T})/2.
$$

$B$ is the Schur Complement Matrix (SCM). The two dominant steps are:

1. ELEMENTS (evaluation of the SCM)
2. CHOLESKY (Cholesky factorization of the SCM)
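The two steps can be sketched in a few lines of numpy (a toy sketch with random symmetric $A_k$ and positive definite $X$, $Y$; in SDPARA the solve is a Cholesky factorization, here replaced by a generic dense solve):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 4

def rand_sym(n):
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

def rand_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)   # symmetric positive definite

A = [rand_sym(n) for _ in range(m)]
X, Y = rand_pd(n), rand_pd(n)        # interior iterates: X, Y ≻ O
Yinv = np.linalg.inv(Y)

# ELEMENTS: B_ij = A_i • (X A_j Y^{-1}) = trace(A_i X A_j Y^{-1})
# (A_i is symmetric, so A_i^T = A_i in the trace form).
B = np.array([[np.trace(A[i] @ X @ A[j] @ Yinv) for j in range(m)]
              for i in range(m)])

# CHOLESKY step, replaced here by a generic dense solve of B dz = r.
r = rng.standard_normal(m)
dz = np.linalg.solve(B, r)
assert np.allclose(B @ dz, r)
```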

Computation time on a single processor

            Control    POP
ELEMENTS    22228      668
CHOLESKY    1593       1992
Total       23986      2713

Time unit is second; SDPA 7, Xeon 5460 (3.16GHz).
ELEMENTS and CHOLESKY occupy more than 95% of the total time.
SDPARA replaces these bottlenecks by parallel computation.

Dense & Sparse SCM

$$
B_{ij} = A_i \bullet (X A_j Y^{-1})
$$

SDPARA can select Dense or Sparse automatically:
- Fully dense SCM (100% density): Quantum Chemistry
- Sparse SCM (9.26% density): POP
Different Approaches

            Dense SCM                  Sparse SCM
ELEMENTS    Row-wise distribution      Formula-cost-based distribution
CHOLESKY    Parallel dense Cholesky    Parallel sparse Cholesky
            (ScaLAPACK)                (MUMPS)

Three formulas for ELEMENTS

Each element is $B_{ij} = A_i \bullet (X A_j Y^{-1})$. Sorting the data matrices from dense ($A_1$) to sparse ($A_m$), one of three formulas is applied:

$$
\begin{aligned}
\mathcal{F}_1\; (A_i\ \text{dense},\ A_j\ \text{dense}):\quad & U_i = X A_i Y^{-1},\quad B_{ij} = U_i \bullet A_j \\
\mathcal{F}_2\; (A_i\ \text{dense},\ A_j\ \text{sparse}):\quad & V_i = A_i Y^{-1},\quad B_{ij} = (X V_i) \bullet A_j\ \text{(only the entries at nonzeros of } A_j) \\
\mathcal{F}_3\; (A_i\ \text{sparse},\ A_j\ \text{sparse}):\quad & B_{ij}\ \text{accumulated directly from the nonzeros of } A_i\ \text{and } A_j\ \text{through } X\ \text{and } Y^{-1}
\end{aligned}
$$

All rows are independent.
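A toy numpy check of the idea (hypothetical data; only the bookkeeping of the formulas is illustrated): $\mathcal{F}_1$ forms $U_i = X A_i Y^{-1}$ completely, while a sparse-aware evaluation in the spirit of $\mathcal{F}_2$/$\mathcal{F}_3$ computes only the entries needed at the nonzeros of $A_j$; both must give the same $B_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

def rand_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

# Yinv stands directly for Y^{-1} in this sketch.
X, Yinv = rand_pd(n), rand_pd(n)

# A sparse symmetric data matrix A_j with a handful of nonzeros.
Aj = np.zeros((n, n))
Aj[0, 2] = Aj[2, 0] = 1.5
Aj[4, 4] = -0.7

Ai = rand_pd(n)                      # treat A_i as dense

# F1: form U_i = X A_i Y^{-1} completely, then B_ij = U_i • A_j.
U = X @ Ai @ Yinv
b_f1 = float(np.sum(U * Aj))

# Sparse-aware: accumulate only over the nonzeros (a, b) of A_j,
# computing each needed entry U[a, b] = X[a, :] @ Ai @ Yinv[:, b].
b_sparse = 0.0
for a, b in zip(*np.nonzero(Aj)):
    b_sparse += Aj[a, b] * (X[a, :] @ Ai @ Yinv[:, b])

assert np.isclose(b_f1, b_sparse)
```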

Row-wise distribution

Assign servers in a cyclic manner:
Row 1 → Server1, Row 2 → Server2, Row 3 → Server3, Row 4 → Server4, Row 5 → Server1, ...

- Simple idea
- Very EFFICIENT
- High scalability
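The cyclic assignment above can be stated in one line (a 0-based sketch; row and server numbering are illustrative):

```python
# Cyclic row-wise distribution: row k of the SCM goes to server (k mod p).
def rowwise_assignment(num_rows, num_servers):
    return {row: row % num_servers for row in range(num_rows)}

assign = rowwise_assignment(8, 4)
assert assign[0] == assign[4] == 0   # rows 0 and 4 share server 0
assert assign[3] == assign[7] == 3   # rows 3 and 7 share server 3
```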

Numerical Results on Dense SCM

Quantum Chemistry (m=7230, SCM=100%), middle size.
SDPARA 7.3.1, Xeon X5460, 3.16GHz x2, 48GB memory.

Servers     1        4       16
ELEMENTS    28678    7192    1826
CHOLESKY    548      131     47
Total       29700    7764    2294

Time unit is second.
ELEMENTS 15x speedup, Total 13x speedup. Very fast!!

Drawback of Row-wise applied to Sparse SCM

In a sparse SCM, most elements $B_{ij} = A_i \bullet (X A_j Y^{-1})$ involve sparse $A_i$ and $A_j$ and are cheap, while a few involve dense data matrices and are expensive, so the per-row workloads are very uneven: simple row-wise distribution is ineffective for sparse SCM.

We instead estimate the cost of each element:

$$
\mathrm{cost}(B_{ij}) = 2 \times \#A_i \times \#A_j,
$$

where $\#A$ denotes the number of nonzero elements of $A$.
Formula-cost-based distribution

[Figure: the estimated element costs (150, 135, 70, 50, 40, 30, 30, 20, 20, 10, 5, 3) are distributed over three servers so that the totals are nearly equal: Server1 = 190, Server2 = 185, Server3 = 188.]

Good load-balance.
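The slide's load balance can be reproduced with a standard greedy heuristic (my choice for this sketch; the slide does not name its algorithm): sort the estimated element costs in decreasing order and always give the next element to the currently least-loaded server.

```python
import heapq

def cost_based_distribution(costs, num_servers):
    """Greedy LPT: assign each cost (largest first) to the least-loaded server."""
    heap = [(0, s) for s in range(num_servers)]   # (load, server) pairs
    loads = [0] * num_servers
    for c in sorted(costs, reverse=True):
        load, s = heapq.heappop(heap)             # least-loaded server
        loads[s] = load + c
        heapq.heappush(heap, (loads[s], s))
    return loads

# The twelve element costs shown on the slide:
costs = [150, 40, 30, 20, 135, 20, 70, 10, 50, 5, 30, 3]
loads = cost_based_distribution(costs, 3)
assert sorted(loads) == [185, 188, 190]
```

With these costs the greedy heuristic yields server totals {185, 188, 190}, the same multiset as the slide's Server1/2/3 = 190/185/188.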

Numerical Results on Sparse SCM

Control Theory (m=109,246, SCM=4.39%), middle size.
SDPARA 7.3.1, Xeon X5460, 3.16GHz x2, 48GB memory.

Servers     1       4       16
ELEMENTS    1137    296     85
CHOLESKY    4053    1386    950
Total       5284    1744    1074

Time unit is second.
ELEMENTS 13x speedup, CHOLESKY 4.7x speedup, Total 5x speedup.

Comparison with PCSDP on an SDP with Dense SCM

PCSDP was developed by Ivanov & de Klerk.

Servers    1        2        4        8       16
PCSDP      53768    27854    14273    7995    4050
SDPARA     5983     2002     1680     901     565

Time unit is second.
SDP: B.2P Quantum Chemistry (m = 7230, SCM = 100%).
Xeon X5460, 3.16GHz x2, 48GB memory.
SDPARA is 8x faster by MPI & Multi-Threading.

Comparison with PCSDP on SDPs with Sparse SCM

SDPARA handles the SCM as sparse. Only SDPARA can solve the larger size.

#sensors 1,000 (m=16450; density=1.23%)

#Servers    1       2       4       8       16
PCSDP       O.M.    1527    887     591     368
SDPARA      28.2    22.1    16.7    13.8    27.3

#sensors 35,000 (m=527096; density=6.53×10⁻³%)

#Servers    1       2       4       8       16
PCSDP       Out of Memory
SDPARA      1080    845     614     540     506

Time unit is second; O.M. = Out of Memory.

Extremely Large-Scale SDPs

16 Servers [Xeon X5670 (2.93GHz), 128GB Memory]

                 m          SCM     time
Esc32_b (QAP)    198,432    100%    129,186 second (1.5 days)

Other solvers can handle only m ≤ 40,000.
The LARGEST solved SDP in the world.


Conclusion

- Row-wise & Formula-cost-based distribution
- Parallel Cholesky factorization
- SDPARA: The fastest solver for large SDPs
- http://sdpa.sf.net/ & Online Solver

Thank you very much for your attention.