Scalable Multi-Stage Stochastic Programming


Cosmin Petra and Mihai Anitescu

Mathematics and Computer Science Division, Argonne National Laboratory

DOE Applied Mathematics Program Meeting, April 3-5, 2010

Motivation

Sources of uncertainty in complex energy systems
– Weather
– Consumer Demand
– Market prices

Applications @Argonne (Anitescu, Constantinescu, Zavala)
– Stochastic Unit Commitment with Wind Power Generation
– Energy management of Co-generation
– Economic Optimization of a Building Energy System


Stochastic Unit Commitment with Wind Power

Wind Forecast – WRF (Weather Research and Forecasting) Model
– Real-time grid-nested 24h simulation
– 30 samples require 1h on 500 CPUs (Jazz@Argonne)


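$$
\begin{aligned}
\min\ \text{COST} ={}& \frac{1}{N_s}\sum_{s\in\mathcal{S}}\ \sum_{j\in\mathcal{N}}\ \sum_{k\in\mathcal{T}} \left(c^{p}_{sjk} + c^{u}_{jk} + c^{d}_{jk}\right)\\
\text{s.t.}\quad & \sum_{j\in\mathcal{N}} p_{sjk} + \sum_{j\in\mathcal{N}_{wind}} p^{wind}_{sjk} = D_k, && s\in\mathcal{S},\ k\in\mathcal{T},\\
& \sum_{j\in\mathcal{N}} \bar{p}_{sjk} + \sum_{j\in\mathcal{N}_{wind}} p^{wind}_{sjk} \ge D_k + R_k, && s\in\mathcal{S},\ k\in\mathcal{T},\\
& \text{ramping constr., min. up/down constr.}
\end{aligned}
$$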

Slide courtesy of V. Zavala & E. Constantinescu

Economic Optimization of a Building Energy System

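$$
\min\ \text{COST} = \frac{1}{S}\sum_{j=1}^{S}\int_{t}^{t+T}\left[C_{elec}(\tau)\,\varphi^{j}_{c,elec}(\tau) + C_{elec}(\tau)\,\varphi^{j}_{h,elec}(\tau) + C_{gas}\,\varphi^{j}_{h,gas}(\tau)\right]d\tau
$$

s.t., for each scenario $j \in \{1,\dots,S\}$ and $\tau \in [t, t+T]$:
– energy balance for the indoor temperature $T_I^j(\tau)$, driven by the heating and cooling inputs and by the heat exchange with the wall surface $T_W^j(\tau,0)$, with initial condition $T_I^j(0) = T_I^k$
– heat conduction in the wall for $T_W^j(\tau,x)$, $x \in [0, L]$, with boundary conditions at $x = 0$ (indoor side) and at $x = L$ (ambient temperature $T_A^j(\tau)$), and initial condition $T_W^j(0,x) = T_W^k(x)$
– comfort bounds $T_I^{min} \le T_I^j(\tau) \le T_I^{max}$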

Proactive management - temperature forecasting & electricity prices

Minimize daily energy costs

[Figure: relative cost (%) versus forecast horizon (1, 3, 6, 9, 12, 16, 24 hr).]

Slide courtesy of V. Zavala

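Optimization under Uncertainty

Two-stage stochastic programming with recourse (“here-and-now”)

Continuous problem:

$$
\begin{aligned}
\min_{x_0}\ & f_0(x_0) + \mathbb{E}\left[\min_{x} f(x,\omega)\right]\\
\text{subj. to}\quad & A_0 x_0 = b_0,\\
& A(\omega)x_0 + B(\omega)x = b(\omega),\\
& x_0 \ge 0,\quad x(\omega) \ge 0,
\end{aligned}
$$

with random data $\xi(\omega) := (A(\omega), B(\omega), b(\omega), Q(\omega), c(\omega))$.

Continuous → discrete: Sampling (M samples $\xi_1, \xi_2, \dots, \xi_S$)

Discrete → continuous: Inference Analysis

Sample average approximation (SAA):

$$
\begin{aligned}
\min_{x_0, x_1, \dots, x_S}\ & f_0(x_0) + \frac{1}{S}\sum_{i=1}^{S} f_i(x_i)\\
\text{subj. to}\quad & A_0 x_0 = b_0,\\
& A_k x_0 + B_k x_k = b_k,\\
& x_0 \ge 0,\quad x_k \ge 0,\qquad k = 1,\dots,S.
\end{aligned}
$$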
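A toy illustration of SAA, not taken from the slides: a newsvendor-style two-stage problem where the first-stage decision x ≥ 0 costs `cost` per unit and the recourse pays `penalty` per unit of unmet demand. The function names, the cost values and the uniform demand distribution are assumptions chosen only to make the sketch self-contained.

```python
import random


def saa_objective(x, demands, cost=1.0, penalty=3.0):
    """Sample-average estimate of cost*x + E[penalty * max(d - x, 0)]."""
    shortfall = sum(max(d - x, 0.0) for d in demands) / len(demands)
    return cost * x + penalty * shortfall


def solve_saa(demands, cost=1.0, penalty=3.0):
    """Minimize the SAA objective over x >= 0.

    The objective is piecewise linear and convex in x, so a minimizer
    can be found among 0 and the sampled demand values.
    """
    candidates = [0.0] + sorted(demands)
    return min(candidates, key=lambda x: saa_objective(x, demands, cost, penalty))


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [rng.uniform(50.0, 150.0) for _ in range(1000)]  # hypothetical demand model
    x0 = solve_saa(samples)
    print("first-stage decision:", x0)
    print("SAA objective value:", saa_objective(x0, samples))
```

Increasing the number of samples makes the discrete problem larger but brings its solution closer to that of the continuous problem, which is the trade-off that motivates scalable solvers.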

Multi-stage SAA SP Problems – Scenario formulation

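Scenario tree: root 0; children of 0: 1, 4; children of 1: 2, 3; children of 4: 5, 6, 7.

$$
\begin{aligned}
\min\ & \sum_{i=0}^{7}\left(\tfrac{1}{2}x_i^T Q_i x_i + c_i^T x_i\right)\\
\text{s.t.}\quad
& A_0 x_0 = b_0\\
& A_1 x_0 + B_1 x_1 = b_1\\
& A_2 x_1 + B_2 x_2 = b_2\\
& A_3 x_1 + B_3 x_3 = b_3\\
& A_4 x_0 + B_4 x_4 = b_4\\
& A_5 x_4 + B_5 x_5 = b_5\\
& A_6 x_4 + B_6 x_6 = b_6\\
& A_7 x_4 + B_7 x_7 = b_7\\
& x_0 \ge 0,\ x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0,\ x_4 \ge 0,\ x_5 \ge 0,\ x_6 \ge 0,\ x_7 \ge 0
\end{aligned}
$$

– Depth-first traversal of the scenario tree
– Nested half-arrow shaped Jacobian
– Block separable obj. func.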

Linear Algebra of Primal-Dual Interior-Point Methods

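Convex quadratic problem

$$
\min\ \tfrac{1}{2}x^T Q x + c^T x \quad \text{subj. to}\quad Ax = b,\ x \ge 0
$$

IPM Linear System

$$
\begin{bmatrix} Q & A^T\\ A & 0\end{bmatrix}\begin{bmatrix} x\\ y\end{bmatrix} = \text{rhs}
$$

Two-stage SP: arrow-shaped linear system (via a permutation)

$$
\begin{bmatrix}
H_1 & B_1^T & & & & & 0 & 0\\
B_1 & 0 & & & & & A_1 & 0\\
& & H_2 & B_2^T & & & 0 & 0\\
& & B_2 & 0 & & & A_2 & 0\\
& & & & \ddots & & \vdots & \vdots\\
& & & & H_S & B_S^T & 0 & 0\\
& & & & B_S & 0 & A_S & 0\\
0 & A_1^T & 0 & A_2^T & \cdots\ 0 & A_S^T & H_0 & A_0^T\\
0 & 0 & 0 & 0 & \cdots\ 0 & 0 & A_0 & 0
\end{bmatrix}
$$

Multi-stage SP: nested arrow-shaped linear system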

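The Direct Schur Complement Method (DSC)

Uses the arrow shape of H

$$
\begin{bmatrix}
H_1 & & & & G_1^T\\
& H_2 & & & G_2^T\\
& & \ddots & & \vdots\\
& & & H_S & G_S^T\\
G_1 & G_2 & \cdots & G_S & H_0
\end{bmatrix}
=
\begin{bmatrix}
L_1 & & & & \\
& L_2 & & & \\
& & \ddots & & \\
& & & L_S & \\
L_{10} & L_{20} & \cdots & L_{S0} & L_c
\end{bmatrix}
\begin{bmatrix}
D_1 & & & & \\
& D_2 & & & \\
& & \ddots & & \\
& & & D_S & \\
& & & & D_c
\end{bmatrix}
\begin{bmatrix}
L_1^T & & & & L_{10}^T\\
& L_2^T & & & L_{20}^T\\
& & \ddots & & \vdots\\
& & & L_S^T & L_{S0}^T\\
& & & & L_c^T
\end{bmatrix}
$$

Solving Hz=r

Implicit factorization

$$
L_i D_i L_i^T = H_i,\quad L_{i0} = G_i L_i^{-T} D_i^{-1},\quad i = 1,\dots,S,
$$
$$
C = H_0 - \sum_{i=1}^{S} G_i H_i^{-1} G_i^T,\quad L_c D_c L_c^T = C.
$$

Forward substitution

$$
w_i = L_i^{-1} r_i,\quad i = 1,\dots,S,\qquad w_0 = L_c^{-1}\left(r_0 - \sum_{i=1}^{S} L_{i0} w_i\right)
$$

Diagonal solve

$$
v_i = D_i^{-1} w_i,\quad i = 0,\dots,S \quad (\text{with } D_0 = D_c)
$$

Back substitution

$$
z_0 = L_c^{-T} v_0,\qquad z_i = L_i^{-T}\left(v_i - L_{i0}^T z_0\right),\quad i = 1,\dots,S.
$$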
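The elimination order of DSC can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions: the blocks H_i and H_0 are random symmetric positive definite matrices (real IPM blocks are symmetric indefinite), and `np.linalg.solve` stands in for the LDL^T factorizations of the slide. The helper names are hypothetical.

```python
import numpy as np


def dsc_solve(H_blocks, G_blocks, H0, r_blocks, r0):
    """Solve the arrow system
         H_i z_i + G_i^T z_0     = r_i   (i = 1..S)
         sum_i G_i z_i + H_0 z_0 = r_0
    by eliminating the scenario blocks first (Schur complement)."""
    C = H0.copy()
    rhs0 = r0.copy()
    for H, G, r in zip(H_blocks, G_blocks, r_blocks):
        # Each scenario contributes independently: this loop is the parallel part.
        C -= G @ np.linalg.solve(H, G.T)
        rhs0 -= G @ np.linalg.solve(H, r)
    z0 = np.linalg.solve(C, rhs0)  # dense first-stage solve
    z = [np.linalg.solve(H, r - G.T @ z0)
         for H, G, r in zip(H_blocks, G_blocks, r_blocks)]
    return z, z0


def random_spd(n, rng):
    """Random symmetric positive definite block (assumption for the demo)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)


rng = np.random.default_rng(0)
S, ni, n0 = 4, 5, 3
H_blocks = [random_spd(ni, rng) for _ in range(S)]
G_blocks = [0.1 * rng.standard_normal((n0, ni)) for _ in range(S)]
H0 = random_spd(n0, rng)
r_blocks = [rng.standard_normal(ni) for _ in range(S)]
r0 = rng.standard_normal(n0)

# Dense reference: assemble the full arrow matrix and solve it directly.
K = np.zeros((S * ni + n0, S * ni + n0))
for i in range(S):
    s = slice(i * ni, (i + 1) * ni)
    K[s, s] = H_blocks[i]
    K[s, S * ni:] = G_blocks[i].T
    K[S * ni:, s] = G_blocks[i]
K[S * ni:, S * ni:] = H0
z_dense = np.linalg.solve(K, np.concatenate(r_blocks + [r0]))

z, z0 = dsc_solve(H_blocks, G_blocks, H0, r_blocks, r0)
z_dsc = np.concatenate(z + [z0])
print("max difference vs dense solve:", np.max(np.abs(z_dsc - z_dense)))
```

Only the accumulation of C and of the reduced right-hand side needs communication between scenarios; the first-stage solve with C is the serial bottleneck.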

High performance computing with DSC


Gondzio (OOPS): 6 stages, 1 billion variables

Zavala et al., 2007 (in IPOPT)

Our experiments (PIPS) – strong scaling is investigated

Building energy system
• Almost linear scaling

Unit commitment
• Relaxation solved
• Largest instance
  • 28.9 million variables
  • 1000 cores

12 x

on Fusion @ Argonne

Scalability of DSC


Unit commitment 76.7% efficiency

but not always the case

Large number of 1st stage variables: 38.6% efficiency

on Fusion @ Argonne

Preconditioned Schur Complement (PSC)

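$$
L_i D_i L_i^T = H_i,\quad L_{i0} = G_i L_i^{-T} D_i^{-1},\quad i = 1,\dots,N
$$
$$
C = H_0 - \sum_{i=1}^{N} G_i H_i^{-1} G_i^T
$$
$$
L_M D_M L_M^T = M \quad \text{(separate process)}
$$
$$
w_i = L_i^{-1} r_i,\quad i = 1,\dots,N,\qquad \tilde{r}_0 = r_0 - \sum_{i=1}^{N} L_{i0} w_i
$$
$$
z_0 = \text{Krylov}(C, M, \tilde{r}_0)
$$
$$
v_i = D_i^{-1} w_i,\qquad z_i = L_i^{-T}\left(v_i - L_{i0}^T z_0\right),\quad i = 1,\dots,N
$$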

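The Stochastic Preconditioner

The exact structure of C is

$$
C = \begin{bmatrix} Q_0 + \dfrac{1}{S}\displaystyle\sum_{i=1}^{S} A_i^T\left(B_i Q_i^{-1} B_i^T\right)^{-1} A_i & A_0^T\\ A_0 & 0\end{bmatrix}.
$$

IID subset of n scenarios:

$$
\mathcal{K} = \{k_1, k_2, \dots, k_n\}
$$

The stochastic preconditioner (Petra & Anitescu, 2010)

$$
S_n = Q_0 + \frac{1}{n}\sum_{i=1}^{n} A_{k_i}^T\left(B_{k_i} Q_{k_i}^{-1} B_{k_i}^T\right)^{-1} A_{k_i}.
$$

For C use the constraint preconditioner (Keller et al., 2000)

$$
M = \begin{bmatrix} S_n & A_0^T\\ A_0 & 0\end{bmatrix}.
$$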
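A small numerical sketch of the idea: average the scenario terms A_i^T (B_i Q_i^{-1} B_i^T)^{-1} A_i over all S scenarios and over a random subset of n of them, then look at how far the eigenvalues of S_n^{-1} S_S are from 1. The random problem data, the dimensions and the helper names are assumptions for illustration only; they are not the test problems of the talk.

```python
import numpy as np


def scenario_term(Q, A, B):
    """A^T (B Q^{-1} B^T)^{-1} A for one scenario."""
    BQB = B @ np.linalg.solve(Q, B.T)
    return A.T @ np.linalg.solve(BQB, A)


def averaged_block(Q0, terms, idx):
    """Q0 + (1/|idx|) * sum of the selected scenario terms."""
    return Q0 + sum(terms[k] for k in idx) / len(idx)


def eig_spread(S_n, S_S):
    """Largest distance from 1 among the eigenvalues of S_n^{-1} S_S."""
    lam = np.linalg.eigvals(np.linalg.solve(S_n, S_S))
    return float(np.max(np.abs(lam - 1.0)))


def make_terms(num_scen, p, m, n, rng):
    """Hypothetical random data: each scenario perturbs a common A and B."""
    A_bar = rng.standard_normal((m, p))
    B_bar = rng.standard_normal((m, n))
    terms = []
    for _ in range(num_scen):
        Q = np.diag(rng.uniform(1.0, 2.0, size=n))
        A = A_bar + 0.1 * rng.standard_normal((m, p))
        B = B_bar + 0.1 * rng.standard_normal((m, n))
        terms.append(scenario_term(Q, A, B))
    return terms


rng = np.random.default_rng(0)
num_scen, p = 200, 4
terms = make_terms(num_scen, p, m=3, n=6, rng=rng)
Q0 = np.eye(p)
S_S = averaged_block(Q0, terms, range(num_scen))

for n_sub in (5, 20, 80):
    idx = rng.choice(num_scen, size=n_sub, replace=False)
    S_n = averaged_block(Q0, terms, idx)
    print(n_sub, "scenarios: max |lambda - 1| =", eig_spread(S_n, S_S))
```

The printed spread is a random quantity; the claim of the talk is probabilistic, so a single run only gives an impression of the clustering.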

The “Ugly” Unit Commitment Problem

DSC on P processes vs PSC on P+1 processes

Optimal use of PSC – linear scaling

Factorization of the preconditioner cannot be hidden anymore.

• 120 scenarios

Quality of the Stochastic Preconditioner

“Exponentially” better preconditioning (Petra & Anitescu 2010)

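$$
\Pr\left(\left|\lambda\left(S_n^{-1} S_S\right) - 1\right| \ge \varepsilon\right) \le 2p^4 \exp\left(-\frac{n\varepsilon^2}{2p^4 L^2 \|S_S\|_{max}^2}\right)
$$

where

$$
S_n = Q_0 + \frac{1}{n}\sum_{i=1}^{n} A_{k_i}^T\left(B_{k_i} Q_{k_i}^{-1} B_{k_i}^T\right)^{-1} A_{k_i},\qquad
S_S = Q_0 + \frac{1}{S}\sum_{i=1}^{S} A_i^T\left(B_i Q_i^{-1} B_i^T\right)^{-1} A_i.
$$

Proof: Hoeffding inequality

Assumptions on the problem’s random data (not restrictive)
1. Boundedness
2. Uniform full rank of $A(\omega)$ and $B(\omega)$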

Quality of the Constraint Preconditioner

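With

$$
M = \begin{bmatrix} S_n & A_0^T\\ A_0 & 0\end{bmatrix},\qquad
C = \begin{bmatrix} S_S & A_0^T\\ A_0 & 0\end{bmatrix},\qquad A_0 \in \mathbb{R}^{r\times p},
$$

$M^{-1}C$ has an eigenvalue 1 with order of multiplicity $2r$.

The rest of the eigenvalues satisfy

$$
0 < \lambda_{min}\left(S_n^{-1} S_S\right) \le \lambda\left(M^{-1} C\right) \le \lambda_{max}\left(S_n^{-1} S_S\right).
$$

Proof: based on Bergamaschi et al., 2004.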

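The Krylov Methods Used for $C z_0 = r_0$

$$
C z_0 = r_0:\qquad \begin{bmatrix} S_S & A_0^T\\ A_0 & 0\end{bmatrix}\begin{bmatrix} x_0\\ y_0\end{bmatrix} = \begin{bmatrix} r_{0,1}\\ r_{0,2}\end{bmatrix}
$$

BiCGStab using constraint preconditioner M

Preconditioned Projected CG (PPCG) (Gould et al., 2001)
– Preconditioned projection onto $\text{Ker}\,A_0$: $P = Z_0\left(Z_0^T S_n Z_0\right)^{-1} Z_0^T$
– Does not compute the basis $Z_0$ for $\text{Ker}\,A_0$. Instead, $g = Pr$ is computed from
$$
\begin{bmatrix} S_n & A_0^T\\ A_0 & 0\end{bmatrix}\begin{bmatrix} g\\ u\end{bmatrix} = \begin{bmatrix} r\\ 0\end{bmatrix}.
$$
– $y_0 = \left(A_0 A_0^T\right)^{-1} A_0\left(r_{0,1} - S_S x_0\right)$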
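The statement that the projection can be applied without a null-space basis is easy to check numerically. The sketch below compares the solve with the constraint-preconditioner matrix against an explicit basis Z_0 obtained from an SVD; the random S_n and A_0, the dimensions and the function names are assumptions made for the check.

```python
import numpy as np


def project_via_kkt(S_n, A0, r):
    """Return g = P r by solving [[S_n, A0^T], [A0, 0]] [g; u] = [r; 0]."""
    p, m = S_n.shape[0], A0.shape[0]
    K = np.block([[S_n, A0.T], [A0, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([r, np.zeros(m)]))
    return sol[:p]


def project_via_basis(S_n, A0, r):
    """Reference: g = Z (Z^T S_n Z)^{-1} Z^T r with an explicit basis Z of Ker A0."""
    m = A0.shape[0]
    _, _, Vt = np.linalg.svd(A0)  # full_matrices=True by default, so Vt is p x p
    Z = Vt[m:].T                  # spans Ker A0 when A0 has full row rank
    return Z @ np.linalg.solve(Z.T @ S_n @ Z, Z.T @ r)


rng = np.random.default_rng(2)
p, m = 6, 2
W = rng.standard_normal((p, p))
S_n = W @ W.T + p * np.eye(p)     # symmetric positive definite (assumption)
A0 = rng.standard_normal((m, p))  # full row rank with probability one
r = rng.standard_normal(p)

g_kkt = project_via_kkt(S_n, A0, r)
g_basis = project_via_basis(S_n, A0, r)
print("difference between the two projections:", np.max(np.abs(g_kkt - g_basis)))
print("A0 @ g (should vanish):", A0 @ g_kkt)
```

Avoiding Z_0 matters because a null-space basis is generally dense, while the saddle-point matrix can reuse the factorization already computed for M.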

Performance of the preconditioner

Eigenvalues clustering & Krylov iterations

Affected by the well-known ill-conditioning of IPMs.


1

01

0

( ) , where and S

( ( )) ( ) ) ) )( ( ( )(

n N N

T T

S

S Q A D

S

Q A

S

BD B

E

A Parallel Interior-Point Solver for Stochastic Programming (PIPS)

Convex QP SAA SP problems

Input: users specify the scenario tree

Object-oriented design based on OOQP

Linear algebra: tree vectors, tree matrices, tree linear systems

Scenario based parallelism
– tree nodes (scenarios) are distributed across processors
– inter-process communication based on MPI
– dynamic load balancing

Mehrotra predictor-corrector IPM


Tree Linear Algebra – Data, Operations & Linear Systems

Data
– Tree vector: b, c, x, etc
– Tree symmetric matrix: Q
– Tree general matrix: A

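Tree-structured QP on the scenario tree $\mathcal{T}$ (root 0; children of 0: 1, 4; children of 1: 2, 3; children of 4: 5, 6, 7):

$$
\min\ \tfrac{1}{2}x^T Q x + c^T x \quad \text{subj. to}\quad Ax = b,\ x \ge 0
$$

$\mathcal{T}^* = \mathcal{T}\setminus\{0\}$; $a(6) = 4$ (ancestor map); $C(1) = \{2,3\}$ (children map)

Operations

$$
(Qx)_n = Q_n x_n,\qquad u^T v = \sum_{n\in\mathcal{T}} u_n^T v_n,\qquad n \in \mathcal{T}
$$
$$
(Ax)_n = A_n x_{a(n)} + B_n x_n,\qquad (A^T y)_n = B_n^T y_n + \sum_{c\in C(n)} A_c^T y_c,\qquad n \in \mathcal{T}^*
$$

(at the root, $A_0$ acts on $x_0$ itself)

Linear systems: for each non-leaf node a two-stage problem is solved via Schur complement methods as previously described.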
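The tree operations can be sketched with dictionaries keyed by node. The ancestor map below follows the 8-node example (a(6) = 4, C(1) = {2, 3}); placing node 7 under node 4 is read off the distribution example and should be treated as an assumption, as should the block sizes and function names. The check at the end verifies the adjoint identity (Ax)^T y = x^T (A^T y).

```python
import numpy as np

# Ancestor map of the 8-node example tree; the root has no ancestor.
ANCESTOR = {0: None, 1: 0, 2: 1, 3: 1, 4: 0, 5: 4, 6: 4, 7: 4}


def children(n):
    """Children map C(n), derived from the ancestor map."""
    return [c for c, a in ANCESTOR.items() if a == n]


def tree_A_times_x(A, B, x):
    """(Ax)_0 = A_0 x_0 and (Ax)_n = A_n x_{a(n)} + B_n x_n for n != 0."""
    y = {0: A[0] @ x[0]}
    for n, a in ANCESTOR.items():
        if a is not None:
            y[n] = A[n] @ x[a] + B[n] @ x[n]
    return y


def tree_At_times_y(A, B, y):
    """Transpose product: the node's own block plus contributions of its children."""
    out = {}
    for n in ANCESTOR:
        own = A[0].T @ y[0] if n == 0 else B[n].T @ y[n]
        out[n] = own + sum(A[c].T @ y[c] for c in children(n))
    return out


def tree_dot(u, v):
    """Tree inner product: sum of the node-wise dot products."""
    return float(sum(u[n] @ v[n] for n in ANCESTOR))


rng = np.random.default_rng(1)
m, nx = 2, 3  # rows and columns per node (illustrative sizes)
A = {n: rng.standard_normal((m, nx)) for n in ANCESTOR}
B = {n: rng.standard_normal((m, nx)) for n in ANCESTOR if n != 0}
x = {n: rng.standard_normal(nx) for n in ANCESTOR}
y = {n: rng.standard_normal(m) for n in ANCESTOR}

lhs = tree_dot(tree_A_times_x(A, B, x), y)
rhs = tree_dot(x, tree_At_times_y(A, B, y))
print("adjoint check:", lhs, rhs)
```

Each node only touches its own data, its ancestor's vector and its children's vectors, which is what makes a node-wise distribution across processes natural.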

Parallelization – Tree Distribution

The tree is distributed across processes. Example: 3 processes

Dynamic load balancing of the tree
– Number partitioning problem → graph partitioning → METIS

CPU 1: nodes 0, 1, 2, 3
CPU 2: nodes 0, 4, 5, 6
CPU 3: nodes 0, 4, 7
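To make the number partitioning view concrete, here is a greedy heuristic (heaviest item first, onto the currently lightest process). It is a stand-in for illustration only: it is not METIS and not necessarily what PIPS does, and the item names and weights are hypothetical per-subtree work estimates.

```python
import heapq


def greedy_partition(weights, num_procs):
    """Longest-processing-time heuristic for number partitioning.

    weights: dict mapping an item (e.g. a subtree) to its estimated work.
    Returns (assignment, loads), both keyed by process index.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, process)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    for item, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)  # lightest process so far
        assignment[p].append(item)
        heapq.heappush(heap, (load + w, p))
    loads = {p: sum(weights[i] for i in assignment[p]) for p in assignment}
    return assignment, loads


if __name__ == "__main__":
    work = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 2}  # hypothetical work estimates
    assignment, loads = greedy_partition(work, 2)
    print(assignment)
    print(loads)
```

A graph partitioner additionally accounts for the communication between neighboring tree nodes, which this purely weight-based heuristic ignores.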

Conclusions

The DSC method offers good parallelism in an IPM framework.

The PSC method improves the scalability.

PIPS – solver for SP problems.

PIPS is ready for larger problems: 100,000 cores.


Future work

New math / stat
– Asynchronous optimization
– SAA error estimate

New scalable methods for a more efficient software
– Better interconnect between (iterative) linear algebra and sampling
  • importance-based preconditioning
  • multigrid decomposition
– Target: emerging exa architectures

PIPS
– IPM hot-start, parallelization of the nodes
– Ensure compatibility with other paradigms: NLP, conic progr., MILP/MINLP solvers
– A ton of other small enhancements

Ensure computing needs for important applications
– Unit commitment with transmission constraints & market integration (Zavala)


Thank you for your attention!

Questions?
