ProxSARAH: An E cient Algorithmic Framework for Stochastic ...
STOCHASTIC PROGRAMMING: Algorithmic Challenges Plenary ...
Transcript of STOCHASTIC PROGRAMMING: Algorithmic Challenges Plenary ...
ING:
es
heConference
of Arizona
1
STOCHASTIC PROGRAMM
Algorithmic Challeng
Plenary Lecture at tThe Nineth International SP
(August 28, 2001)
Suvrajeet SenSIE Department,University
Tucson, AZ
N CRIME
with several-time col-
AZ)
2
COLLABORATORS: PARTNERSI
Today’s talk is based on work individuals, especially my longleague Julia Higle (AZ). Also:Michael Casey (AZ)Guglielmo Lulli (Italy and AZ)Lewis Ntaimo (AZ)Brenda Rayco (Belgium and Yijia Xu (AZ)Lihua Yu (AZ)
from Con-
orithmsn
llenges in
3
This Presentation: Transitionstinuous to Discrete
1. Lessons from successful alg• Convexity and Decompositio• Special structure• Sampling• Inexact “solves”2. “Informal” exploration of chamulti-stage problems
ria and esti-
odelsnto Sto-
lyhedral
4
• Scenario trees, stopping critemates of solution quality
• Real-time Algorithms• Multi-granularity multi-stage m3. “Less Informal” exploration ichastic IP
• Literature• Two Stage SIP: Stochastic PoCombinatorics
• Multi-stage SIP4. Conclusions
Algorithmsms)
ion:
hapedRegularized,
methods pro-position-
5
1. Lessons from Successful (for Continuous Proble
1.1 Convexity and Decomposit
• Benders’ Decomposition (L-sMethod), and its extensions toStochastic and Interior Point vide resource directive decomcoordination approaches.
ann, Goffin,, Wets and
ns provides
position pro- (Augmented
6
Work of Birge, Dantzig, GassmHigle, Ruszczynski, Sen, Vialothers.
Convexity of the value functiothe justification
• Scenario Aggregation/Decomvides a certain price directiveLagrangian-type) approach.
nski, Wets
gain provides
tic linear pro-
lue functionutations.
7
Work of Rockafellar, Ruszczyand others.
Duality and hence convexity athe basis
1.2 Special structure: Stochasgramming
Polyhedral structure of the vaof LPs help streamline comp
ms withy scenarios),ite. This isposition (see
show thatimal solution with finite
8
It is well known that for problefinite support (i.e. finitely manBenders’ decomposition is finalso true for regularized decomwork of Kiwiel, Ruszczynski)
Homem de Mello and Shapirosampling also leads to an optin finitely many steps (for SLPsupport).
e Stochasticsses LPr fixed
f scenarios
]
9
Work with Higle shows how thDecomposition method by-pa“solves” by a matrix update forecourse problems
1.3 Sampling: Large number o
Minx X∈
f x( ) := E h x ω,( )[
te, algorith-, wherek
areinistic selec-
cenarios
x( )
10
• Since is difficult to evaluamic schemes replace byis an iteration counter.
• For deterministic algorithmsobtained by the same determ
tion of scenarios:
For stochastic algorithms are obtained by sampling s
f x( )f x( ) f k
f k
ωt{ }t 1=N
f k
: Work ofasiev etc.
ptimizationmple Averagetive Optimi-
ample mean another
h larger sam-d so on.cept:
11
• Stochastic-Quasi GradientsErmoliev, Gaivoronski, Ury
• Successive Sample Mean O(Stochastic Counterpart/SaApproximation, “Retrospeczation” in Simulation).
• The approach: create one sfunction, optimize it; createsample mean function (witple size), optimize that, an
This is really a “meta”-con
tion is a SPerated in one
.approximatesby one “cut”“cut” progres-ple meaneasing the
reduce vari-pdates.
12
- each sample mean optimiza- does not use information geniteration for subsequent ones• Stochastic Decomposition
the sample mean function in each iteration and each sively approximates a samfunction resulting from incrsample size.
• Common random numbersance, and allow recursive u
blems: By use a sto-ithm (a laSSD, will be Rayco’srvationiques canaster dramti-
13
• Sampling in Multi-stage Prosolving adual SLP, one canchastic cutting plane algorSD). This algorithm, calleddiscussed in detail inBrendapresentation. A brief obsethough ...aggregation technreduce the growth of the mcally.
onlinear Pro-mmingion methodlementationsature.D provides
14
1.4 Inexact “solves”
• Not as common in SP as in Ngramming and Integer Progra
• In SP, the Scenario Aggregatallows inexact solves, but imphave typically not used this fe
• The “argmax” procedure in S“inexact solves”
decomposi-ers Decom-
is students)roblems
rtant for SIPms are IP.
15
• A recent version of Benders’ tion, known as Abridged Bendposition (Work of Birge and hallows inexact solves in subp
This feature is extremely impoalgorithms since the subproble
allenges
Observation
ωt
16
2. “Informal” Exploration of Chfor Multi-stage SP
ωt 1–
xt 1–
Data
xtxt
Decision
for 1,...,t-1
.
on problem is
ω, t 1– )ωt 1– )]ωt 1–, )
]
17
• For t = 2,...,T, define functions
Assuming , the decisi
f t xt 1– ωt 1–,( ) Min ct xt xt 1–,(E f t 1+ xt ωt,([
s.t. xt Xt xt 1–(∈+
=
f T 1+ 0=
Min c1 x1( ) E f 2 x1 ω1,( )[x1 X1∈
+
iscrete approxi-satisfy some stochasticli, Dempster,
Wallace)timizationrest” scenario provides the
18
2.1 Scenario Trees• Current approaches seek a dmation (of a given size) whichproperties associated with theprocess. (The work of ConsigDupacova, Hoyland, Mulvey,
• Pflug develops a nonlinear opproblem which seeks the “neatree of a specified size whichbest approximation.
uence of treesich provideees?Frauen-rovides a
ility metricsort) appearsrowe-Kuska,
19
• How could one develop a seq(of the stochastic process) whsolutions with certain guarantdorfer’s Barycentric method ppartial answer.
• Approximations using probab(for problems with finite suppto be promising (Dupacova, GRomisch)
e original SPation. What is stage solu-
20
Suppose one approximates thusing some discrete approximthe quality of the resulting firsttion?... Ouput Analysis
lems.”ime problemsnts of compu-
e “simple-ajectory plan-ntinuous ran-d at futureide.
21
2.2 Real-time Algorithms• “Nested simple recourse prob- Recourse decisions in real-tmust be made within constraitational time.- Models consist of multi-stagrecourse decisions.” Such “trning models” may warrant codom variables (e.g. wind speelocations) on the right-hand s
l problemsbile automa-
but onlyf others.ted with thee path plan-age real-time
imilar appli-wegt et al.
22
• Real-time decision and contro- Example: a collection of motons know their own location,know approximate locations o- Location information is updapassage of time. Collision-frening problems lead to multi-stscheduling problems.- In AZ with Ntaimo and Xu. Scations by W. Powell, A. Kley
e Models
g
t.
23
2.3 Multi-granularity Multi-stag
Schedulin Maint.
Dispatch
Utility
Markets
Gen.
Dis
s operations
nt. In oureed to inin month
n-aids thate quite well.isions may bement) plan-
24
• Decisions of one group affectof another.
• Modeling time-lags is importaexample, power contracts agrmontht, will affect production t+s.
• Each group may have decisiocapture a particular time-scalFor instance, dispatching decdaily, generation (unit commit
ek, powery ahead” to
ch decisions?
power com-
25
ning may happen week-by-wecontracts may range from “da“six-months” ahead.
How should we coordinate su
(Work with Lulli, Yu, and an AZpany)
into SIP:roblems
es for continu-lgorithms forsen
roblems
26
3. “Less Informal” ExplorationThe Transition to Discrete P
• Our view is based on successous problems ... successful aSIP problems will ultimately u- Convexity and Decompositio- Special structure- Inexact “solves”- Sampling for Large Scale P
urseeveld,
ell solved)
oblems Vlerk
S)
27
3.1 Literature
• Two stage simple integer recoSeries of papers by Klein-HanStougie and van der Vlerk (w
• Two stage 0-1 ProblemsLaporte and Louveaux
• Two stage General Integer PrSchultz, Stougie and van derHemmeke and Schultz (SPEP
roblems
is (SPEPS)
Sen, Higle,
28
• Cutting planes for two stage pCaroe (dissertation)Caroe and TindAhmed, Tawarmalani, SahinidSen and Higle (SPEPS)Sherali and Fraticelli
• Multi-stage ProblemsBirge and Dempster (see alsoand Birge)Lokketangen and WoodruffCaroe and Schultz
hastic Polyhe-
binatorics
ree in B&B
tic versions (of
29
3.1 Two Stage Problems: Stocdral Combinatorics
What role does polyhedral complay in deterministic IP?- Reduces size of the search t
One should expect the stochascuts) to play the same role
SIP
all s
30
Consider the following 2-stage
Min cT
x Σs psgsT
ys
s.t Ax b
Tsx W ys ωs for
x Z+
n1ys Z+
n2∈,∈
≥+
≥
+
P;
ch non-inte-
cuts
ith Determin-
31
• Caroe’s approach:a) Solve SLP relaxation of SIb) If solution is integer, stop;c) Else, develop a “cut” for eager pair .d) Update the SIP by adding - Repeat from a)
• Observations:- Note the close connection wistic IP
x ys,( )
rmittedc.)us L-intained and
lved using L-
ct” cuts for
32
- Various different cuts are pe(Gomory, “Lift-and-Project” et
• Each cut involves only . Thshaped structure of SIP is mathe SLP relaxation can be soshaped method.
• Caroe suggests “lift-and-projebinary problems.
x ys,( )
stages.P relaxationch secondecessary. If
essary, stop.approxima-
repeat.
33
• Disjunctive Decomposition (D2)Algorithm (with Higle):
• Decompose the problem into twoa) Given a first stagex, solve an Lof the second, andstrengthen eastage convexification whenever nno further strengthening is necb) Convexify the value function tion of each second stage IP.c) Update the master program;
e(Linearllows all sce-
oefficients(C3
urse LP
P method forministic sce-P decompo-
34
• Observations:Special structur inequalities, fixed recourse) a
narios to share common cut-cTheorem).
• Cut generation is simple reco
• Does not reduce to a known Iproblems with only one deternario, that is, this is also anew Isition method.
c MILP
bles are 0-1 extreme
l variables”one with
35
Convergence for 0-1 StochastiAssumptions• Complete recourse• All second stage integer varia• First stage feasibility requirespoints ofX as in 0-1 problems
• Maintain all cuts in• If there are multiple “fractionato form a disjunction, choose
Wk
matrix fromhich the sameation
methodm.
d-Cut ands are currentlyd Sherali)
36
smallest index, and recall thethe most recent iteration at wvariable was used for cut form
Under these assumptions, theresults in a convergent algorith
Extensions to allow Branch-ancontinuous first stage decisionunderway. (Work with Higle an
D2
ecial structure
cial structure,d Long;
ailable for
37
3.2 Multi-stage SIP
Even more important to use spfor realistic problemsFor examples of the use of spesee papers by Takriti, Birge anNowak and Romisch.
Very few general algorithms avthis class of problems to date.
Branch-and-ds are calcu-tion Kiwiel’s
for 2-stageopment is.
...
38
• Caroe and Schultz propose aBound method in which bounlated using Lagrangian relaxa- Dual iterates generated withNDO algorithm- Computations are reported problems, although the develvalid for multi-stage problems
• Several important advantages
roblems han-
ge of special
to stochastic
39
- Two stage and multi-stage pdled with equal ease- It is possible to take advantastructure- Transition from deterministicmodel is easy
ork with
tages associ-iontage SIPl structure
ware (mature)dle
ee code
40
• Branch and Price for MSIP (WLulli)
Motivation• Has many of the same advanated with Lagrangian Relaxat- Handles 2-stage and Multi-s- Allows exploitation of specia
• Makes greater use of LP soft- Warm starts are easy to han- Sensitivity analysis is routin- Greater availability of reliabl
er a two-
all s
41
For notational simplicity considstage problem
Min cT
x Σs psgsT
ys
s.t Ax b
Tsx Wsys ωs for
x Z+
n1ys Z+
n2∈,∈
≥+
≥
+
pure integere. Solvingproblems addlexity ... only
idea is to haveanticipativity,rministic multi-
42
We have chosen a two stage, problem only for notational easmulti-stage, and mixed integerno additional conceptual compgreater computational work.
The general Branch-and-Pricea master IP that enforces non-and the subproblems are detestage problems.
enforce inte-
blems gener-here is anger point.Let
emes, eachted with am
r
43
Both master and sub-problemsger restrictions
For each scenario “s”, subproate integer points , windex associated with an inte
• As in “column generation” schof these points will be associa“column” in the master progra
xs r, ys r,,( )
f s r, cT
xs r, gsT
ys r,+=
m will consist
nal)s
hinghing
44
• The rows in the master prograof- First stage constraints (optio- Non-anticipativity constraint- Convexity constraint- Bounds on x’s used in branc- Bounds on y’s used in branc
B node is:
y branches
45
The master problem at any B&
Maxx α,
ΣsΣr ps
f s r, αs r,
s.t Ax b
x ΣsΣr psxs r, αs r,–
≤
0
Σr αs r, 1 for all s
l j x j u j , j among x branches,
Ls i, Σr ys r i, , αs r, Us i, , i among≤≤
≤ ≤
,
=
=
, solve thegeneration.
or nodeq
en this vari-ing., the value
46
The Basic Scheme• For any nodeq of the B&B treenodal problem using column
• If is the value of variablexj fand, this value is fractional, thable is a candidate for branch
• Similarly, if for some scenarios
xjq
Σr ys r i, , αs r,q
this to gener-tree.
47
is fractional, then we may useate two new nodes of a B&B
Master Program B&B Tree
Branching on x’s
Branching on y’s
rios has the
ns the specialted with a sce-
s
48
• The pricing problem for scenaform
• Note that this problem maintaistructure that may be associa
Min csT
x gsT
ys
s.t Tsx Wsys ω
x Z+
n1ys Z+
n2∈,∈
≥+
+
interested inot Sizingm is a
zing Problem
n be solved inre the same as
49
nario problem. Thus, if we’resolving Stochastic Dynamic LProblems, each pricing probleDynamic Deterministic Lot Si
• Also, each pricing problem caparallel. (These advantages ain Lagrangian Relaxation)
e SIP (Work
re applied to aension ofn such prob-tween produc-holding costs.
robabilistich sizing model
50
3.3 Computations for Multi-stagwith G. Lulli)Branch-and-Price concepts webatch sizing problem ... an extdynamic lot sizing problems. Ilems, one studies trade-offs betion/setup costs with inventory
Assuming no backlogging, or pconstraints, the stochastic batcis written as follows
t sizingon quantities
t I ts
ts
}ve
51
“Pretty much” the same as lomodel,except ....that productiare in increments ofb
Min Σs ps Σt ctxts f tyts hs.t. I ts
+ +I t 1 s,– bxts d
xts Mtytsxts I ts,( ) 0 t
xts integer, yts 0 1,{xts yts Non-anticipati,
∈∀≥
≤–+=
blems
PLEX
odes
7226962658561
52
Illustrative Computations
These are 5 stage pro
Table 1:
Prob B&Ptime
B&Pnodes
CPLEX
TimeC
N
16a 1.13 0 1.80 116b 1.15 0 0.71 516c 11.6 8 1.8 116d 16.3 11 6.3 516e 13.3 8 0.9 7
blemsenarios also
PLEX
odes
106
106
106
106
106
53
These are 6 stage pro7 stage problems with 64 sc
Prob B&Ptime
B&Pnodes
CPLEX
TimeC
N
32a 156 0 >T >32b 2945 0 >T >32c 91 8 1110 >32d 1064 11 >T >32e 403 8 2800 >
emain criti-
es, warm
ll emerge as
54
Conclusions
I should reiterate that• Convexity and Decomposition rcal
• Special structure, inexact solvstartsetc. remain critical.
• Sampling is new to SIP, but wiwe solve larger problems
continue ...
eneration
ould find eas-alidation
55
Important Trends which should
• Algorithmic approach to tree gand output analysis
• Computer implementations shier interfaces with simulation/vsoftware
ne word that
56
For SP, algorithms if there is odeserves its own slide it is ...
57
Scalability
Scalability
Scalability
Scalability
Scalability
g Problems
tion....
58
And finally,
Two Stage and Multi-stageStochastic Integer ProgramminRemain One of theGrand Challenges in Optimiza
st Welcome!
59
Thank you for your interest.
Comments and Questions, Mo
mmunity ...
60
In appreciation of the SP co
k on
roblems
nces” with-
ish musiciansposes; the
61
Top 5 reasons to wor
Stochastic Programming P
5. Can work with “cosmic distaout leaving home!
4. One begins to easily distingufrom mathematicians: one comother “decomposes”
ity” has noth-er or cavi-
yance”igh places!
” makes youh must go
62
3. One learns that “Log-concaving in common with either lumbties!
2. One also learns that “clairvorequires connections in very h
1. The word “non-anticipativityappreciate what President Busthrough!