Two-Stage Stochastic Mixed-Integer Linear Programming ...

Two-Stage StochasticMixed-Integer Linear Programming:

From Scenarios to Conditional Scenarios

C. Beltran-Royo∗

December 7, 2017

Abstract

This paper is a second part of [2] where the conditional scenario (CS) conceptwas introduced and applied to approximate the multinormal distribution in the con-text of two-stage stochastic mixed-integer linear programming. Now we proposean improved definition of conditional scenario which is suitable to approximate anydistribution (not only the multinormal one). In particular, it allows to approximatea potentially large set of scenarios by a small set of conditional scenarios. In thecomputational study here presented, the CS approach has outperformed thescenario reduction approach based on probability distances, in terms of solu-tion time.

Keywords: Stochastic mixed-integer linear programming, scenario reduction, con-ditional scenario, capacitated facility location, portfolio selection.

1 Introduction

The conditional scenario (CS) approach was introduced in [2] as an effective midpoint between sce-nario based optimization and deterministic optimization. Now, in this short introduction we summarizethe new results about conditional scenarios reported in this paper. The state of the art of the two-stagestochastic mixed-integer linear programming (SMILP) regarding relevance, solution approaches andapproximation formulations, can be reviewed in [2] and the references therein.

The CS approach was applied in [2] to approximate the SMILP problem formulated in terms of amultinormal random vector which accounted for all the uncertain parameters of the problem. Al-though the definition of conditional scenario given in [2] was general, in practice, it results suitablebasically for the multinormal distribution. In this paper, we give an improved definition of conditionalscenario which is suitable for any distribution (continuous or discrete). Furthermore, for the multi-normal distribution, the new definition is equivalent to the previous one (further details are given inSection 3).

In Section 5, the CS approach will be compared to the well-established scenario reduction ap-proach based on minimizing some probability distance which, in this paper, we will call the SRDapproach (Scenario Reduction based on probability Distances). Roughly speaking, the SRD ap-proach determines a reduced number of scenarios that best retain the essential features of theoriginal scenario set according to a given probability distance [7, 9, 19]. We will try to answerthe following questions: a) How can one approximate a potentially large set of scenarios by a set ofconditional scenarios? b) How can one approximate any random vector (continuous or discrete) by

∗[email protected], Computer Science and Statistics, Rey Juan Carlos University, Madrid, Spain.

1

conditional scenarios? c) How does the SMILP problem, formulated in terms of a given number ofconditional scenarios, compare with the counterpart formulated in terms of the same number of sce-narios selected by the SRD approach? That is, do the two approaches obtain solutions of similarquality? Are the corresponding solution times similar?

To answer the previous questions we proceed as follows: In Section 2 we describe de notation andformulate the SMILP problem with recourse which we call the recourse problem. In Section 3 thenew definition of conditional scenario is introduced and we describe how to approximate a set ofscenarios by a set of conditional scenarios. The CS problem is formulated in section 4. In Section5 we compare the CS and SRD approaches from a computational point of view, by solving twowell known problems: the capacitated facility location problem and the portfolio selection problem.Finally, Section 7 concludes the paper.

2 Problem formulation

Notice that the following notation and recourse problem corresponds to the one given in [2], where amore complete description can be found.

2.1 Notation

Indexes:t Decision stages t ∈ T = {1, 2}j Integer components of decision vectors j ∈ Jt = {1, . . . , Jt}, t ∈ Te Realizations of random variables e ∈ E = {1, . . . , E}r Components of random vectors r ∈ R = {1, . . . , R}s Scenarios s ∈ S = {1, . . . , S}re Index pair for conditional scenarios re ∈ REr = R× Er

where Er = {1, . . . , Er}

Random vectors:ξ Random vector with finite support which accounts for all the

random parameters of the recourse problem

ξs Scenario or realization of the random vector ξ s ∈ Sps Probability of ξs, that is, ps = P

(ξ = ξs

)s ∈ S

ξr Component r of the random vector ξ r ∈ Rξre Conditional scenario re ∈ RErpre Probability of ξre re ∈ RErξr Random vector defined by the conditional scenarios

{ξre}re∈REr and the probability values {pre}re∈REr r ∈ Rξ Expectation of ξ, that is, ξ = E[ ξ ]

Notice that scenarios are indicated by the tilde symbol, e.g. ξs, conditional scenarios by the hat symbol,e.g. ξre, and expected values by the bar symbol, e.g. ξ.

2

2.2 The recourse problem

In this paper we consider the following two-stage stochastic mixed-integer linear programming prob-lem with recourse, which we call the recourse problem (RP):

minx

zRP = cᵀ1x1 +∑s∈S

ps csᵀ2 xs2 (1)

s.t. A1x1 = b1 (2)

As2x1 + Bs

2xs2 = bs2 s ∈ S (3)

x1 ≥ 0 (4)

xs2 ≥ 0 s ∈ S (5)

x1j integer j ∈ J1 (6)

xs2j integer s ∈ S, j ∈ J2, (7)

where Jt is the index set for the integer variables at stage t for all t ∈ T . In this context, scenarioξs represents the sth realization of the random parameters, that is, ξs = vec

(cs2, A

s2, B

s2, b

s2

), where

‘vec’ is the operator that stacks vectors and matrix columns into a single vector. Then, ξ is the randomvector defined by the set of scenarios {ξs}s∈S and the corresponding probability values are {ps}s∈S ,such that P

(ξ = ξs

)= ps, for all s ∈ S.

2.3 The expected value problem

The RP problem can be approximated by the so-called expected value problem. More precisely, theEV problem corresponds to approximate ξ by ξ = E[ ξ ] = vec

(c2, A2, B2, b2

)and can be stated as

minx

zEV = cᵀ1x1 + cᵀ2x2

s.t. A1x1 = b1

A2x1 + B2x2 = b2

x1 ≥ 0

x2 ≥ 0

xtj integer t ∈ T , j ∈ Jt.

3 Conditional scenarios

In this section we review and analyze the definition of conditional scenario introduced in [2] and pro-pose an improved definition. Furthermore, we show how the new definition can be used to approximatea large set of scenarios by a reduced set of conditional scenarios.

3.1 Conditional scenarios for a general distribution

As pointed out in [2], the SMILP problem formulated in terms of a continuous random vector, sayξ =

(ξ1 . . . ξR

)ᵀ, is in general numerically intractable. To address this difficulty one can approximate

ξ by a random vector, say ξ =(ξ1 . . . ξR

)ᵀ, with a discrete support consisting of S scenarios, and

solve the corresponding RP problem. Some technique to compute a representative set of scenariosis normally used: moment matching methods [11, 14], the Sample Average Approximation method[10, 15], approaches based on probability metrics [7, 20], among others. If the computational effort ofusing ξ is too high, one can approximate it by conditional scenarios.

Let us review the definition of conditional scenario given in [2] and point out its drawbacks.

3

Definition 1. (Conditional scenario [2]) Let us consider the discrete random vector ξ =(ξ1 . . . ξR

)ᵀwith finite support {ξs}s∈S and probability values {ps}s∈S . The conditional scenarios of ξ associatedwith the r-th component of ξ and the corresponding probability values are defined as follows:

ξre = E[ ξ | ξr = ξre ] e ∈ Er (8)

pre = P(ξr = ξre

)e ∈ Er,

where the right-hand side in (8) is the conditional expectation of ξ given ξr = ξre (see [6]).

In [2], this definition was successfully used within the procedure to approximate a multinormal randomvector by means of conditional scenarios. This is an important case since the multinormal distributionis one of the most relevant ones mainly because of the multivariate central limit theorem [12]. The pro-cedure introduced in [2] requires to compute the conditional expectation (8), which in the multinormalcase it is easily done by means of the corresponding analytical expression (see Proposition 3 below fordetails). Thus, if our objective is to approximate a continuous random vector by means of conditionalscenarios, the first drawback of the procedure based on Definition 1 is that it requires the analyticalexpression of the conditional expectation (8), which is not known or may be difficult to compute for ageneral distribution (the multinormal distribution is one exception).

The second drawback of Definition 1 appears when one wishes to approximate a set of scenarios, say{ξs}s∈S , by conditional scenarios. In this case, given a particular value of ξr, say ξre0 , it is likely thatonly one scenario, say ξs0 , attains this value in the component rth and therefore

ξre0 = E[ ξ | ξr = ξre0 ] = ξs0 .

In this case, it is likely that the set of conditional scenarios will correspond to the set of scenarios. Insummary, although Definition 1 is a general one, in practice, it is suitable basically for the multinormaldistribution.

Below we give a new definition of conditional scenario which overcomes the drawbacks of Definition1. Furthermore, the new definition is equivalent to the previous one for the multinormal distribution(Proposition 3). To this end we need the following assumption.

Assumption 1. In this paper, given a random vector ξ =(ξ1 . . . ξR

)ᵀ (continuous or discrete):

a) The right-open interval Ir = [ar, br) ⊂ R is any interval that contains the support of ξr

b) Ir is partitioned into finitely many subintervals Ire such that Ir = ∪e∈ErIre, whereIre = [are, bre) 6= ∅, for all re ∈ REr.

The new definition of conditional scenarios is as follows.

Definition 2. (Conditional scenario) Let us consider the random vector ξ (continuous or discrete).Given any r ∈ R, let us also consider an interval Ir and a partition of it {Ire}e∈Er which fulfillAssumption 1. Then, the conditional scenarios of ξ, corresponding to the above partition, are definedas follows:

ξre = Eξ

[ξ | ξr ∈ Ire

]e ∈ Er

pre = P(ξr ∈ Ire

)e ∈ Er,

where the right-hand side in the first equality is the conditional expectation of ξ given ξr ∈ Ire.

Assumption 2. In this paper the term ‘conditional scenario’ will refer to Definition 2.

In this context, we can define the CS random vector ξr =(ξr1 . . . ξrR

)ᵀ such that its support is{ξre}e∈Er and such that P

(ξr = ξre

)= pre for all e ∈ Er. As proposed in [2], one can approximate

a given random vector ξ by the CS random vectors ξr, for all r ∈ R, each one taken with probability

4

1/R, and then formulate the CS problem in terms of these random vectors (this problem will be definedin Section 4).

Let us see some properties regarding conditional expectation and conditional scenarios.

Proposition 1. (Tower property of conditional expectations) Given the random vector ξ and the inter-val Ire, one has that

Eξ

[ξ | ξr ∈ Ire

]= Eξr

[Eξ

[ξ | ξr

] ∣∣ ξr ∈ Ire]

re ∈ REr,

where, on the one hand, it is assumed that the expectation on the left-hand side of the equation is finiteand, on the other hand, Eξ

[ξ | ξr

]is the conditional expectation of ξ given ξr.

Proposition 2. If L(ξr) = Eξ

[ξ | ξr

]is a linear function, then

Eξr

[Eξ

[ξ | ξr

] ∣∣ ξr ∈ Ire]

= Eξ

[ξ∣∣ Eξr

[ξr | ξr ∈ Ire

] ]re ∈ REr.

In this cae, the conditional scenarios can be computed as follows:

ξre = Eξ

[ξ∣∣ ξre ] re ∈ REr,

where ξre = Eξr

[ξr | ξr ∈ Ire

].

Proof: The first statement is clear considering that the expectation is a linear operator, i.e., for anylinear function L(ξr) one has E[L(ξr) ] = L(E[ ξr ]). The second statement is also straightforward:

ξre = Eξ

[ξ | ξr ∈ Ire

]re ∈ REr

= Eξr

[Eξ

[ξ | ξr

] ∣∣ ξr ∈ Ire]

(by Proposition 1)

= Eξ

[ξ∣∣ Eξr

[ξr | ξr ∈ Ire

] ](by the first statement)

= Eξ

[ξ∣∣ ξre ].

Proposition 3. (Conditional scenarios of a multinormal random vector) Let us consider the multinor-mal random vector ξ ∼ NR(µ,Σ). Let us also consider, for each r ∈ R, an interval Ir and a partitionof it, say {Ire}e∈Er , that fulfill Assumption 1. Then, the conditional scenarios of ξ corresponding tothe above partitions can be computed as follows:

ξre = µ+ξre − µrσ2r

Σ∗r re ∈ REr,

where

ξre = µr + σrφ(αre)− φ(βre)

Φ(βre)− Φ(αre)re ∈ REr (9)

αre = (are − µr)/σr and βre = (bre − µr)/σr, (10)

scalars µr and σ2r are the mean and variance, respectively, of the random variable ξr, functions φ

and Φ are the density and distribution function, respectively, of the standard normal distribution andvector Σ∗r is the r-th column of the covariance matrix Σ.

The corresponding probability values can be computed as follows:

pre = Φ(βre)− Φ(αre) re ∈ REr.

5

Proof: In the case of a multinormal random vector, the conditional expectation Eξ

[ξ | ξr

]can be

computed as follows ([8], Section 5.1):

L(ξr) = Eξ

[ξ | ξr

]= µ+

ξr − µrσ2r

Σ∗r r ∈ R,

which is linear in ξr. Thus, by Proposition 2, one has

ξre = Eξ

[ξ | ξre

]= µ+

ξre − µrσ2r

Σ∗r,

where ξre = Eξr

[ξr | ξr ∈ Ire

]. This expression corresponds to the expectation of a truncated normal

distribution which can be computed by equations (9)–(10) (see [13], Section 10.1). Finally,

pre = P(ξr ∈ Ire

)= Φ(βre)− Φ(αre) re ∈ REr.

Notice that this way to compute the conditional scenarios for the multinormal case is equivalent toMethod 2 in [2]. Otherwise said, the definitions of conditional scenario given in this paper and in [2]are equivalent for the multinormal case.

Proposition 4. (Conditional scenarios of a finite support random vector) Let us consider a randomvector ξ with finite support {ξs}s∈S and probability mass function f(ξs) = ps for all s ∈ S. Let us alsoconsider, for each r ∈ R, an interval Ir and a partition of it, say {Ire}e∈Er , that fulfill Assumption1. Then, the conditional scenarios of ξ corresponding to the above partitions can be computed asfollows:

ξre =∑s∈Sre

ps∑s∈Sre p

sξs re ∈ REr,

where Sre = {s | ξsr ∈ Ire}. Furthermore, the corresponding probability values can be computed asfollows:

pre =∑s∈Sre

ps re ∈ REr.

Proof: First, the probability mass function of ξ given ξr ∈ Ire corresponds to

f(ξs | ξr ∈ Ire) =f(ξs)∑

s∈Sre f(ξs)=

ps∑s∈Sre p

ss ∈ Sre

and f(ξs | ξr ∈ Ire) = 0 for all s 6∈ Sre.

Second, by definition of conditional scenario

ξre = Eξ

[ξ | ξr ∈ Ire

]re ∈ REr

=∑s∈Sre

ξs f(ξs | ξr ∈ Ire)

=∑s∈Sre

ps∑s∈Sre p

sξs.

Finally,

pre = P(ξr ∈ Ire

)=∑s∈Sre

ps re ∈ REr.

6

Table 1: Each column corresponds to a scenario.

Scenario number 1 2 3 4 5 6 7 8 9 10

ξs1 0.0 3.2 1.3 1.5 1.9 1.6 2.5 2.7 0.6 4.0

ξs2 1.0 6.7 2.1 3.9 5.5 4.5 6.1 7.9 3.3 9.0

ξs3 4.0 13.9 7.4 8.3 9.6 10.5 11.1 12.5 5.8 16.0

Table 2: The scenarios have been classified according to the de-mand of the first product into two groups: low and high demand.

Low demand (group 1) High demand (group 2)

Scenario number 1 9 3 4 6 5 7 8 2 10

ξs1 0.0 0.6 1.3 1.5 1.6 1.9 2.5 2.7 3.2 4.0

ξs2 1.0 3.3 2.1 3.9 4.5 5.5 6.1 7.9 6.7 9.0

ξs3 4.0 5.8 7.4 8.3 10.5 9.6 11.1 12.5 13.9 16.0

3.2 From scenarios to conditional scenarios

In this section we show how Proposition 4 can be used to approximate a set of scenarios by a set ofconditional scenarios. As a first step in this direction, let us illustrate Proposition 4 by means of thefollowing example.

Example 1. (From scenarios to conditional scenarios: Calculations) A company has forecast the nextseason demand of three products in order to plan the corresponding production. This demand forecastcan be represented by means of ten equiprobable scenarios which can be found in Table 1. We usevector ξs to denote scenario s, where component ξsr represents the demand of the rth product accordingto scenario s for all r ∈ R = {1, 2, 3} and for all s ∈ S = {1, . . . , 10}. Thus, the uncertain demandcan be modeled by means of the random vector of demands ξ with support {ξs}s∈S and probabilityvalues {ps}s∈S , where ps = 1/10 for all s ∈ S.

The simplest approximation of ξ is the expected scenario, that is ξ =(

1.9 5.0 9.9)ᵀ, which cor-

responds to the average of the previous ten scenarios. A more informative approximations can beobtained by means of conditional scenarios. For example, one can compute a set of two conditionalscenarios considering low and high forecast demands of Product 1. We will use vectors ξ1,1 and ξ1,2 todenote the conditional scenarios associated to low and high forecast demands, respectively, of Product1. To compute the two conditional scenarios, first, we observe in Table 1 that the forecast demands ofProduct 1 are in the interval I1 = [0.0, 4.1). Second, we partition I1 into two subintervals:

I1,1 = [0.0, 2.0) (low demand)

I1,2 = [2.0, 4.1) (high demand),

and group the scenarios according to this partition of the Product 1 demand in Table 2. Notice that inTable 2 entries in the row corresponding to ξs1 are in italic font and sorted in ascending order, such thatin the first group ξs1 is in I1,1 and in the second group ξs1 is in I1,2. The number of scenarios is six andfour in the first and second groups, respectively.

The conditional scenario ξ1,1 corresponds to the conditional expectation of ξ given ξ1 ∈ [0, 2), that

7

is,

ξ1,1 = E[ ξ | ξ1 ∈ I1,1 ]

=∑

s∈S1,1

ps∑s∈S1,1 p

sξs =

∑s∈S1,1

1/10

6/10ξs (by Proposition 4)

=1

6

0.0

1.0

4.0

+1

6

0.6

3.3

5.8

+ · · ·+ 1

6

1.9

5.5

9.6

=

1.2

3.4

7.6

.

Analogously, the conditional scenario ξ1,2 can be computed as the average of the four scenarios in thesecond group:

ξ1,2 =1

4

2.5

6.1

11.1

+1

4

2.7

7.9

12.5

+ · · ·+ 1

4

4.0

9.0

16.0

=

3.1

7.4

13.4

.

From these conditional scenarios we can define the random vector ξ1 with support {ξ1,e}e∈E and thecorresponding probability values p1,1 = 6/10 and p1,2 = 4/10. Notice that arbitrarily we have onlycomputed two conditional scenarios but more conditional scenarios could have been computed byconsidering more intervals for Product 1 demand. Furthermore, to compute the conditional scenarioswe have focused on Product 1 demand. Analogously we could focus on the demands of Products 2and 3 and construct the corresponding conditional scenarios whose associated random vectors wouldbe ξ2 and ξ3, respectively.

In summary, we observe that with the set of ten scenarios ξs with probability 1/10 each, we have aprobabilistic model which can be used to forecast the future demand. This probabilistic model can beapproximated by the single expected scenario ξ =

(2.0 5.0 9.9

)ᵀ. But, it can also be approximatedby two conditional scenarios ξ1,1 =

(1.2 3.4 7.6

)ᵀ and ξ1,2 =(

3.1 7.4 13.4)ᵀ with probability

values 6/10 and 4/10, respectively. Conditional scenarios ξ1,1 and ξ1,2 represent the demands that canbe expected when demand for Product 1 is low and high, respectively.

Now we generalize the ideas seen in the previous example. As is well known, given a random vectorξ =

(ξ1 . . . ξR

)ᵀ, its expectation ξ = E[ ξ ] corresponds to the long-run average vector of repetitions

of the experiment ξ represents. In the case of having dependent components in ξ, it can be advanta-geous to use the possible knowledge of one component to compute the expectation of the whole randomvector. For example, if we observe that ξ1 = 100, the conditional expectation of ξ given ξ1 = 100,that is, E[ ξ | ξ1 = 100 ], will be a better picture of the long-run average vector of repetitions of theexperiment ξ represents, than E[ ξ ]. By Definition 2, the conditional scenario ξre corresponds to theconditional expectation E[ ξ | ξr ∈ Ire ] for each re ∈ REr. Based on these conditional scenarioswe can define the random vector ξr with support {ξre}e∈Er and the corresponding probability values{pre}e∈Er , for all r ∈ R.

The following method summarizes the steps performed in Example 1 and can be used to approximatea (large) set of scenarios into a (small) set of conditional scenarios.

Method 1. (From scenarios to conditional scenarios)

• Objective: To approximate a set of scenarios by a set of conditional scenarios.

• Input:

a) A set of scenarios {ξs}s∈S with probability ps = 1/S, for all s ∈ S = {1, . . . , S}.

8

b) Er, the number of conditional scenarios for coordinate r, for each r ∈ R = {1, . . . , R}.

• Output: For each r ∈ R, a set of conditional scenarios {ξre}e∈Er and the correspondingprobability values {pre}e∈Er with Er = {1, . . . , Er}.

• Steps: For each r ∈ R :

1) Define the interval Ir = [ar, br) such that

ar = mins∈S{ξsr} br = max

s∈S{ξsr}+ ε,

where ε is a small positive number, say 0.1. Notice that by adding this ε, the right-open interval Ir fulfils Assumption 1.a.

2) Partition the interval Ir into Er subintervals Ire = [are, bre) for all e ∈ Er, accord-ing to Assumption 1.b.

3) Classify the scenarios such that the index set Sre contains the indexes of the scenar-ios that fulfill the condition ξsr ∈ Ire for all e ∈ Er.

4) For all e ∈ Er :

i) Set Sre as the cardinality of Sre.ii) Compute the corresponding conditional scenario and its probability value:

ξre = E[ ξ | ξr ∈ Ire ] =1

Sre

∑s∈Sre

ξs

pre = Sre/S.

Notice that:

• The previous method considers equiprobable scenarios, but it could easily be adapted tothe non equiprobable case by using Proposition 4.

• The idea in the previous method is to approximate the scenarios in group Sre by theconditional scenario ξre such that its probability is proportional to the number of scenariosof the group. This generalizes the idea of expected scenario, where one considers onlyone scenario group which is approximated by ξ.

• The previous method is distribution free since it only requires a set of scenarios.

• This approach can also be used when the the distribution of ξ is known. The procedure isas follows: Sample a set of scenarios from ξ and proceed with Steps 1 to 4.

• This procedure can be used to effectively approximate a possibly large set of scenarios bya small set of conditional scenarios.

In the following example, we show the result of approximating a large set of scenarios by a small setof conditional scenarios.

Example 2. (From scenarios to conditional scenarios: Geometry) Let us consider the multinormalrandom vector ξ =

(ξ1 ξ2

)ᵀ ∼ N2(µ,Σ) such that

µ =(

100 200)ᵀ

Σ =

(400 480

480 1600

).

In order to illustrate the geometry of the conditional scenarios in the multinormal case, let us drawa random sample of S1 = 8, 000 scenarios from the previous multinormal population (see Figure

9

1). In Section 5 we will compare two ways to approximate a large set of scenarios. The first onecorresponds to select by the SRD method a subset of scenarios of small size, say S2, from theoriginal set of scenarios (see Section 5 for details). The second one corresponds to compute a set ofS2 conditional scenarios by using Method 1. In this example we set S2 = 12. In Figure 1 we have theoriginal set of 8, 000 scenarios (top plot) and the 12 SRD scenarios together with the 12 conditionalscenarios (bottom plot).

Figure 1: The initial set of 8, 000 scenarios (top plot) is approxi-mated by a set of 12 SRD scenarios labelled by ‘*’ (bottom plot)and by a set of 12 conditional scenarios labelled by ‘o’ (the linesare plotted to show their trajectories).

4 The conditional scenario problem

The CS problem can be defined as the approximation of the RP problem which is based on the approx-imation of the random vector ξ by the set of random vectors {ξr}r∈R, each one taken with probability

10

1/R. Otherwise said, the set of scenarios {ξs}s∈S is approximated by the set of conditional scenarios{ξre}re∈REr (see [2] for details):

minx

zCS = cᵀ1x1 +1

R

∑re∈REr

pre creᵀ2 xre2 (11)

s.t. A1x1 = b1 (12)

Are2 x1 + Bre

2 xre2 = bre2 re ∈ REr (13)

x1 ≥ 0 (14)

xre2 ≥ 0 re ∈ REr (15)

x1j integer j ∈ J1 (16)

xre2j integer re ∈ REr, j ∈ J2. (17)

In this context

ξre = E[ ξ | ξr ∈ Ire ] = vec(cre2 , A

re2 , B

re2 , b

re2

)re ∈ REr,

such that {Ire}e∈Er is a partition of Ir that fulfills Assumption 1 for each r ∈ R. Notice that the CSproblem has the same structure as the RP problem (1)–(7).

5 Computational study

Let us assume that the RP problem (1)-(7) has been formulated in terms of a large set of S1 scenarios,which, in turn, has been approximated by a small set of S2 scenarios by using the CS and SRDapproaches. As already said, the SRD approach corresponds to the scenario reduction methodbased on minimizing some probability distance. More specifically, given the probability distri-bution associated to the initial set of scenarios, the SRD approach determines a scenario subsetof prescribed cardinality and a probability measure based on this subset that is the closest to theinitial distribution in term of a probability distance [7]. There exist different versions of the SRDapproach as reported in [9]. In our case, considering that, as we will see, S2 < S1/4, the moreeffective version corresponds to the Fast Forward Selection (FFS) algorithm (see Algorithm 2.4,[9]). Furthermore, to fully determine the FFS algorithm, different cost functions c(ξ, ξ′) can beused. According to [7], in the case of two-stage stochastic linear programs, it is appropriate touse the following cost function:

c(ξ, ξ′) = ‖ξ − ξ′‖ ·max{

1, ‖ξ − ξ‖, ‖ξ′ − ξ‖}

ξ, ξ′ ∈ S2, (18)

where ‖ · ‖ is the Euclidean norm and ξ is the expectation of ξ. In summary, our implementationof the SRD approach corresponds to Algorithm 2.4 in [9] with the cost function defined by (18).

The objective of this section is to compare the SRD, EV and CS problems as approximations to theRP problem intended to reduce its computational burden. In this computational study, we will solvetwo well known problems: the capacitated facility location (CFL) problem with uncertain demands[3, 17, 21] and the portfolio selection problem [5, 16, 24]. Computations have been conducted ona PC using Windows 7 (64 bits), with a Intel Core i5 processor, 2.67GHz and 8 GB of RAM. Thecorresponding MILP problems have been solved by CPLEX 12.6 with default parameter values. Inthis context, a relevant CPLEX parameter is the so-called ‘relative MIP gap tolerance’ defined as

100× | best bound – best integer |(10−10+ | best integer |)

%,

which default value is 0.01%(see [1]).

11

Table 3: Indexes and deterministic parameters of the SRD problem (CFL case).

Parameter Value Unit Description

I 10 - Number of (candidate) facilities

I {1, . . . , I} - Index set for facilities

i - - Index for facilities

J 30 - Number of clients

J {1, . . . , J} - Index set for clients

j - - Index for clients

IJ I × J - Index set for pairs ij

fi 1000 + 300i euros Fixed cost of facility i

cij 0.3IJ − 0.01(I(j − 1) + i

)euros / unit Cost of supplying client j by facility i

qj 3∑

i∈I cij/I euros / unit Penalty for unmet demand

dj 100 + 10j units Expected demand of client j

K∑

j∈J dj/I units Ratio

‘Total demand / Number of facilities’

gi 4(

0.5K + (K/I)i)

units Capacity of facility i

5.1 The capacitated facility location problem

In the CFL problem we have a set of candidate facilities indexed by I = {1, . . . , I} and a set of clientsindexed by J = {1, . . . , J}. In the stochastic CFL problem here solved demands are random andrepresented by a set of equiprobable scenarios {ds}s∈S . The goal is to choose a subset of facilitiesin order to satisfy the demand of all the clients at the minimum expected cost, which accounts for thefixed costs fi and for the supplying costs cij (see Table 3). Since demands are random and facilitieshave a limited capacity gi, the facilities opened in the first stage may be insufficient to satisfy all ofthe demands in the second stage. In this case, a penalty term for unmet demand, proportional to qj , isintroduced in the objective function of the deterministic CFL formulation [23].

5.1.1 The SRD capacitated facility location problem

The SRD capacitated facility location problem, or for short, the SRD problem, corresponds to solve theCFL problem with recourse formulated in terms of a small set of scenarios selected by the SRD ap-proach. The deterministic parameters, random parameters and decision variables of the SRD problemcan be found in Tables 3, 4 and 5, respectively. To formulate the SRD problem we consider S2 = 240scenarios selected from the initial set of S1 = 8, 000 scenarios by means of the SRD approach. Weconsider 240 scenarios in order to match the number of conditional scenarios that we will consider inthe CS counterpart.

12

Table 4: Random parameters of the SRD problem (CFL case).


d(d1 . . . dJ

)ᵀ units Random demands represented by a random sample

of S1 = 8, 000 equiprobable scenarios {ds}s∈S1 drawn

from the multinormal distribution NJ(µ,Σ)

µj dj units Expected demand of client j

σj 0.2 dj units Standard deviation of dj

ρj1,j2 0.7 - Correlation between dj1 and dj2 , j1 6= j2

σj1,j2 ρj1,j2 σj1σj2 units2 Covariance between dj1 and dj2 , j1 6= j2

ds - units Realization of d (scenario)

R 30 - Number of random parameters

Er 8 - Number of conditional scenarios

corresponding to the rth component of d

S1 8, 000 - Initial number of scenarios

S1 {1, . . . , S1} - Index set for the initial set of scenarios

S2 240 - Number of scenarios used in the SRD problem

S2 {1, . . . , S2} - Index set for the scenarios used in the SRD problem

s - - Index for scenarios

Table 5: Decision variables of the SRD problem (CFL case).

Decision Unit Description

ui - ui = 1, if facility i is opened

ui = 0, otherwise

xsij % Fraction of the demand of client j supplied by facility i

ysj % Fraction of the demand of client j unmet

13

Then the SRD capacitated facility location problem can be written as follows:

minu,x,y

zSRD =∑i∈I

fi ui +∑s∈S2

ps( ∑

ij∈IJcij d

sj x

sij +

∑j∈J

qj dsj y

sj

)(19)

s.t.∑i∈I

xsij + ysj = 1 s ∈ S2, j ∈ J (20)∑j∈J

dsj xsij ≤ gi ui s ∈ S2, i ∈ I (21)

ui ∈ {1, 0} i ∈ I (22)

xsij ≥ 0, ysj ≥ 0 s ∈ S2, ij ∈ IJ . (23)

This problem corresponds to the RP problem (1)–(7) formulated in terms of 240 scenarios selected bythe SRD approach. Notice that we have not included the constraints xsij ≤ ui for all s ∈ S2, ij ∈ IJ .These constrains are normally used in facility location problems to strengthen the MILP formulation,however in a preliminary computational test we have observed that they considerably increase theproblem dimension resulting in longer solution times compared with the current formulation (19)–(23). After solving this SRD problem by using the MILP solver (CPLEX) we have obtained:

z∗SRD = 684, 707 euros

u∗1 =(

1 1 1 0 0 0 0 0 0 1)ᵀ.

5.1.2 The EV capacitated facility location problem

The simplest approximation of the RP problem corresponds to the EV problem, which approximatesthe random vector of demands by its expectation d = E[ d ] (see Section 2.3). After solving this EVproblem we have obtained:

z∗EV = 681, 716 euros

u∗1 =(

1 1 0 0 0 0 0 1 0 0)ᵀ.

5.1.3 The CS capacitated facility location problem

If the number of scenarios needed to formulate the RP problem becomes too large for practical pur-poses (the corresponding solution time is too long), one can approximate it by the CS problem. Toformulate the CS problem one can compute the conditional scenarios that approximate the initial setof S1 = 8, 000 scenarios by using Method 1 with R = 30 random parameters and Er = 8 condi-tional scenarios. Given that we set Er = E = 8 for all r ∈ R, the CS problem considers a total ofR · E = 240 conditional scenarios. We choose Er = 8 for two reasons. On the one hand, to keepthe computational burden of the corresponding MILP problem low. On the other hand, since we knowthat the scenarios, componentwise, come from the normal distribution which has probability almostone along the interval of length 8σ centered at µ, we consider that one discretization point per eachsubinterval of length σ, would be appropriate. Of course this is a heuristic approach and the impactof the Er value that one uses in Method 1 is a matter that deserves further research. Notice that theCS problem has the same structure as the SRD problem (19)-(23). The only difference is the way it isused to approximate the random vector d (conditional scenarios dre versus scenarios ds). Therefore,the resulting MILP problems have the same dimensions for the two approaches. After solving the CSproblem we have obtained:

z∗CS = 683, 816 euros

u∗1 =(

1 1 1 0 0 0 0 1 0 0)ᵀ.

14

Table 6: Comparing the SRD, EV and CS solutions (CFL case).

Predicted expected cost Achieved expected cost

EV CS SRD E-SRD E-CS E-EV

Cost (euros) 681,716 683,816 684,707 684,354 685,002 747,652

Solution time (s) 0.1 11 128 25 25 25

5.1.4 Comparing the CS, SRD and EV solutions

Let us call EV the optimal value obtained by solving the EV problem, that is EV = z∗EV . As pointedout in [4], the expected cost predicted by EV is, in general, different from the expected cost achievedby the EV solution. The achieved expected cost is known in the literature as the ‘Expected result ofusing the EV solution’ (E-EV) and can be computed by solving the E-EV problem, which is nothingbut the RP problem (1)–(7) with the additional constraint x1 = x∗1, that is, fixing the first stage decisionto the first stage EV solution.

This remark is also valid for the CS and SRD problems and, therefore, one can define and compute thevalues CS, E-CS, SRD and E-SRD in an analogous way. We have set the E-EV, E-CS and E-SRDproblems in terms of the initial set of scenarios S1. In this case one has ps = 1/S1 for all s ∈ S1.

In order to compare the SRD, EV and CS solutions, we have collected all the above mentioned valuesin Table 6. We observe that the results obtained by the CS and SRD approaches are similar. We alsoobserve that for the EV problem the achieved expected cost (E-EV=747,652 euros) is considerablyworse than the predicted one (EV=681,716 euros).

5.1.5 Comparing the CS and SRD performances

In the previous section we have observed that the SRD and CS approaches obtain solutions whichachieve a significantly lower expected cost compared to the EV approach (see Table 6). In this sectionwe compare the SRD and CS performances by solving the CFL problem for different number offacilities and clients. At this point, it is relevant to mention that we compare the CS problem to theSRD problem (with a reduced number of scenarios) as approximations to the RP problem. With theSRD method it would possible, at least in theory, to solve the RP problem to optimality by taking anumber of scenarios large enough. In contrast, the CS method has been designed to approximate theRP problem in order to obtain good, possibly suboptimal, solutions.

Table 7 reports the number of: facilities I, clients J = 3I , scenarios and rows/columns of the constraintmatrix of the solved MILP instances. The other parameters are summarized in Tables 3 and 4.The CS and SRD performances are compared in Figures 2, 3 and 4 regarding the ‘achieved expectedcost’, the ‘solution time’ and the ‘LP gap’, respectively. The solution time accounts for the time tocompute the scenarios plus the time to solve the corresponding MILP problem. The figures usedto plot these graphs can be found in the Appendix (Tables 17, 18 and 19, respectively). A possiblereason for the faster performance of the CS approach is its smaller LP gap (the relative gap betweenthe optimal CS cost and the optimal cost of the LP relaxation of the CS problem).

15

Table 7: Size of the MILP instances. The CS and the SRD prob-lems have the same sizes (CFL case).

Instance I J Scenarios Rows Columns

1 16 48 384 24,576 313,360

2 17 51 408 27,744 374,561

3 18 54 432 31,104 443,250

4 19 57 456 34,656 519,859

5 20 60 480 38,400 604,820

6 21 63 504 42,336 698,565

7 22 66 528 46,464 801,526

8 23 69 552 50,784 914,135

9 24 72 576 55,296 1,036,824

10 25 75 600 60,000 1,170,025

Figure 2: The achieved expected cost obtained by the CS andSRD approaches are similar (see Appendix/Table 17).

5.2 The portfolio selection problem

In this section we consider the well known risk-return portfolio selection problem where one decideshow much to invest in each asset taking into account its random returns rj . The model here solvedcorresponds to the mean semi-absolute deviation (MSAD) model with hard real-world constraints [5],namely, a limit on the number of assets to be included in the portfolio and lower and upper limits ontheir weights. The MSAD corresponds to the mean of the deviations below the portfolio expected rateof return and can be computed as follows:

MSAD(x) = E[ [rᵀx− µᵀx]− ], (24)

where x =(x1 . . . xJ

)ᵀ is the portfolio selection weights, rᵀx is the (uncertain) portfolio return, µᵀxis the portfolio expected rate of return and [α]− = −min{0, α} for any number α. It can be shownthat the MSAD model is equivalent to the mean absolute deviation (MAD) model but with half the

16

Figure 3: On average, the CS approach has been 1.8 times fasterthan the SRD approach (see Appendix/Table 18).

Figure 4: On average, the CS gap has been 1.5 times smaller thanthe SRD gap (see Appendix/Table 19).

17

Table 8: Indexes and deterministic parameters of the SRD problem (portfolio case).


j - Index for assets (stocks and commodities)

J 50 assets Number of assets

J {1, . . . , J} - Index set for assets

k - - Index for stocks

K 25 stocks Number of stocks

K {1, . . . ,K} - Index set for stocks

l - - Index for commodities

L 25 commodities Number of commodities

L {1, . . . , L} - Index set for commodities

rsk 0.05 + 0.10k/K % Expected return rate of stock k

rcl 0.10 + 0.10l/L % Expected return rate of commodity l

r vec(rs, rc

)- Vector of expected return rates

b( ∑

j∈J rj)/J % Required expected return for the portfolio

U dJ/4e = 13 assets Maximum number of assets in the portfolio

lb 0.5/U = 3.85 % Buy-in lower bound

ub 1.5/U = 11.54 % Buy-in upper bound

number of variables [16, 22]. According to [5], when hard real-world constraints are incorporated intoportfolio selection models, most authors prefer to use MILP models as, for example, the MSAD modelinstead of quadratic ones as, for example, the mean-variance model [18], to improve the computationalsolvability.

The objective is to obtain an expected return of b% while attaining the lowest possible MSAD, byinvesting in 13 assets, at most, from an initial set of 50 candidate assets: 25 stocks and 25 commodities(corn, milk, oil, copper, gold, etc.). The investor considers these two types of assets since commoditiesare known to move in the opposite direction of the stock market, on average. That is, commoditiesare, in general, negatively correlated with stocks. In this way the investor can invest in both assets todiversify and hedge his portfolio (stabilize his portfolio from volatility).

5.2.1 The SRD portfolio selection problem

The SRD portfolio selection problem, or for short, the SRD problem, corresponds to solve the port-folio selection problem with recourse formulated in terms of a small set of scenarios selected by theSRD approach. The deterministic parameters, the random parameters and the decision variables aredescribed in Tables 8, 9 and 10, respectively. To formulate the SRD problem we consider S2 = 400scenarios selected from the initial set of S1 = 8, 000 scenarios by means of the SRD approach. Weconsider 400 scenarios in order to match the number of conditional scenarios that we will consider inthe CS counterpart.

The SRD portfolio selection problem can be written as follows:

18

Table 9: Random parameters of the SRD problem (portfolio case).


r(r1 . . . rJ

)ᵀ % Random return rates represented by a random sample

of S1 = 8, 000 equiprobable scenarios {rs}s∈S1 drawn

from the multinormal distribution NJ(µ,Σ)

µj rj % Expected return rate of asset j

σj rj % Standard deviation of rjρsk1,k2 0.6 - Correlation between return rates of stocks k1

and k2, k1 6= k2

ρcl1,l2 0.6 - Correlation between return rates of commodities l1and l2, l1 6= l2

ρsck1,l2 −0.4 - Correlation between return rates of stock k1

and commodity l2

ρ

(ρs ρsc

ρsc ᵀ ρc

)Correlation matrix

σj1,j2 ρj1,j2 σj1σj2 (%)2 Covariance between rj1 and rj2 , j1 6= j2

rs - - Realization of r (scenario)

R 50 - Number of random parameters

Ej 8 - Number of conditional scenarios

corresponding to the jth component of r

S1 8, 000 - Initial number of scenarios

S1 {1, . . . , S1} - Index set for the initial set of scenarios

S2 400 - Number of scenarios used in the SRD problem

S2 {1, . . . , S2} - Index set for the scenarios used in the SRD problem

s - - Index for scenarios

Table 10: Decision variables of the SRD problem (portfolio case).

Decision variable Unit Description

uj - uj = 1, if one invests in the jth asset

ui = 0, otherwise

xj % Proportion of the funds to be placed in

the jth asset

ys % Negative part of the deviation from the expected return

ys = [rsᵀx− rᵀx]−

19

minu,x,y

zSRD =∑s∈S2

psys (25)

s.t. lb uj ≤ xj j ∈ J (26)

xj ≤ ub uj j ∈ J (27)∑j∈J

uj ≤ U (28)

∑j∈J

xj = 1 (29)

∑j∈J

rj xj = b (30)

∑j∈J

rsj xj + ys ≥ b s ∈ S2 (31)

uj ∈ {0, 1} j ∈ J (32)

x, ys ≥ 0 s ∈ S2. (33)

This problem corresponds to the RP problem (1)–(7) formulated in terms of 400 scenarios selectedby the SRD approach. Notice that the MSAD defined in equation (24) is computed by means ofequations (25), (30), (31) and (33). After solving this SRD problem by using the MILP solver we haveobtained z∗SRD = 1.72% and

x∗ᵀ1 =(

0 0 11.54 0 0 0 0 11.51 0 0

0 0 0 0 10.11 0 0 8.10 10.02 0

0 0 7.64 3.85 0 0 0 4.92 0 0

0 0 0 3.85 0 3.85 0 0 11.15 0

0 0 0 0 0 0 0 3.85 0 9.63)

(in %).

We have broken vector x∗ᵀ1 into rows given its long length. The first row corresponds to the first tenassets.

5.2.2 The CS portfolio selection problem

As in Section 5.1.3, to formulate the CS problem we calculate the conditional scenarios that approx-imate the initial set of S1 = 8, 000 scenarios by using Method 1 with Er = 8 conditional scenariosper random parameter and R = 50 random parameters (R · Er = 400 conditional scenarios). Aftersolving the CS problem we have obtained z∗CS = 0.46% and

x∗ᵀ1 =(

0 11.54 3.85 0 0 5.89 0 0 0 0

0 0 0 0 11.54 0 0 0 11.54 0

10.36 0 0 0 11.54 0 0 0 0 0

0 0 0 0 0 0 11.54 0 11.54 0

0 0 0 0 0 0 10.67 0 0 0)

(in %).

5.2.3 Comparing the CS and SRD solutions

We call CS the predicted MSAD obtained by solving the CS problem, that is CS = z∗CS and we call E-CS the achieved MSAD obtained by solving the E-CS problem. Analogously for SRD and E-SRD. To

20

set the E-CS and E-SRD problems we have used the initial set of scenarios S1, which has cardinalityS1 = 8, 000. In order to compare the CS and SRD solutions, we have collected all these values inTable 11. We observe that the achieved MSAD is similar for the CS and SRD approaches (the SRDone, is slightly better).

Table 11: Comparing the CS and the SRD solutions (portfolio case).

Predicted MSAD Achieved MSAD

CS SRD E-SRD E-CS

MSAD (%) 0.46 1.72 1.86 1.89

Solution time (s) 0.4 181 0.006 0.006

5.2.4 Comparing the CS and SRD performances

In this section we compare the SRD and CS performances by solving the portfolio selection problemfor different number of assets. Table 12 reports the number of: assets J (50% stocks and 50% com-modities), scenarios and rows/columns of the constraint matrix of the corresponding MILP instances.The other parameters are summarized in Tables 8 and 9. The CS and SRD performances are com-pared in Figures 5, 6 and 7 regarding the ‘achieved MSAD’, the ‘solution time’ and the ‘LP gap’,respectively. The solution time accounts for the time to compute the scenarios plus the time tosolve the corresponding MILP problem. Notice that, in these figures, we do not report the resultscorresponding to instances 6 and 7 since they could not be solved by the SRD approach withina CPLEX time limit of 1,000 seconds. The figures used to plot these graphs can be found in theAppendix (Tables 20, 21 and 22, respectively). As in the CFL case, a possible reason for the fasterperformance of the CS approach is its smaller LP gap.

Table 12: Size of the MILP instances. The CS and the SRD prob-lems have the same sizes (portfolio case).

Instance J Scenarios Rows Columns

1 20 160 203 200

2 40 320 403 400

3 60 480 603 600

4 80 640 803 800

5 100 800 1,003 1,000

6 120 960 1,203 1,200

7 140 1,120 1,403 1,400

8 160 1,280 1,603 1,600

9 180 1,440 1,803 1,800

10 200 1,600 2,003 2,000

21

Figure 5: The achieved MSAD values obtained by the CS andSRD approaches are similar. The SRD problem could not besolved for instances 6 and 7 within the CPLEX time limit (seeAppendix/Table 20).

Figure 6: The SRD problem could not be solved for instances 6and 7 within the CPLEX time limit. For the other eight instancesthe CS approach was, on average, 192 times faster than the SRDapproach (see Appendix/Table 21).

22

Figure 7: The number of instances with zero LP gap has beensix and zero for the CS and SRD approaches, respectively (seeAppendix/Table 22).

6 Validation of results

The results reported in Sections 5.1 and 5.2 were obtained by considering an initial randomsample of S1 = 8, 000 scenarios for each instance. Therefore, the results there reported for theCS and SRD approaches were also random. In order to validate these results, we replicate andsolve ten times each instance described in Tables 7 and 12, by drawing a different random sampleof S2 = 8, 000 scenarios each time. In Tables 13, 14, 15 and 16 we report the resulting averageand standard deviation for the achieved expected cost/MSAD and the solution time.

Notice that in Table 16, instances 5, 6, 7, 8 and 10 could not be solved by the SRD approachwithin a CPLEX time limit of 1,000 seconds, for one or more replications of the correspond-ing instance. Thus, in these cases, the average solution time is calculated by using at least onetruncated CPLEX time (1,000 seconds), which is indicated by the symbol >. For example, the5th instance of Table 16 has an average SRD solution time that is > 637, which means that thetrue value, the one that would be computed without a time limit, is greater than 637 seconds. Inthis case, since some replications of the 5th instance have not been solved, we do not report thecorresponding average results in Table 15.

All in all, we observe that the average results in this section are similar to the ones reported inSection 5 but with the additional variability information: The achieved expected cost/MSAD issimilar for the compared approaches and the CS approach performs faster and much faster thanthe SRD approach to solve the CFL and portfolio instances, respectively.

23

Table 13: CFL problem - Achieved expected cost (euros) forthe CS and SRD approaches (average results after solving tentimes each instance).

Instance E-CS E-SRD

Average Std. deviation Average Std. deviation

1 3,759,027 7,916 3,756,276 7,435

2 4,698,720 9,301 4,698,278 9,226

3 5,809,570 11,529 5,806,122 11,079

4 7,099,305 13,863 7,098,615 13,735

5 8,599,737 14,512 8,593,899 15,928

6 10,314,397 19,732 10,313,480 19,390

7 12,284,623 22,122 12,277,073 22,518

8 14,510,064 27,310 14,508,872 26,835

9 17,038,549 30,046 17,029,315 31,549

10 19,867,732 37,508 19,866,028 36,686

Average 10,398,172 19,384 10,394,796 19,438

Table 14: CFL problem - Solution time (seconds) for the CSand SRD approaches (average results).

Instance CS solution time SRD solution time


1 130 55 277 104

2 25 2 243 19

3 166 107 228 10

4 38 3 319 55

5 234 112 263 4

6 66 11 438 156

7 301 142 320 10

8 98 8 430 55

9 286 128 370 20

10 129 12 473 83

Average 147 58 336 52

24

Table 15: Portfolio problem - Achieved MSAD (%) for the CSand SRD approaches (average results).

Instance E-CS E-SRD


1 2.29 0.06 2.23 0.03

2 1.96 0.03 1.91 0.04

3 1.85 0.06 1.81 0.04

4 1.79 0.04 1.76 0.02

5 1.71 0.03 - -

6 1.70 0.04 - -

7 1.67 0.05 - -

8 1.66 0.04 - -

9 1.64 0.04 1.63 0.02

10 1.62 0.39 - -

Average 1.79 0.08 - -

Table 16: Portfolio problem - Solution time (seconds) for theCS and SRD approaches (average results).



1 0.1 0.02 71 1

2 0.2 0.02 141 1

3 0.5 0.02 230 21

4 0.9 0.04 336 67

5 1.4 0.05 > 637 -

6 2.0 0.14 > 770 -

7 2.8 0.09 > 977 -

8 3.8 0.14 > 884 -

9 4.8 0.12 735 64

10 6.1 0.14 > 1301 -

Average 2.26 0.08 > 608 -

7 Conclusions

The conditional scenario (CS) problem was introduced in [2] as an effective approximation to the two-stage stochastic mixed-integer linear programming problem. Although the definition of conditionalscenario given in [2] can be used to approximate a general random vector by means of conditional sce-narios, in practice, it results suitable basically for the multinormal distribution. The main contributionof this paper has been to give an improved definition of conditional scenario which is suitable for anydistribution (continuous or discrete). Furthermore, for the multinormal distribution, the new definitionis equivalent to the previous one. On top of that, the new definition of conditional scenario allowsto approximate a possibly large set of scenarios by a small set of conditional scenarios (see Method

25

1). In this way, any multivariate distribution can be approximated. It is enough to generate a largerandom sample of scenarios chosen from the distribution and approximate them by a set of conditionalscenarios.

In the computational study, we have solved the capacitated facility location problem and the portfo-lio selection problem to compare the performances of the CS approach and the Scenario Reductionbased on probability Distances (SRD) approach (with the same number of scenarios). In this com-parison, we have observed that the solution quality is similar (slightly better for the SRD approach)and that, in general, the CS approach is faster. The CS instances have shown a smaller LP gap thanthe corresponding SRD instances, which could explain the shorter CS solution time. This was alreadyobserved in [2]. Finally, we are aware that, there remain research questions regarding the CS approach,as for example, its performance in risk averse optimization and in stochastic non-linear optimization,among others.

Acknowledgments: We are grateful to the editor and to the reviewer for their constructive and stim-ulating suggestions. We are also grateful to the Spanish Ministry of Economy, Industry and Competi-tiveness under grants MTM2012-36163-C06-06 and MTM2015-63710-P.

8 Appendix: Tables with the numerical results

8.1 CFL problem

Tables 17, 18 and 19 report the figures used to plot Figures 2, 3 and 4, respectively.

Table 17: CFL problem - Achieved expected cost (euros) for theCS and SRD approaches.

Instance E-CS E-SRD

1 3,762,337 3,759,957

2 4,703,134 4,702,820

3 5,813,172 5,811,833

4 7,104,953 7,104,586

5 8,609,502 8,600,769

6 10,322,686 10,322,220

7 12,300,056 12,287,021

8 14,521,840 14,521,385

9 17,054,072 17,044,150

10 19,884,614 19,884,100

Average 10,407,637 10,403,884

26

Table 18: CFL problem - Solution time (seconds) for the CS andSRD approaches. CS1 = ‘Time to compute the CS scenarios’,CS2 = ‘Time to solve the CS problem (CPLEX time)’, Total =CS1 + CS2. Analogously for the SRD times.


CS1 CS2 Total SRD1 SRD2 Total

1 0.2 120 120 171 156 327

2 0.3 23 23 183 50 233

3 0.3 283 283 192 29 221

4 0.3 40 40 203 83 286

5 0.4 215 215 213 49 262

6 0.4 92 92 223 113 336

7 0.5 413 414 231 77 308

8 0.5 102 103 246 189 435

9 0.5 415 416 258 133 391

10 0.5 135 136 268 220 488

Average 0.4 184 184 219 110 329

Table 19: CFL problem - Comparing the LP gaps.

Instance CS problem SRD problem

LP bound z∗CS LP gap LP bound z∗SRD LP gap

(euros) (euros) (%) (euros) (euros) (%)

1 3,758,237 3,758,546 0.0082 3,760,297 3,760,777 0.0128

2 4,700,971 4,701,128 0.0033 4,702,623 4,703,084 0.0098

3 5,809,880 5,810,303 0.0073 5,809,506 5,809,906 0.0069

4 7,102,161 7,102,364 0.0029 7,102,985 7,103,660 0.0095

5 8,598,576 8,599,336 0.0088 8,601,985 8,602,254 0.0031

6 10,319,069 10,319,343 0.0027 10,314,919 10,315,940 0.0099

7 12,284,614 12,285,697 0.0088 12,284,563 12,284,671 0.0009

8 14,517,558 14,517,883 0.0022 14,514,558 14,515,878 0.0091

9 17,041,442 17,041,827 0.0023 17,035,702 17,035,740 0.0002

10 19,879,284 19,879,663 0.0019 19,882,494 19,884,415 0.0097

Average 10,401,179 10,401,609 0.0048 10,400,963 10,401,633 0.0072

27

8.2 Portfolio selection problem

Tables 20, 21 and 22 report the figures used to plot Figures 5, 6 and 7, respectively. Instances 6 and 7could not be solved by the SRD approach within a CPLEX time limit of 1,000 seconds.

Table 20: Portfolio problem - Achieved MSAD (%) for the CSand SRD approaches.

Instance E-CS E-SRD

1 2.22 2.26

2 1.95 1.90

3 1.83 1.79

4 1.79 1.76

5 1.71 1.69

6 1.66 -

7 1.65 -

8 1.67 1.64

9 1.64 1.63

10 1.68 1.65

Average 1.78 -

Table 21: Portfolio problem -Solution time (seconds). CS1 =‘Time to compute the CS scenarios’, CS2 = ‘Time to solve theCS problem (CPLEX time)’ , Total = CS1 + CS2. Analogouslyfor the SRD times.

Instance CS solution time SRD olution time

CS1 CS2 Total SRD1 SRD2 Total

1 0.1 0.1 0.2 73 0.2 73

2 0.2 0.3 0.5 143 0.5 144

3 0.4 0.2 0.6 214 6.5 221

4 0.6 0.3 0.9 283 73.2 356

5 0.9 0.6 1.5 354 1.8 356

6 1.3 0.9 2.2 425 > 1,000 > 1,425

7 1.7 1.5 3.2 491 > 1,000 > 1,491

8 2.1 2.0 4.1 572 15.7 588

9 3.0 2.7 5.7 639 47.0 686

10 3.3 3.2 6.5 702 715.9 1,418

Average 1.4 1.2 2.5 390 > 286.1 > 676

28

Table 22: Portfolio problem - Comparing the LP gaps.

Instance CS problem SRD problem

LP bound z∗CS LP gap LP bound z∗SRD LP gap

(%) (%) (%) (%) (%) (%)

1 0.5008 0.5020 0.24 1.6038 1.6780 4.63

2 0.4572 0.4572 0.00 1.4856 1.5084 1.53

3 0.4611 0.4614 0.07 1.6833 1.7000 0.99

4 0.4528 0.4529 0.02 1.6894 1.7002 0.64

5 0.4284 0.4284 0.00 1.6972 1.6992 0.12

6 0.4121 0.4121 0.00 1.5903 - -

7 0.4325 0.4326 0.02 1.6797 - -

8 0.4340 0.4340 0.00 1.7308 1.7320 0.07

9 0.4466 0.4466 0.00 1.8056 1.8068 0.07

10 0.4513 0.4513 0.00 1.7033 1.7044 0.06

Average 0.4477 0.4479 0.03 1.6669 - -

References

[1] IBM ILOG CPLEX Optimization Studio CPLEX Parameters Reference Version 12 Release 6,2015.

[2] C. Beltran-Royo. Two-stage stochastic mixed-integer linear programming: The conditional sce-nario approach. Omega - The International Journal of Management Science, 70:31–42, 2017.

[3] M. Bieniek. A note on the facility location problem with stochastic demands. Omega, TheInternational Journal of Management Science, 55:53–60, 2015.

[4] J. R. Birge and F. V. Louveaux. Introduction to Stochastic Programming. Springer (New York),Second Edition, 2011.

[5] F. Cesarone, A. Scozzari, and F. Tardella. Linear vs. quadratic portfolio selection models withhard real-world constraints. Computational Management Science, 12(3):345–370, 2015.

[6] M. H. DeGroot and M. J. Schervish. Probability and Statistics. Addison Wesley (Boston, USA),Fourth Edition, 2012.

[7] J. Dupacova, N. Growe-Kuska, and W. Romisch. Scenario reduction in stochastic programming:An approach using probability metrics. Mathematical Programming, 95(3):493–511, 2003.

[8] W. Hardle and L. Simar. Applied Multivariate Statistical Analysis. Springer, Berlin, Third Edi-tion, 2012.

[9] H. Heitsch and W. Romisch. Scenario reduction algorithms in stochastic programming. Compu-tational Optimization and Applications, 24(2-3):187–206, 2003.

[10] T. Homem-de-Mello and G. Bayraksan. Monte carlo sampling-based methods for stochasticoptimization. Surveys in Operations Research and Management Science, 19(1):56–85, 2014.

29

[11] K. Hoyland, M. Kaut, and S. W. Wallace. A heuristic for moment-matching scenario generation.Computational Optimization and Applications, 24(2-3):169–185, 2003.

[12] J. Jacod and P. Protter. Probability Essentials. Springer, 2004.

[13] N. L. Johnson, S. Kotz, and N. Balakrishnan. Continuous Univariate Distributions, Vol. 1. NewYork: John Wiley and Sons, Second Edition, 1994.

[14] A. J. King and S. W. Wallace. Modeling with Stochastic Programming. Springer Science andBusiness Media, New York, 2012.

[15] A. J. Kleywegt, A. Shapiro, and T. Homem-de Mello. The sample average approximation methodfor stochastic discrete optimization. SIAM Journal on Optimization, 12(2):479–502, 2002.

[16] H. Konno. Stochastic Programming: The state of the Art in Honor of George B. Dantzig, chap.Mean-Absolute Deviation Model, pages 239–255. Springer, 2011.

[17] F. V. Louveaux and D. Peeters. A dual-based procedure for stochastic facility location. Opera-tions Research, 40(3):564–573, 1992.

[18] H. Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.

[19] J. M. Morales, S. Pineda, A. J. Conejo, and M. Carrion. Scenario reduction for futures markettrading in electricity markets. IEEE Transactions on Power Systems, 24(2):878–888, 2009.

[20] G. Ch. Pflug and A. Pichler. Multistage Stochastic Optimization. Springer, 2014.

[21] L. V. Snyder. Facility location under uncertainty: a review. IIE Transactions, 38(7):547–564,2006.

[22] M. G. Speranza. A heuristic algorithm for a portfolio optimization model applied to the Milanstock market. Computers and Operations Research, 23(5):433–441, 1996.

[23] Ramaswami Sridharan. The capacitated plant location problem. European Journal of Opera-tional Research, 87(2):203–213, 1995.

[24] M. Woodside-Oriakhi, C. Lucas, and J. E. Beasley. Portfolio rebalancing with an investmenthorizon and transaction costs. Omega, 41(2):406–420, 2013.

30

Two-Stage Stochastic Mixed-Integer Linear Programming ...

Documents

Transcript of Two-Stage Stochastic Mixed-Integer Linear Programming ...