Bilevel Integer Linear Programming - Lehigh University

Bilevel Integer Linear Programming
Ted Ralphs, Scott DeNegre
ISE Department, COR@L Lab, Lehigh University
MOPTA 2009, Lehigh University, 19 August 2009
Thanks: Work supported in part by the National Science Foundation
Surgeon General’s Warning: This talk contains half-baked ideas. Consumption may result in periods of confusion and/or insomnia.

Mixed Integer Bilevel Linear Programming: A Branch-and-Cut Algorithm
Outline
2 Continuous BLPs
Motivation
A standard mathematical program models a decision to be made by a single decision-maker with a single objective at a single point in time.
Many real-world decision problems involve multiple, independent decision-makers (DMs), multiple, possibly conflicting objectives, and/or hierarchical/multi-stage decisions.
Modeling frameworks
  Multiobjective Programming ⇐ multiple objectives, single DM
  Mathematical Programming with Recourse ⇐ multiple stages, single DM
  Multilevel Programming ⇐ multiple stages, multiple DMs
Multilevel programming generalizes standard mathematical programming by modeling hierarchical decision problems, such as Stackelberg games.
In this talk, we focus on bilevel programming models, particularly with discrete variables.
Such models arise in a remarkably wide array of applications.
Applications
Parties in direct conflict
  Zero sum games
  Interdiction problems
Modeling “robustness”: leader represents external phenomena that cannot be controlled.
  Weather
  External market conditions
Controlling optimized systems: follower represents a system that is optimized by its nature.
  Electrical networks
  Biological systems
Basic Framework
For the remainder of the talk, we consider systems in which there are two DMs, a leader or upper-level DM and a follower or lower-level DM.
We assume individual rationality of the two DMs.
This means roughly that the leader has the ability to predict the reaction of the follower to a given course of action.
For simplicity, we also assume that for every action by the leader, the follower has a feasible reaction.
The follower may in fact have more than one equally favorable reaction to a given action by the leader.
These alternatives may not be equally favorable to the leader.
We assume that the leader may choose among the follower’s alternatives.
This assumption is reasonable if the players have a “semi-cooperative” relationship.
Bilevel Linear Programming
x ∈ X ⊆ R^{n1} are the upper-level variables
y ∈ Y ⊆ R^{n2} are the lower-level variables
Bilevel Linear Program
max { c^1 x + d^1 y | x ∈ P_U ∩ X, y ∈ argmin{ d^2 y | y ∈ P_L(x) ∩ Y } }
The upper- and lower-level feasible regions are:
P_U = { x ∈ R^{n1}_+ | … }
P_L(x) = { y ∈ R^{n2}_+ | G^2 y ≥ b^2 − A^2 x }.
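Notation
Ω = { (x, y) | x ∈ P_U, y ∈ P_L(x) }
M^I(x) = argmin{ d^2 y | y ∈ P_L(x) ∩ Y }
F^I = { (x, y) | x ∈ P_U ∩ X, y ∈ M^I(x) }
Ω^I = Ω ∩ (X × Y); F denotes F^I with X = R^{n1} and Y = R^{n2}.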
Special Cases
When X = R^{n1} and Y = R^{n2}, we have a continuous BLP (usually just a BLP).
When X = Z^{p1} × R^{n1−p1} and/or Y = Z^{p2} × R^{n2−p2}, we have a mixed integer BLP.
When does a solution exist?
Existence of Solutions (Dempe, 2001)
In the continuous case, if the constraint region Ω is nonempty and bounded, then there is a solution.
This suffices also in the case that X = Z^{n1}.
If X ⊃ Z^{n1} and Y ⊃ R^{n1}, the problem may not have a solution in general because the feasible set may not be closed.
Further Generalizations
The follower’s variables may appear in the leader’s constraints (see, e.g., Audet et al. (1997)).
The follower’s objective may also be parameterized (see Dempe (2001)).
Example
The following instance of (MIBLP) is from Moore and Bard (1990).
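max_{x ∈ X} x + 10y
subject to y ∈ argmin_{y ∈ Y} { y : −25x + 20y ≤ 30, x + 2y ≤ 10, 2x − y ≤ 15, 2x + 10y ≥ 15 }
[Figure: the feasible regions F and F^I in the (x, y) plane.]
1 For X = R+ and Y = R+, the feasible set is F and the solution is (8, 1) with objective value 18.
2 For X = Z+ and Y = Z+, the feasible set is F^I and the solution is (2, 2) with objective value 22.
3 For X = R+ and Y = Z+, the feasible set is not closed and there is no solution. The supremum of the objective values is 22.5, approached as x → 2.5 from below with y = 2, but it is not attained.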
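The pure integer case (item 2) is small enough to check by brute force. The sketch below is illustrative only: it assumes the upper-level objective is x + 10y and that the follower minimizes y subject to the four constraints of the Moore and Bard instance (including −25x + 20y ≤ 30, which appears when the example is restated later in the talk). These choices reproduce the values 18 and 22 reported above. The function names and enumeration bounds are hypothetical.

```python
def lower_level_feasible(x, y):
    """Lower-level constraints of the Moore and Bard instance."""
    return (-25 * x + 20 * y <= 30 and x + 2 * y <= 10
            and 2 * x - y <= 15 and 2 * x + 10 * y >= 15)


def follower_reaction(x, y_candidates):
    """Smallest feasible y (the follower minimizes y), or None if none exists."""
    feasible = [y for y in y_candidates if lower_level_feasible(x, y)]
    return min(feasible) if feasible else None


def solve_pure_integer(x_max=10, y_max=10):
    """Enumerate integer leader decisions; keep the best bilevel feasible pair."""
    best = None
    for x in range(x_max + 1):
        y = follower_reaction(x, range(y_max + 1))
        if y is None:
            continue  # the follower has no feasible reaction to this x
        value = x + 10 * y  # assumed upper-level objective
        if best is None or value > best[0]:
            best = (value, x, y)
    return best


print(solve_pure_integer())  # (22, 2, 2)
```

Note that x = 0 admits no integer reaction in this instance, so the enumeration simply skips it.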
An Important Special Case
If d1 = −d2, we can view this as amathematical program with recourse.
Two-stage stochastic programs with recourse (discrete distribution) are a special case.
The resulting problem can be solved as a standard mathematical program.
For the case when Y = R^{n2}, we can solve the problem by Benders decomposition.
Note that the value function of the lower-level problem is convex in the upper-level variables, so we can also reformulate as a convex program.
This is a useful way of visualizing the situation.
This is really the only tractable case.
Technical Assumptions
We make the following assumptions in order to ensure the problem is well-posed and has a solution.
Assumptions
1 For every action by the leader, the follower has a rational reaction (P_L(x) ∩ Y ≠ ∅ for all x ∈ P_U ∩ X).
2 The follower is semi-cooperative (the leader may choose among alternative members of M^I(x)).
3 The feasible set F^I is nonempty and compact.
The BLP can now be simply stated as:
Bilevel Linear Program
max_{(x, y) ∈ F^I} c^1 x + d^1 y (MIBLP)
Solving Continuous BLPs
In the continuous case, the lower-level problem can be replaced with its optimality conditions.
The optimality conditions for the lower-level optimization problem (primal feasibility, dual feasibility, and complementary slackness) are
G^2 y ≥ b^2 − A^2 x
u G^2 ≤ d^2
u (b^2 − G^2 y − A^2 x) = 0
(d^2 − u G^2) y = 0
u, y ∈ R_+
Note that this is a special case of a class of non-linear mathematical programs known as mathematical programs with equilibrium constraints (MPECs).
This can be solved in a number of ways, including converting it to a standard integer program.
Note that in this case, the value function of the lower-level problem is piecewise linear, but not necessarily convex.
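As a sketch of what the single-level problem looks like, the lower-level optimality conditions can be appended to the upper-level problem, and each complementarity product can then be linearized with a binary variable. The constant M (a valid upper bound on the slacks and dual variables) and the binaries w and v are assumptions introduced here for illustration; they do not appear in the talk, and the linearization is exact only if M is large enough.

```latex
\begin{align*}
\max_{x,\,y,\,u}\quad & c^1 x + d^1 y \\
\text{s.t.}\quad & x \in P_U \cap X, \\
& G^2 y \ge b^2 - A^2 x, \qquad u G^2 \le d^2, \\
& u\,(b^2 - G^2 y - A^2 x) = 0, \qquad (d^2 - u G^2)\, y = 0, \\
& u,\ y \ge 0 .
\end{align*}
% Big-M linearization of the two complementarity conditions:
\begin{align*}
u_i \le M w_i, \quad & (G^2 y + A^2 x - b^2)_i \le M (1 - w_i), & w_i \in \{0,1\}, \\
y_j \le M v_j, \quad & (d^2 - u G^2)_j \le M (1 - v_j),          & v_j \in \{0,1\}.
\end{align*}
```

Each binary selects which factor of a complementarity product is forced to zero, which is one way to arrive at the standard integer program mentioned above.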
Discrete BLPs
When some of the variables are discrete, the situation is a bit more difficult.
Because the duals that exist for general integer programs are not tractable in general, we cannot use the same approach as we did for the continuous case.
In fact, going from the continuous case to the discrete case in the bilevel setting poses significantly different challenges than for standard MILPs.
Duality for Mixed Integer Linear Programs
Let Γ^m = {F : R^m → R | F is subadditive and nonincreasing, F(0) = 0}. Then the subadditive dual is
…
F̄(a_j) ≤ c_j, j ∈ [p2 + 1, n2]
F ∈ Γ^{m2},
where F̄(d) = lim sup_{δ→0+} F(δd)/δ.
Reformulation with Optimality Conditions
In principle, we can use subadditive duality to obtain optimality conditions for the lower-level problem (reformulation shown here is for the pure integer case).
max_{x, y, F} c^1 x + d^1 y
subject to … Σ_{j=1}^{n2} … ,
… ∈ Z^{n2}_+, F ∈ Γ^{m2}.
This is analogous to the reformulation in the continuous case, but is intractable in general.
Towards a Branch and Bound Algorithm
The seemingly obvious thing to do then is to develop a branch-and-bound algorithm.
Our approach is to try as much as possible to directly generalize concepts from mixed integer linear programming.
Components
Search strategies
Preprocessing methods
Primal heuristics
In the remainder of the talk, we address development of these components.
Contrast with Mixed Integer Linear Programming
In algorithms for solving standard mixed integer linear programs (MILPs), we frequently use the following properties.
Properties
1 If the continuous relaxation has no feasible solution, then neither does the original problem.
2 If the continuous relaxation has a solution, then its objective value is a valid upper bound on that of the original problem.
3 If the solution to the continuous relaxation is integral, then it is optimal for the original problem.
Properties 2 and 3 result from the fact that the set of feasible solutions for the original MILP is contained in the feasible set of the relaxation.
THIS IS NOT THE CASE FOR MIBLP
Example
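max_{x ∈ Z+} x + 10y
subject to y ∈ argmin { y : −25x + 20y ≤ 30, x + 2y ≤ 10, 2x − y ≤ 15, 2x + 10y ≥ 15, y ∈ Z+ }
[Figure: the feasible regions of the example in the (x, y) plane.]
From the figure, we can see that
1 F ⊆ Ω, F^I ⊆ Ω^I, and Ω^I ⊆ Ω
2 F^I ⊄ F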
Properties of MIBLPs
In this example:
Relaxing integrality yields the solution (8, 1), with upper-level objective value 18
Imposing integrality yields the solution (2, 2), with upper-level objective value 22
From this we can make two important observations:
1 The objective value obtained by relaxing integrality is not a valid bound on the solution value of the original problem since we may have
max_{(x,y) ∈ F} c^1 x + d^1 y < max_{(x,y) ∈ F^I} c^1 x + d^1 y.
2 Even when solutions to max_{(x,y) ∈ F} c^1 x + d^1 y are in F^I, they are not necessarily optimal.
Thus, only Property 1 remains valid.
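Observation 1 can be checked numerically on the example. The sketch below is a rough illustration, not a solver: it assumes the upper-level objective x + 10y, uses the closed-form continuous reaction of the follower (the smallest y allowed by the lower bounds), and searches a grid of leader decisions with exact rational arithmetic. The grid step and function names are hypothetical, and the pure integer optimum 22 is taken from the talk rather than recomputed.

```python
from fractions import Fraction

INTEGER_OPTIMUM = 22  # value reported in the talk for the pure integer case


def continuous_reaction(x):
    """Follower's optimal continuous y for a given x, or None if infeasible."""
    # Lower bounds on y: y >= 0, 2x + 10y >= 15, 2x - y <= 15.
    lower = max(Fraction(0), (15 - 2 * x) / 10, 2 * x - 15)
    # Upper bounds on y: -25x + 20y <= 30, x + 2y <= 10.
    upper = min((30 + 25 * x) / 20, (10 - x) / 2)
    return lower if lower <= upper else None


def best_on_grid(step=Fraction(1, 100), x_max=10):
    """Best (value, x, y) over a grid of leader decisions x = 0, step, 2*step, ..."""
    best = None
    x = Fraction(0)
    while x <= x_max:
        y = continuous_reaction(x)
        if y is not None:
            value = x + 10 * y  # assumed upper-level objective
            if best is None or value > best[0]:
                best = (value, x, y)
        x += step
    return best


relaxed = best_on_grid()
print(relaxed[0] < INTEGER_OPTIMUM)  # True: relaxing integrality gives no valid bound
```

On this grid the best continuous solution is (8, 1) with value 18, below the integer optimum, which is exactly the failure of Property 2 described above.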
Bounding Methods
An upper bound can be obtained from the linear relaxation, which drops both integrality and the follower's optimality requirement:
max_{(x,y) ∈ Ω} c^1 x + d^1 y. (LR)
The resulting bound can be used in combination with a standard variable branching scheme to yield an algorithm that solves (MIBLP).
Unfortunately, the bound is too weak to be effective on interesting problems.
Idea!
Strengthen the linear relaxation with inequalities valid for F^I to improve the bound.
Valid Inequalities for MIBLP
Definition
An inequality defined by (π1, π2, π0) is valid for F^I if π1 x + π2 y ≤ π0 for all (x, y) ∈ F^I.
Unless conv(F^I) = Ω, ∃ inequalities that are valid for F^I, but are violated by some members of Ω.
To generate these inequalities, we must exploit information not contained in the linear description of Ω.
For a point (x, y) to be feasible for an MIBLP, it must satisfy three conditions:
Bilevel Feasibility Conditions
1 (x, y) ∈ Ω.
2 (x, y) ∈ X × Y.
3 y ∈ M^I(x).
Cutting Plane Approach
Let (x̂, ŷ) be an optimal solution to the linear relaxation
max_{(x,y) ∈ Ω} c^1 x + d^1 y. (LR)
If (x̂, ŷ) ∉ X × Y, then Condition 2 is violated ⇒ apply MILP cutting plane techniques to separate (x̂, ŷ) from Ω^I.
If (x̂, ŷ) ∈ X × Y ⇒ check whether Condition 3 is satisfied.
Fix x = x̂ and solve the lower-level problem
min_{y ∈ P_L(x̂) ∩ Y} d^2 y. (1)
Bilevel Feasibility Check
Let y∗ be the solution to (1).
(x̂, y∗) is bilevel feasible ⇒ c^1 x̂ + d^1 y∗ is a valid lower bound on the optimal value of the original MIBLP (the problem is a maximization, so any feasible solution bounds the optimum from below).
Either
1 d^2 ŷ = d^2 y∗ ⇒ (x̂, ŷ) is bilevel feasible.
2 d^2 ŷ > d^2 y∗ ⇒ generate a valid inequality violated by (x̂, ŷ).
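The check can be sketched on the Moore and Bard instance, where the follower minimizes y (so d^2 = 1). This is a minimal illustration under stated assumptions: the lower-level problem is solved by enumeration rather than by an MILP solver, the point passed in is assumed to lie in Ω^I already, and the function names and the bound y_max are hypothetical.

```python
def lower_level_feasible(x, y):
    """Lower-level constraints of the Moore and Bard instance."""
    return (-25 * x + 20 * y <= 30 and x + 2 * y <= 10
            and 2 * x - y <= 15 and 2 * x + 10 * y >= 15)


def bilevel_feasibility_check(x_hat, y_hat, d2=1, y_max=10):
    """Fix x = x_hat, solve the lower-level problem by enumeration, and compare
    the follower's objective at y_hat with the optimal value d2 * y_star.

    Returns (is_bilevel_feasible, y_star)."""
    candidates = [y for y in range(y_max + 1) if lower_level_feasible(x_hat, y)]
    if not candidates:
        raise ValueError("the follower has no feasible reaction to x_hat")
    y_star = min(candidates, key=lambda y: d2 * y)
    return d2 * y_hat == d2 * y_star, y_star


# (2, 4) satisfies all constraints, but the follower would rather play y = 2.
print(bilevel_feasibility_check(2, 4))  # (False, 2)
print(bilevel_feasibility_check(2, 2))  # (True, 2)
```

In the first call the check fails, yet it still returns y∗ = 2, so (2, 2) becomes an incumbent with upper-level value 22 under the assumed objective x + 10y.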
Bilevel Feasibility Cut
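A basic feasible solution (x̂, ŷ) ∈ Ω^I to (LR) is the unique solution to
a′_i x + g′_i y = b_i, i ∈ I,
where I is the set of constraints binding at (x̂, ŷ).
This implies that
{ (x, y) ∈ Ω^I | Σ_{i∈I} (a′_i x + g′_i y) = Σ_{i∈I} b_i } = { (x̂, ŷ) }.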
Bilevel Feasibility Cut (cont.)
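The hyperplane Σ_{i∈I} (a′_i x + g′_i y) = Σ_{i∈I} b_i, which bounds the valid inequality Σ_{i∈I} (a′_i x + g′_i y) ≤ Σ_{i∈I} b_i, does not contain any other members of Ω^I
⇒ If X = Z^{n1} and Y = Z^{n2} (i.e., the pure integer case), we can “push” the hyperplane until it meets the next integer point without separating any additional members of Ω^I.
A Valid Inequality
Σ_{i∈I} (a′_i x + g′_i y) ≤ Σ_{i∈I} b_i − 1 for all (x, y) ∈ Ω^I \ {(x̂, ŷ)} (with integral constraint data)
Observation 2 ⇒ inequality is valid for F^I.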
Similar in spirit to Gomory’s procedure for standard ILPs.
A Simple Example
max_x min_y { y | −x + y ≤ 2, −2x − y ≤ −2, 3x − y ≤ 3, y ≤ 3, x, y ∈ Z+ }.
[Figure: the feasible region in the (x, y) plane, with the inequalities −x + 2y ≤ 5 and −x + 2y ≤ 4.]
The bilevel infeasible point (1, 3) is an optimal solution to the LP
max_{x,y} { y | −x + y ≤ 2, −2x − y ≤ −2, 3x − y ≤ 3, y ≤ 3, x, y ∈ R+ }.
The inequality −x + 2y ≤ 4 separates (1, 3) from F^I.
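The cut in this example can be rebuilt and verified by enumeration. The sketch below assumes the construction from the preceding slides: sum the constraints binding at (1, 3), here −x + y ≤ 2 and y ≤ 3, and decrease the right-hand side by one. The variable names are hypothetical.

```python
from itertools import product


def in_omega(x, y):
    """Linear constraints of the example (integrality handled by enumeration)."""
    return -x + y <= 2 and -2 * x - y <= -2 and 3 * x - y <= 3 and y <= 3


# Integer points satisfying the linear constraints.
omega_int = [(x, y) for x, y in product(range(4), range(4)) if in_omega(x, y)]

# Bilevel feasible points: for each x the follower picks the smallest y.
f_int = [(x, min(y for (xx, y) in omega_int if xx == x))
         for x in sorted({x for x, _ in omega_int})]

# Constraints binding at (1, 3), written as (coefficients, right-hand side).
binding = [((-1, 1), 2), ((0, 1), 3)]
pi = tuple(sum(coeffs[k] for coeffs, _ in binding) for k in range(2))
pi0 = sum(rhs for _, rhs in binding) - 1  # push the hyperplane by one unit


def satisfies_cut(point):
    return pi[0] * point[0] + pi[1] * point[1] <= pi0


print(pi, pi0)   # (-1, 2) 4
print(f_int)     # [(0, 2), (1, 0), (2, 3)]
print(satisfies_cut((1, 3)), all(satisfies_cut(p) for p in f_int))  # False True
```

The cut removes only (1, 3): every other member of omega_int, bilevel feasible or not, still satisfies −x + 2y ≤ 4.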
Cutting Plane Algorithm (Pure Integer Case)
This approach can be implemented within a standard MILP solver framework simply by adding the aforementioned cut generation method.
These inequalities are only valid for the pure integer case, however, and we would like to consider the general case.
The cuts generated by this method are also not very deep. In order to solve problems of interesting size, we would like to generate deeper cuts.
In order to derive stronger disjunctions that can be used for branching and/or cutting, we must look more closely at violations of Condition 3.
Idea!
When the current relaxed solution is bilevel infeasible, derive disjunctions from the local structure of the value function.
The Value Function
The value function of an MILP is a function z : R^m → R ∪ {±∞} that returns the optimal value of the program as a function of the right-hand side vector.
MILP Value Function
z(d) = min_{x ∈ S(d)} cx, where S(d) = { x ∈ Z^p_+ × R^{n−p}_+ | Ax ≤ d }.
If we knew the value function, we could replace the lower-level problem with a single non-linear constraint.
Note that the value function is an optimal solution for the subadditive dual for any right-hand side.
Constructing the value function is difficult, but we can take advantage of local structure.
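To make the notion concrete, the sketch below tabulates the value function of a tiny pure integer program with a single equality constraint, z(d) = min{cx | ax = d, x ∈ Z^n_+}, for integer right-hand sides. The instance (costs 4 and 6, coefficients 3 and 5) is hypothetical and chosen only for illustration; the dynamic program assumes positive integer coefficients and is not the method used in the talk.

```python
import math


def value_function(costs, weights, d_max):
    """Tabulate z(d) = min{c x | a x = d, x nonnegative integer} for d = 0..d_max.

    Infeasible right-hand sides get the value +inf."""
    z = [math.inf] * (d_max + 1)
    z[0] = 0
    for d in range(1, d_max + 1):
        for c, a in zip(costs, weights):
            # Use one more unit of this variable on top of an optimal solution for d - a.
            if a <= d and z[d - a] + c < z[d]:
                z[d] = z[d - a] + c
    return z


z = value_function(costs=[4, 6], weights=[3, 5], d_max=20)
print(z[8], z[15], z[7])  # 10 18 inf
```

On this instance the table also exhibits subadditivity, z(d1 + d2) ≤ z(d1) + z(d2), the property that ties the value function to the subadditive dual.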
Value Function Characterization
Blair and Jeroslow (1977) and Blair (1995) show that z is
  piecewise polyhedral, and
  can be expressed as the sum of the value function of a related pure integer program and a linear interpolation term obtained from the coefficients of the continuous variables.
In Guzelsoy and Ralphs (2008), the case of an MILP with a single constraint is considered. Under this special case:
  z is composed of a finite number of linear segments on any closed interval
  the slope of each of these linear segments is given by one of two possible values.
For illustration purposes, we now examine the case of a single equality constraint at the lower level.
Value Function Structure (Single Constraint)
Let C+ = {i ∈ C | a_i > 0} and C− = {i ∈ C | a_i < 0}, and
ηC = min
Exploiting Local Structure
For any d ≤ d∗,
z(d) ≤ max{f(d∗, ζ_C), f(d∗, η_C)} = f(d∗, ζ_C).
Similarly, for any d ≥ d∗,
z(d) ≤ max{f(d∗, ζ_C), f(d∗, η_C)} = f(d∗, η_C).
In Our Context. . .
Let (x̂, ŷ) ∈ X × Y be a solution to (LR), and
z(b^2 − A^2 x) = max{ d^2 y | G^2 y = b^2 − A^2 x, y ∈ Y }.
Suppose (x̂, ŷ) is not bilevel feasible (i.e., d^2 ŷ > z(b^2 − A^2 x̂)). Then:
1 For any x such that b^2 − A^2 x ≤ b^2 − A^2 x̂,
d^2 y ≤ f(b^2 − A^2 x̂, ζ_C).
2 For any x such that b^2 − A^2 x ≥ b^2 − A^2 x̂,
d^2 y ≤ f(b^2 − A^2 x̂, η_C).
Bilevel Feasibility Branching
Bilevel Feasibility Disjunction
b^2 − A^2 x ≤ b^2 − A^2 x̂ AND d^2 y ≤ f(b^2 − A^2 x̂, ζ_C)
OR
b^2 − A^2 x ≥ b^2 − A^2 x̂ AND d^2 y ≤ f(b^2 − A^2 x̂, η_C).
This can immediately be used to develop a stronger branching scheme when solutions (x̂, ŷ) ∈ Ω^I such that ŷ ∉ M^I(x̂) are found.
A Disjunctive Cut Approach
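Consider the two polyhedra that result if we impose this disjunction on the original set of constraints in Ω. This yields the polyhedra:
P1 = { (x, y) | A^1 x ≥ b^1, A^2 x + G^2 y ≥ b^2, A^2 x ≥ A^2 x̂, −ζ_C A^2 x − d^2 y ≥ −(ζ_C A^2 x̂ + d^2 y∗), x, y ≥ 0 }
P2 = { (x, y) | A^1 x ≥ b^1, A^2 x + G^2 y ≥ b^2, −A^2 x ≥ −A^2 x̂, −η_C A^2 x − d^2 y ≥ −(η_C A^2 x̂ + d^2 y∗), x, y ≥ 0 }.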
Constructing the Disjunctive Cut
Let (u_i, v_i, w_i, z_i) be nonnegative multipliers for the constraints in polyhedron P_i. The following inequalities are valid for P1:
u_1 A^1 x + v_1 A^2 x + w_1 A^2 x − z_1 ζ_C A^2 x + v_1 G^2 y − z_1 d^2 y ≥ u_1 b^1 + v_1 b^2 + w_1 A^2 x̂ − z_1 (ζ_C A^2 x̂ + d^2 y∗)
and P2:
u_2 A^1 x + v_2 A^2 x − w_2 A^2 x − z_2 η_C A^2 x + v_2 G^2 y − z_2 d^2 y ≥ u_2 b^1 + v_2 b^2 − w_2 A^2 x̂ − z_2 (η_C A^2 x̂ + d^2 y∗).
Writing these as π^i_1 x + π^i_2 y ≥ π^i_0 for i = 1, 2, it is well-known that we can construct an inequality αx + βy ≥ γ that is valid for conv(P1 ∪ P2) by selecting α, β, and γ such that
α ≥ max{π^1_1, π^2_1}, β ≥ max{π^1_2, π^2_2}, and γ ≤ min{π^1_0, π^2_0}.
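The max/min rule can be illustrated on a toy disjunction. The sketch below is a hypothetical example, not the talk's instance: P1 = {(x, y) ≥ 0 | x ≥ 2} and P2 = {(x, y) ≥ 0 | y ≥ 3}, each with one valid inequality in "≥" form. Validity of the combined cut relies on all variables being nonnegative.

```python
def disjunctive_cut(ineq1, ineq2):
    """Combine 'coeffs . v >= rhs' inequalities valid for P1 and P2 (with v >= 0)
    into one inequality valid for both: componentwise max of the coefficients,
    min of the right-hand sides."""
    (pi1, rhs1), (pi2, rhs2) = ineq1, ineq2
    alpha = tuple(max(a, b) for a, b in zip(pi1, pi2))
    gamma = min(rhs1, rhs2)
    return alpha, gamma


def satisfies(point, cut):
    coeffs, rhs = cut
    return sum(c * v for c, v in zip(coeffs, point)) >= rhs


# x >= 2 is valid for P1, y >= 3 is valid for P2.
cut = disjunctive_cut(((1, 0), 2), ((0, 1), 3))
print(cut)                        # ((1, 1), 2)
print(satisfies((2, 0), cut))     # True  (a point of P1)
print(satisfies((0, 3), cut))     # True  (a point of P2)
print(satisfies((1, 0.5), cut))   # False (in neither polyhedron, cut off)
```

The rule gives one valid cut per choice of multipliers; how deep the cut is depends on that choice.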
Linear Description of the Set of Valid Inequalities
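Thus, the inequality αx + βy ≥ γ is valid for conv(P1 ∪ P2) if
α − (u_1 A^1 + v_1 A^2 + w_1 A^2 − z_1 ζ_C A^2) ≥ 0
α − (u_2 A^1 + v_2 A^2 − w_2 A^2 − z_2 η_C A^2) ≥ 0
β − (v_1 G^2 − z_1 d^2) ≥ 0
β − (v_2 G^2 − z_2 d^2) ≥ 0
γ − (u_1 b^1 + v_1 b^2 + w_1 A^2 x̂ − z_1 (ζ_C A^2 x̂ + d^2 y∗)) ≤ 0
γ − (u_2 b^1 + v_2 b^2 − w_2 A^2 x̂ − z_2 (η_C A^2 x̂ + d^2 y∗)) ≤ 0
u_i, v_i, w_i, z_i ≥ 0, i = 1, 2.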
Cut Generating LP
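To find the deepest cut, we can solve the cut generation LP:
min α x̂ + β ŷ − γ
subject to
α − (u_1 A^1 + v_1 A^2 + w_1 A^2 − z_1 ζ_C A^2) ≥ 0
α − (u_2 A^1 + v_2 A^2 − w_2 A^2 − z_2 η_C A^2) ≥ 0
β − (v_1 G^2 − z_1 d^2) ≥ 0
β − (v_2 G^2 − z_2 d^2) ≥ 0
γ − (u_1 b^1 + v_1 b^2 + w_1 A^2 x̂ − z_1 (ζ_C A^2 x̂ + d^2 y∗)) ≤ 0
γ − (u_2 b^1 + v_2 b^2 − w_2 A^2 x̂ − z_2 (η_C A^2 x̂ + d^2 y∗)) ≤ 0
u_i, v_i, w_i, z_i ≥ 0, i = 1, 2,
together with a normalization constraint to keep the LP bounded. This LP is similar to that used in constructing lift-and-project cuts.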
Numerical Example
MIBLP Example
+ y1 + 3y2 + 2y3 + 4y4 + 2y6
subject to 4x1 + 2x2 − 4x3 + 3x4 + 6x5 + x6 = 24
x1, x2, x3 ∈ Z+, x4, x5, x6 ∈ R+
y ∈ argmin{ 2y1 + 2y3 + 8y4 + 4y5 + 3y6 :
2x1 − x2 + 4x3 − 4x4 + 3x5 + x6 + 4y1 − 6y2 + 6y3 + 4y4 − 4y5 + (4/3)y6 = 16,
Numerical Example (cont.)
The initial LP lower bound is 3, with a solution of
x4 = 8, y1 = 12.
3 4x1 + 7
3 .
This yields a new lower bound of 3.224 and a solution of
x4 = 0.8163, x5 = 3.5918, y1 = 2.1225.
The optimal value is 3.25 and an optimal solution is
x5 = 4, y1 = 1.
The General Case
The Jeroslow formula tells us what the local structure looks like.
In fact, we can derive local structure from the branch-and-bound tree constructed when we do the bilevel feasibility check.
There is an obvious combinatorial explosion.
Jeroslow Formula for General MILP
Let the set ℰ consist of the index sets of dual feasible bases of the linear program
min{ (1/M) c_C x_C : (1/M) A_C x_C = b, x ≥ 0 }
where M ∈ Z+ is such that for any E ∈ ℰ, M A_E^{−1} a_j ∈ Z^m for all j ∈ I.
Theorem 1 (Jeroslow Formula) There is a g ∈ G^m such that
z(d) = min_{E ∈ ℰ} g(⌊d⌋_E) + v_E(d − ⌊d⌋_E) ∀d ∈ R^m with S(d) ≠ ∅,
where for E ∈ ℰ, ⌊d⌋_E = A_E ⌊A_E^{−1} d⌋ and v_E is the corresponding basic feasible solution.
Implementation
The Mixed Integer Bilevel Solver (MibS) implements the branch and bound framework described here using software available from the Computational Infrastructure for Operations Research (COIN-OR) repository.
COIN-OR Components Used
The COIN High Performance Parallel Search (CHiPPS) framework to perform the branch and bound.
The COIN Branch and Cut (CBC) framework for solving the MILPs.
The COIN LP Solver (CLP) framework for solving the LPs arising in the branch and cut.
The Cut Generation Library (CGL) for generating cutting planes within CBC.
The Open Solver Interface (OSI) for interfacing with CBC and CLP.
Conclusions and Future Work
Preliminary testing to date has revealed that these problems can be extremely difficult to solve in practice.
What we have implemented so far has only scratched the surface.
Currently, we are focusing on special cases where we can get traction.
  Interdiction problems
  Stochastic integer programs
Much work remains to be done.
  Preprocessing procedures
  Heuristics
  More sophisticated branching rules
References I
Audet, C., P. Hansen, B. Jaumard, and G. Savard 1997. Links between linear bilevel and mixed 0-1 programming problems. Journal of Optimization Theory and Applications 93(2), 273–300.
Blair, C. 1995. A closed-form representation of mixed-integer program value functions. Mathematical Programming 71, 127–136.
Blair, C. and R. Jeroslow 1977. The value function of a mixed integer program: I. Discrete Mathematics 19, 121–138.
Dempe, S. 2001. Discrete bilevel optimization problems. Technical Report D-04109, Institut für Wirtschaftsinformatik, Universität Leipzig, Leipzig, Germany.
Guzelsoy, M. and T. Ralphs 2008. The value function of a mixed-integer linear program with a single constraint. Technical report, Lehigh University.
Moore, J. and J. Bard 1990. The mixed integer linear bilevel programming problem. Operations Research 38(5), 911–921.