STRONG VALID INEQUALITIES FOR MIXED-INTEGER NONLINEAR PROGRAMS VIA DISJUNCTIVE PROGRAMMING AND LIFTING
By
KWANGHUN CHUNG
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2010
© 2010 Kwanghun Chung
To my father, Youngkwan Chung, and my mother, Haeja Hwangbo
ACKNOWLEDGEMENTS
It is my great pleasure to thank all the people who helped me successfully complete
this thesis. First, I would like to deeply thank my advisor, Dr. Jean-Philippe P. Richard,
for advising me with enthusiasm and patience during my Ph.D. study. He always inspired
and encouraged me to pursue my research whenever I was frustrated with difficulties.
While I worked with him, I learned a lot about Operations Research from his knowledge
and about academia from his experience. I regard him as a role model whom I wish to
follow if I work in academia.
I would like to thank my co-advisor, Dr. Mohit Tawarmalani, for his guidance and for the
discussions that made it possible for me to write this thesis. His critical and rigorous way
of thinking motivated me to overcome various obstacles. I am also thankful to Dr. Panos
Pardalos, Dr. J. Cole Smith, and Dr. William Hager, for serving on my committee and
giving me helpful comments to improve the quality of this thesis.
My life as a doctoral student for the last few years has been happy and pleasant
because of many of my friends at Purdue University and the University of Florida. In
particular, I appreciate Seokcheon, Byungcheol, Keumseok, Daiki, Kyungdoh, Sangbok,
and all members of Purdue Korean Industrial Engineers for their help and support, which
eased the pains of research. I also appreciate Chanmin and Youngwoong as
well as my fellow office mates for many kind favors during my stay in Florida.
Finally, I would like to show my gratitude to all of my family, Youngkwan, Haeja,
Hyunjoo, and Jaehun, whose sincere love and constant support are the source of my life.
TABLE OF CONTENTS
page
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
CHAPTER
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1 Mixed-Integer Nonlinear Program (MINLP) . . . . . . . . . . . . . . . . 12
1.1.1 Models and Applications . . . . . . . . . . . . . . . . . . . . . 12
1.1.2 Solution Methodologies to Global Optimization . . . . . . . . . . 14
1.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2.1 Well-Solved Optimization Problems . . . . . . . . . . . . . . . . 15
1.2.2 Relaxations and Convexifications . . . . . . . . . . . . . . . . . 20
1.3 Branch-and-Cut in MINLP . . . . . . . . . . . . . . . . . . . . . . . . 23
1.3.1 Bounding Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.2 Branching Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.3 Cutting Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.3.4 Domain Reduction . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4 Outline of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . 28
2 CONVEX RELAXATIONS IN MILP and MINLP . . . . . . . . . . . . . . . . . 30
2.1 Convexification Methods in MINLP . . . . . . . . . . . . . . . . . . . 30
2.1.1 Convex Envelopes and Convex Extensions . . . . . . . . . . . . . 31
2.1.2 Reformulation and Relaxation . . . . . . . . . . . . . . . . . . . 33
2.2 Cutting Plane Techniques for Mixed-Integer Linear Program (MILP) . . . 37
2.2.1 Disjunctive Programming . . . . . . . . . . . . . . . . . . . . . 40
2.2.2 Lifting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.2.2.1 Sequential lifting . . . . . . . . . . . . . . . . . . . . 47
2.2.2.2 Sequence-independent lifting . . . . . . . . . . . . . . . 50
3 MOTIVATION AND RESEARCH STATEMENTS . . . . . . . . . . . . . . . . 51
3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Problem Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.2.1 Strong Valid Inequalities for Orthogonal Disjunctions and Bilinear Covering Sets . . . 54
3.2.2 Lifted Inequalities for 0-1 Mixed-Integer Bilinear Covering Sets with Bounded Variables . . . 55
4 STRONG VALID INEQUALITIES FOR ORTHOGONAL DISJUNCTIONS AND BILINEAR COVERING SETS . . . 56
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Convexification of Orthogonal Disjunctive Sets . . . . . . . . . . . . 58
4.3 Convex Extension Property . . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5 LIFTED INEQUALITIES FOR 0-1 MIXED-INTEGER BILINEAR COVERING SETS . . . 106
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.2 Basic Polyhedral Results . . . . . . . . . . . . . . . . . . . . . . . . 109
5.3 Lifted Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.3.1 Sequence-Independent Lifting for Bilinear Covering Sets . . . . . 121
5.3.2 Lifted Inequalities by Sequence-Independent Lifting . . . . . . . 123
5.3.2.1 Lifted bilinear cover inequalities . . . . . . . . . . . . 127
5.3.2.2 Lifted reverse bilinear cover inequalities . . . . . . . . 137
5.3.3 Inequalities through Approximate Lifting . . . . . . . . . . . . . 144
5.4 New Facet-Defining Inequalities for a Single-node Flow Model . . . . . 154
5.5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6 A COMPUTATIONAL STUDY OF LIFTED INEQUALITIES FOR 0-1 BILINEAR COVERING SETS . . . 164
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.2 Generalization to Bilinear Constraints with Linear Terms . . . . . . . 164
6.2.1 Generalized Lifted Bilinear Cover Inequalities . . . . . . . . . . 172
6.2.2 Generalized Lifted Reverse Bilinear Cover Inequalities . . . . . . 179
6.3 Preliminary Computational Study . . . . . . . . . . . . . . . . . . . . 182
6.3.1 Computational Environments . . . . . . . . . . . . . . . . . . . . 183
6.3.2 Testing Instances . . . . . . . . . . . . . . . . . . . . . . . . 183
6.3.3 Separation Procedures . . . . . . . . . . . . . . . . . . . . . . 185
6.3.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . 189
6.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7 CONCLUSIONS AND FUTURE RESEARCH . . . . . . . . . . . . . . . . . . . 194
7.1 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . 194
7.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
APPENDIX
A LINEAR DESCRIPTION OF THE CONVEX HULL OF A BILINEAR SET . . 197
B LINEAR DESCRIPTION OF THE CONVEX HULL OF A FLOW SET . . . . 200
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
LIST OF TABLES
Table page
6-1 Parameters of the random instances for three test sets . . . . . . . . . . . . . . 184
6-2 Characteristics of the three test sets . . . . . . . . . . . . . . . . . . . . . . . . . 185
6-3 Objective values to the test instances . . . . . . . . . . . . . . . . . . . . . . . . 186
6-4 Performance of lifted cuts on small size instances . . . . . . . . . . . . . . . . . 190
6-5 Performance of lifted cuts on medium size instances . . . . . . . . . . . . . . . . 191
6-6 Performance of lifted cuts on large size instances . . . . . . . . . . . . . . . . . . 192
LIST OF FIGURES
Figure page
1-1 Branch-and-Cut framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2-1 Cutting plane algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3-1 Geometric illustration of S, conv(S), S1 and S2 . . . . . . . . . . . . . . . . . . 52
4-1 Illustration of Theorem 4.1 with (a) J1 ≠ ∅, J2 ≠ ∅ (b) J2 = ∅ (c) J1 = J2 = ∅ . 70
4-2 Facet-defining inequalities for conv(B_i^I) . . . . . . . . . . . . . . . . . . 93
5-1 Lifting function PC(w) of (5–44) . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5-2 Deriving lifting coefficients for Example 5.3 . . . . . . . . . . . . . . . . . . . . 134
5-3 Deriving lifting coefficients for Example 5.5 . . . . . . . . . . . . . . . . . . . . 144
5-4 A valid subadditive approximation Ψ(w) of Φ(w) for Example 5.6. . . . . . . . . 151
6-1 Lifting function LC(w) of (6–19) . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6-2 Deriving lifting coefficients for Example 6.1 . . . . . . . . . . . . . . . . . . . . 176
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
STRONG VALID INEQUALITIES FOR MIXED-INTEGER NONLINEAR PROGRAMS VIA DISJUNCTIVE PROGRAMMING AND LIFTING
By
Kwanghun Chung
August 2010
Chair: Jean-Philippe P. Richard
Major: Industrial and Systems Engineering
Mixed-Integer Nonlinear Programs (MINLP) are optimization problems that have
found applications in virtually all sectors of the economy. Although these models can be
used to design and improve a large array of practical systems, they are typically difficult
to solve to global optimality. In this thesis, we introduce new tools for the solution of such
problems. In particular, we develop new procedures to construct convex relaxations of
certain MINLP problems. These relaxations are stronger than those currently known for
these problems and therefore provide improvements in the solution of MINLPs through
branch-and-bound techniques. There are three main components to our contributions.
First, we derive a closed-form characterization of the convex hull of a generic
nonlinear set, when the convex hull of this set is completely determined by orthogonal
restrictions of the original set. Although the tools used in our derivation include
disjunctive programming and convex extensions, our characterization does not introduce
additional variables. We develop and apply a toolbox of results to check the technical
assumptions under which this convexification tool can be employed. We demonstrate
its applicability in integer programming by providing an alternate derivation of the
split cut for mixed-integer polyhedral sets and by finding the convex hull of various
mixed/pure-integer bilinear sets. We then develop a key result that extends the utility
of the convexification tool to relaxing nonconvex inequalities, which are not naturally
disjunctive, by providing sufficient conditions for establishing the convex extension
property over the non-negative orthant. We illustrate the utility of this result by deriving
the convex hull of a continuous bilinear covering set over the non-negative orthant.
Second, we study the 0−1 mixed-integer bilinear covering set. We show that the
convex hull of this set is polyhedral and we provide characterizations for its trivial
facets. We also obtain a complete convex hull description when it contains only two
pairs of variables. We then derive three families of facet-defining inequalities via
sequence-independent lifting techniques. Two of these families have an exponential
number of members. Next, we relate the polyhedral structure of the 0−1 mixed-integer
bilinear covering set to that of certain single-node flow sets. As a result, we obtain
new facet-defining inequalities for flow sets that generalize well-known lifted flow cover
inequalities from the integer programming literature.
Third, we evaluate the strength of the lifted inequalities we derive for 0−1 mixed-integer
bilinear covering sets inside of a branch-and-cut framework. To this end, we first generalize
our theoretical results to bilinear covering sets that have additional linear terms. We
then present separation techniques for lifted inequalities and report computational results
obtained when using these procedures on several families of randomly generated problems.
CHAPTER 1
INTRODUCTION
In this chapter, we give a brief overview of Mixed-Integer Nonlinear Programming
models and their applications. We then describe general methodologies to solve them.
After discussing basic concepts in mathematical programming, we describe in more detail
the branch-and-bound approach to MINLP. We conclude this chapter by describing the
overall structure of this thesis.
1.1 Mixed-Integer Nonlinear Program (MINLP)
1.1.1 Models and Applications
A Mixed-Integer Nonlinear Program (MINLP) is an optimization problem of the form:
min f(x)
(P ) s.t. gi(x) ≤ 0 ∀i ∈ M,
xj ∈ Z+ ∀j ∈ I ⊆ N := {1, . . . , n},
xj ∈ R+ ∀j ∈ N \ I,
where
1. f : Rn → R,
2. gi : Rn → R, ∀i ∈ M.
Throughout the thesis, we restrict our attention to problems (P ), where the functions f
and gi are continuous and factorable.
Definition 1.1 (Factorable Function [89]). A function is factorable if it is defined by
a finite recursive composition of binary sums, binary products, and a given collection of
univariate intrinsic functions.
For example, the function f(x) = x1e^{x2} + cos(x1 + x2)x3 is factorable since it can be
expressed as

f(x) = S( P(x1, h1(x2)), P(h2(S(x1, x2)), x3) ),
where
• S(x, y) = x+ y represents a binary sum,
• P(x, y) = x ∗ y represents a binary product,
• h1(x) = ex and h2(x) = cos(x) are intrinsic univariate functions.
The class of factorable functions contains most functions encountered in practical
applications; see McCormick [86].
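The recursive structure of Definition 1.1 is easy to mirror in code. The following sketch is ours, not from the text: it evaluates the example function above through its factorable decomposition, with `S`, `P`, `h1`, and `h2` named after the notation of the example.

```python
import math

def S(x, y):          # binary sum
    return x + y

def P(x, y):          # binary product
    return x * y

def f(x1, x2, x3):
    # f(x) = x1*e^{x2} + cos(x1 + x2)*x3, written as the recursive composition
    # f(x) = S(P(x1, h1(x2)), P(h2(S(x1, x2)), x3))
    h1 = math.exp     # intrinsic univariate function e^x
    h2 = math.cos     # intrinsic univariate function cos(x)
    return S(P(x1, h1(x2)), P(h2(S(x1, x2)), x3))

# the composition agrees with the closed form
assert abs(f(1.0, 2.0, 3.0) - (1.0 * math.exp(2.0) + math.cos(3.0) * 3.0)) < 1e-12
```

Global solvers exploit exactly this decomposition: each intermediate node (sum, product, or univariate function) can be relaxed separately.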
We refer to x ∈ Rn as the decision variables of (P ). We refer to f(x) as the objective
function of (P ) and to gi(x) ≤ 0 for i ∈ M as the constraints of (P ). If there are no
constraints (i.e., M = ∅), we say that problem (P ) is unconstrained.
We define
S := { x ∈ Z+^|I| × R+^(n−|I|) | gi(x) ≤ 0 ∀i ∈ M }
to be the feasible region of (P ). A vector x ∈ S is said to be a feasible solution of (P ).
Further, problem (P ) is said to be feasible if S ≠ ∅, and infeasible if S = ∅.
The goal of problem (P ) is to find a vector x∗ ∈ S, called a (globally) optimal solution
of (P ), whose objective value is minimal over the set S, i.e.,

f(x∗) ≤ f(x) ∀x ∈ S.
We refer to f(x∗) as the optimal value of (P ). A vector x̄ ∈ S is said to be locally optimal
if there exists an ε > 0 such that

f(x̄) ≤ f(x) ∀x ∈ S ∩ { x ∈ Rn | ‖x − x̄‖ ≤ ε }.
In this thesis, we will use the terms optimal and globally optimal interchangeably.
When f is linear (i.e., f(x) = cTx) and all of the functions gi are affine (i.e.,
gi(x) = (ai)Tx + bi), (P ) is said to be a Mixed-Integer Linear Program (MILP). When
I = ∅, (P ) is referred to as a Linear Program (LP). When I = N , (P ) is said to be a Pure
Integer Program (IP). Finally, when all variables xj for j ∈ I are restricted to be binary,
(P ) is commonly known as a 0−1 Mixed-Integer Linear Program or Binary Mixed-Integer
Linear Program (BMILP). While LP problems can be solved in polynomial-time, solving
general MILPs is NP-hard; see Cook [35]. Note however that when the number of variables
is fixed, Lenstra [76] describes a polynomial-time algorithm for IP.
In MINLP models, continuous variables are typically used to represent physical
quantities while binary variables are used to describe managerial decisions. Functions f(x)
and gi(x) are used to capture the (possibly nonlinear) physical relations between these
variables. As a result, MINLP problems arise in a wide variety of practical applications
and are used to model decision problems in business and engineering. Successful
applications of MINLP can be found in a number of fields such as telecommunication
networks [25], supply chain design and management [135], portfolio optimization [39],
chemical processes [27, 53, 70], protein folding [93], molecular biology [77], quantum
chemistry [78], and unit commitment problems [142].
1.1.2 Solution Methodologies to Global Optimization
Global optimization of MINLPs is typically difficult when (1) there are integrality
restrictions on a subset of variables (i.e., I 6= ∅) and (2) there are nonconvex functions (see
definition in Section 1.2.1) either in the objective or in the constraints. General solution
methodologies to obtain globally optimal solutions for MINLPs can be classified as either
deterministic or stochastic; see Neumaier [94] for a survey of existing solution methods.
Deterministic algorithms include branch-and-bound [51, 85, 103], outer-approximation
[48, 65, 68], cutting planes [124, 126], and decomposition [125, 129]. Stochastic approaches
include random search [140], genetic algorithms [134], and clustering algorithms [71].
For detailed presentations of these approaches, we refer the interested reader to the
books of Horst and Pardalos [67] and Horst and Tuy [69]. In this thesis, we will focus on
branch-and-bound approaches for MINLP.
1.2 Preliminaries
In this section, we briefly review fundamental results in mathematical programming
that are used throughout this thesis.
1.2.1 Well-Solved Optimization Problems
Since MINLP is known to be NP-hard, it is unlikely that we will ever be able to
design an algorithm that solves all instances of (P ) to global optimality in polynomial
time. However, there are families of problems (P ) that can be solved efficiently. We
introduce two such families next. To this end, we first recall the notions of convex set
and convex function.
Definition 1.2 (Convex Combination). Let x1, . . . , xp be vectors in Rn. We refer to any
point x obtained as ∑_{j=1}^{p} λjxj, where λj ∈ R+ for j = 1, . . . , p and ∑_{j=1}^{p} λj = 1, as a
convex combination of x1, . . . , xp.
Definition 1.3 (Convex Set). A set S ⊆ Rn is said to be convex if, ∀x1, x2 ∈ S, all convex
combinations of x1 and x2 belong to S, i.e.,
λx1 + (1− λ)x2 ∈ S, ∀λ ∈ [0, 1].
Definition 1.4 (Convex Function). Let S be a nonempty convex subset of Rn. A function
f : S → R is said to be convex if, ∀x1, x2 ∈ S,

f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2), ∀λ ∈ [0, 1].
Convex sets and convex functions can be related in different ways; see Section 3.1 of
Bazaraa et al. [24] for a textbook discussion. We present one such relation next.
Definition 1.5 (Level Set). Given a function f : Rn → R and a scalar α ∈ R, we refer to
the set

Sα = { x ∈ Rn | f(x) ≤ α }

as the α level set of f .
Proposition 1.1. Let f : Rn → R be a convex function. Then, the α level set of f is a
convex set for each value of α ∈ R. Indeed, if x1, x2 ∈ Sα, then convexity of f implies
f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) ≤ α for all λ ∈ [0, 1], so λx1 + (1 − λ)x2 ∈ Sα.
We now focus on a subfamily of problems (P ) of the form
min f(x)
(CP ) s.t. gi(x) ≤ 0 ∀i ∈ M,
xj ∈ R+ ∀j ∈ N,
where the functions f(x) and gi(x) for i ∈ M are convex. Proposition 1.1 implies that the
feasible region of (CP ) is a convex set since the intersection of convex sets is convex. We
refer to such problems as convex programs. While (CP ) typically cannot be solved with an
analytical formula, it has many good properties that make finding a globally optimal
solution easier than for general problems (P ). In particular, it can be shown that
every locally optimal solution of (CP ) is also globally optimal. There are various methods
to solve (CP ) to global optimality; see Boyd and Vandenberghe [29] and Nesterov and
Nemirovskii [92]. We briefly comment on two of these methods.
The ellipsoid method was formally developed by Yudin and Nemirovski [136] although
similar ideas had been introduced earlier by Shor [112]. The ellipsoid method generates
a sequence of ellipsoids of decreasing volume, each containing an optimal solution of
the problem. At each iteration, the algorithm splits the current ellipsoid in half and uses
problem information to determine which half of the ellipsoid contains an optimal solution.
A new ellipsoid (of smaller volume) is then built around the selected half-ellipsoid and the
process is iterated.
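As a deliberately tiny illustration of these updates, the sketch below (our own, not from the text) runs the standard central-cut ellipsoid iteration in R², using exact gradients as the cut oracle on a hypothetical quadratic test function; `ellipsoid_min` and its arguments are our names.

```python
import math

def ellipsoid_min(f, grad, c, A, iters=200):
    """Central-cut ellipsoid method in R^2 (illustrative sketch).
    E = {x : (x - c)^T A^{-1} (x - c) <= 1} always contains a minimizer;
    the gradient cut g^T (x - c) <= 0 selects the half-ellipsoid to keep."""
    n = 2
    best = c[:]
    for _ in range(iters):
        if f(c) < f(best):
            best = c[:]
        g = grad(c)
        Ag = [A[0][0]*g[0] + A[0][1]*g[1], A[1][0]*g[0] + A[1][1]*g[1]]
        gAg = g[0]*Ag[0] + g[1]*Ag[1]
        if gAg <= 1e-16:                       # gradient numerically zero: done
            break
        b = [Ag[0]/math.sqrt(gAg), Ag[1]/math.sqrt(gAg)]
        # shift the center into the kept half-ellipsoid ...
        c = [c[0] - b[0]/(n + 1), c[1] - b[1]/(n + 1)]
        # ... and shrink the shape matrix around the new center
        coef = n*n/(n*n - 1.0)
        A = [[coef*(A[i][j] - 2.0/(n + 1)*b[i]*b[j]) for j in range(n)]
             for i in range(n)]
    return best

# minimize f(x) = (x1 - 1)^2 + (x2 + 2)^2, starting from a ball of radius 10
x = ellipsoid_min(lambda v: (v[0] - 1)**2 + (v[1] + 2)**2,
                  lambda v: [2*(v[0] - 1), 2*(v[1] + 2)],
                  [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]])
```

The volume of the ellipsoid shrinks by a constant factor per iteration, which is the source of the method's polynomial-time guarantees.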
Interior point algorithms form another family of solution approaches for convex
programs. The idea originates from the work of Fiacco and McCormick [52] in the
1960s. In this approach, barrier functions are added to the objective to account for the
feasible region of the problem (CP ). Although progress on these techniques
remained limited through the 1980’s, the discovery of a polynomial-time algorithm for
linear programs by Karmarkar [73] led to a revival of interest in barrier methods. In
particular, Nesterov and Nemirovskii [92] later showed that polynomial time convergence
can be achieved for any convex program that can be equipped with an easily computable
self-concordant barrier function. Simple self-concordant barriers are known for many
convex programs; see Nesterov and Nemirovskii [92]. As a result, convex programs are
typically thought to be simple optimization problems to solve.
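The barrier idea can be seen on a one-dimensional toy problem of our own construction: minimize x subject to 1 ≤ x ≤ 3 by minimizing t·x − ln(x − 1) − ln(3 − x) for increasing barrier weights t. Since the derivative of this barrier objective is increasing in x, each subproblem can be solved by bisection; `barrier_minimizer` is a hypothetical name.

```python
def barrier_minimizer(t, lo=1.0, hi=3.0, tol=1e-10):
    """Minimize t*x - ln(x - lo) - ln(hi - x) over (lo, hi) by bisecting on
    its derivative t - 1/(x - lo) + 1/(hi - x), which is increasing in x."""
    def dphi(x):
        return t - 1.0/(x - lo) + 1.0/(hi - x)
    a, b = lo + 1e-12, hi - 1e-12   # dphi(a) < 0 < dphi(b)
    while b - a > tol:
        m = 0.5*(a + b)
        if dphi(m) < 0:
            a = m
        else:
            b = m
    return 0.5*(a + b)

# the "central path": as t grows, the barrier minimizer approaches the true
# solution x* = 1 of  min x  s.t.  1 <= x <= 3
path = [barrier_minimizer(t) for t in (1.0, 10.0, 100.0, 1000.0)]
```

Practical interior-point methods follow this path far more cleverly (Newton steps, careful updates of t), but the mechanism — a barrier that fades as the weight on the true objective grows — is the same.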
A particularly simple type of convex program is the linear programming problem, a
variant of (P ) of the form:
min cTx
(LP ) s.t. Ax ≤ b
x ∈ Rn+.
Before we discuss algorithms to solve LPs, we introduce some basic concepts of polyhedral
theory that we will use later in this thesis.
Definition 1.6 (Polyhedron and Polytope). A polyhedron Q ⊆ Rn is a set of points in Rn
that can be described as the intersection of a finite number of half-spaces, i.e.,
Q = { x ∈ Rn | Ax ≤ b }, (1–1)

where A ∈ Rm×n and b ∈ Rm. A polyhedron is said to be bounded if there exists M ∈ R+
such that sup{ ‖x‖ | x ∈ Q } < M . We typically refer to a bounded polyhedron as a
polytope.
It is clear that the feasible region of an LP is a polyhedron. When studying MILPs,
we will typically consider rational polyhedra, i.e., polyhedra that can be defined with
A ∈ Qm×n and b ∈ Qm. When studying a polyhedron, some feasible solutions are of
particular interest.
Definition 1.7 (Extreme Point). A point x in a polyhedron Q is said to be an extreme
point of Q if whenever x = (1/2)x1 + (1/2)x2 for some x1, x2 ∈ Q, then x = x1 = x2.
Definition 1.8 (Extreme Ray). Given the nonempty polyhedron Q defined in (1–1), we
define the recession cone of Q as

Q0 = { r ∈ Rn | Ar ≤ 0 }.

A non-zero vector r in Q0 is said to be a ray of Q. Further, a ray r is said to be an
extreme ray of Q if whenever r = (1/2)r1 + (1/2)r2 for some r1, r2 ∈ Q0, then r = r1 = r2.
Polyhedra can be represented using extreme points and extreme rays as presented in
Theorem 1.1.
Theorem 1.1 (Minkowski’s Theorem [88]). If Q is a nonempty polyhedron as defined in
(1–1) and rank(A) = n, then
Q = { x ∈ Rn | x = ∑_{k∈K} λk x^k + ∑_{j∈J} μj r^j, ∑_{k∈K} λk = 1, λk ≥ 0 ∀k ∈ K, μj ≥ 0 ∀j ∈ J },
where {xk}k∈K is the set of extreme points of Q and {rj}j∈J is the set of extreme rays of
Q.
Using Theorem 1.1, we can easily verify the following result.
Theorem 1.2. If (LP ) has an optimal solution, then at least one of the extreme points of
Q must be an optimal solution.
Using the fact that an optimal solution to (LP ) can be found among the extreme
points of its feasible region, Dantzig [44] developed the first algorithm to solve
general LPs in 1947: the simplex algorithm. We mention that Kantorovich had proposed earlier in
1939 a method to solve a restricted form of LPs; see [72] for a translation.
The simplex algorithm relies on the observation that every extreme point of LPs of
the form
(LP ′)  min{ cTx | Ax = b, x ∈ Rn+ }, (1–2)
can be computed as
xB = A_B^{-1} b, xN̄ = 0, (1–3)

where B ⊆ N , N̄ = N \ B, AB is an invertible submatrix of A formed by the columns of
A corresponding to B, and A_B^{-1} b ≥ 0. In (1–3), the variables xj for j ∈ B are called basic
while the variables xj for j ∈ N̄ are called nonbasic.
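For a concrete instance of (1–3), the sketch below (ours; the 2 × 4 standard-form system is hypothetical) computes the basic solution associated with a chosen basis, using an explicit 2 × 2 inverse so that no linear-algebra library is needed.

```python
def basic_solution(A, b, basis):
    """Given a 2-row standard-form system Ax = b and a basis (two column
    indices), return the basic solution x_B = A_B^{-1} b, x_N = 0."""
    # extract the 2x2 basis matrix A_B column by column
    AB = [[A[i][j] for j in basis] for i in range(2)]
    det = AB[0][0]*AB[1][1] - AB[0][1]*AB[1][0]
    assert det != 0, "basis matrix must be invertible"
    # apply the explicit 2x2 inverse to b
    xB = [( AB[1][1]*b[0] - AB[0][1]*b[1]) / det,
          (-AB[1][0]*b[0] + AB[0][0]*b[1]) / det]
    x = [0.0] * len(A[0])
    for idx, j in enumerate(basis):
        x[j] = xB[idx]
    return x

# Ax = b with A = [[1,1,1,0],[1,3,0,1]], b = [4,6]: the basis {x1, x2}
# yields the extreme point (3, 1, 0, 0), feasible since x_B >= 0.
A = [[1.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]]
x = basic_solution(A, [4.0, 6.0], [0, 1])
```

Choosing the slack basis `[2, 3]` instead gives the basic feasible solution (0, 0, 4, 6); the simplex method pivots between such bases.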
The simplex algorithm searches for an optimal solution of (LP ′) by creating a
sequence of bases B1,B2, . . . ,Bk such that (i) |Bi ∩ Bi+1| = |Bi| − 1 = |Bi+1| − 1
and (ii) the basic solutions corresponding to the Bi are feasible with nonincreasing
objective values. The operation of moving from one basis to the next is called pivoting.
Using an appropriate pivoting strategy such as Bland’s rule [28], the simplex
algorithm obtains an optimal solution to (LP ′) in a finite number of iterations. Over
the years, many different pivoting rules have been developed but none of them has been
shown to provide a polynomial-time algorithm to solve LPs. Nonetheless, the simplex
algorithm is typically very efficient at solving practical LP problems. In 1979, Khachiyan
[74] proposed the first polynomial time algorithm for LPs. This algorithm is a specialized
variant of the ellipsoid algorithm. Although the practical performance of this algorithm is
poor, it is remarkable that it does not depend directly on the number of constraints the
LP has. This feature has important consequences in the study of integer programs that
we will comment on in Section 2.2. Karmarkar [73] introduced the first algorithm for the
solution of LPs that has good performance in both theory and practice. Improvements
and variants of this algorithm were subsequently discovered; see Wright [133]. Nowadays,
commercial software such as CPLEX [40] uses a combination of the simplex algorithm and
interior point methods to solve LPs and can solve large instances of practical problems
very quickly; see Mittelmann [90]. As a result, LP solvers can be used as the workhorse
for the solution of other more difficult problems.
1.2.2 Relaxations and Convexifications
One of the ways to prove that z is the optimal value of (P ) is to show that z is both a
lower and an upper bound on the optimal value z∗. Upper bounds (also called primal
bounds) can be obtained from any feasible solution xF ∈ S since z∗ ≤ f(xF ). To
obtain tight upper bounds, we need to find good feasible solutions, which can be difficult
depending on the original problem (P ). Heuristic approaches are typically used for this
purpose. Finding lower bounds (also called dual bounds) requires other techniques. A
common approach is to use relaxations. We give a formal definition of relaxation next.
Definition 1.9 (Relaxation). Given an optimization problem
(P ) z∗ = min{f(x) | x ∈ S},
the related optimization problem
(RP ) zR = min{f̄(x) | x ∈ R}

is said to be a relaxation of (P ) if

1. S ⊆ R,

2. f(x) ≥ f̄(x) for all x ∈ S.
Definition 1.9 states that relaxations can be obtained in two ways: (i) by enlarging
the feasible region S and/or (ii) by underestimating the objective function f(x) over S.
Lower bounds can be obtained by solving relaxations, as the following result shows.
Proposition 1.2. If (RP ) is a relaxation of (P ), then zR ≤ z∗.
Although optimal solutions of relaxations are not always optimal for the original
problem, they sometimes are. The following result handles this issue.
Proposition 1.3. If x∗ is an optimal solution of (RP ), x∗ ∈ S, and f̄(x∗) = f(x∗), then x∗
is an optimal solution of (P ).
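A toy numerical illustration of Proposition 1.2 (our example, not the text's): dropping an integrality restriction enlarges the feasible region, so the relaxation value can only be lower than the true optimal value.

```python
# minimize f(x) = (x - 1.5)^2 over S = {0, 1, ..., 5}  (original problem)
# versus its relaxation over R = [0, 5]  (integrality dropped, S ⊆ R)
def f(x):
    return (x - 1.5)**2

z_star = min(f(x) for x in range(6))   # integer optimum 0.25, at x = 1 or x = 2
z_R = f(1.5)                           # relaxation optimum 0.0, at x = 1.5
assert z_R <= z_star                   # Proposition 1.2: z_R is a lower bound
```

Here the relaxation bound is strict, and its minimizer x = 1.5 is infeasible for the original problem, so Proposition 1.3 does not apply; this is exactly the situation that forces branching in branch-and-bound.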
The derivation of a relaxation is particularly useful if the problem associated with
the relaxation is substantially easier to solve than the original problem and the relaxation
value zR is close to z∗. Given that convex programs are typically easy to solve, it
makes sense to study how to construct convex relaxations of optimization problems. To
obtain the tightest possible relaxation bound, it is best to replace S by the smallest convex
set that contains S, called the convex hull of S. An alternate definition is as follows.
Definition 1.10 (Convex Hull). Let S ⊆ Rn. We refer to the set of all convex combina-
tions of points in S, which we denote by conv(S), as the convex hull of S.
When underestimating the objective function f(x), it is also clear that, to obtain
the tightest relaxation possible, we should replace f(x) with the tightest convex lower
approximation of f(x), which is commonly known as convex envelope; see Falk [49],
Rockafellar [102], Horst [66], and Horst and Tuy [69].
Definition 1.11 (Convex Envelope). Let S ⊆ Rn be convex and compact, and let
f : S → R be lower semi-continuous on S. A function convenv(f) : S → R is called the
convex envelope of f on S if it satisfies

1. convenv(f)(x) is convex on S,

2. convenv(f)(x) ≤ f(x), ∀x ∈ S,

3. there is no function g : S → R satisfying Conditions 1 and 2 with convenv(f)(x) < g(x)
for some x ∈ S.
Note that it is easily seen from Condition 3 that the convex envelope is uniquely
determined, if it exists.
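As a small worked example of our own: for a concave function on an interval, the convex envelope is the secant line through the endpoints, and the defining conditions can be checked numerically.

```python
# For the concave function f(x) = -x^2 on S = [0, 1], the convex envelope
# is the secant line through (0, f(0)) and (1, f(1)), i.e. convenv(f)(x) = -x.
def f(x):
    return -x * x

def convenv(x):       # secant line through the endpoints of [0, 1]
    return f(0.0) + (f(1.0) - f(0.0)) * x

pts = [i / 100.0 for i in range(101)]
# Condition 2: the envelope underestimates f everywhere on S
assert all(convenv(x) <= f(x) + 1e-12 for x in pts)
# the envelope touches f at the endpoints, where no tighter convex
# underestimator can do better
assert convenv(0.0) == f(0.0) and convenv(1.0) == f(1.0)
```

The same picture drives factorable relaxations: each concave intermediate term is replaced by a secant-type envelope, each convex term is kept as is.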
In theory, a very strong relaxation of the problem (P ) can be obtained by replacing
the feasible region S with conv(S) and the objective function f(x) with convenv(f(x)).
However, such a construction is typically not practical as deriving convex hulls of
nonconvex sets and convex envelopes of nonconvex functions is often difficult. Therefore,
simpler and weaker convex relaxations are typically derived. We call these relaxations
convexifications.
Definition 1.12 (Convexification). Given a nonconvex problem

(NCP )  z∗ = min f(x)  s.t. x ∈ S,

the problem

(CRP )  zR = min f̄(x)  s.t. x ∈ R,

is said to be a convexification of (NCP ) if

1. f̄(x) is a convex underestimator of f(x),

2. S ⊆ R and R is convex.
In Definition 1.12, we require a convexification to have both a convex objective
function and a convex feasible region. It therefore can be solved by a variety of algorithms.
Since LPs can be solved extremely efficiently, it is often helpful to require in Definition 1.12
that f̄(x) be linear and that R be polyhedral. If so, we refer to the resulting convexification
as a linearization. Since every convex set can be represented as an intersection of
(possibly infinitely many) half-spaces, a linearization can always be constructed from
a convexification. Linearizations have been typically preferred to convexifications in
commercial solvers (see Adjiman et al. [3], LINDO Systems Inc. [80], Sahinidis and
Tawarmalani [105], and Belotti et al. [26]) because they tend to be faster and algorithms
are more stable. An example of convexification for MILPs of the form
(MILP )  min cTx  s.t. Ax = b,  x ∈ Z+^|I| × R+^(n−|I|),

is the linear program

(RMILP )  min cTx  s.t. Ax = b,  x ∈ Rn+,
obtained by dropping the integrality restrictions on the variables xj for j ∈ I. This relaxation
is called the LP relaxation of (MILP ). We will describe general methods to generate
convexifications for MILPs and MINLPs in Chapter 2. For detailed discussions about
convexification techniques, we refer the interested reader to the book by Tawarmalani and
Sahinidis [121].
1.3 Branch-and-Cut in MINLP
For a nonconvex MINLP problem where f and/or gi are nonconvex, finding
globally optimal solutions is a challenging problem that has attracted much attention.
Branch-and-bound is one of the methods described in Section 1.1.2 that have been
proposed for solving this problem. Branch-and-bound methods are implicit enumeration
techniques based on the divide-and-conquer strategy and the concept of convexification.
A globally optimal solution of the convexification is first obtained. If it satisfies the
conditions of Proposition 1.3, it is optimal for the problem. Otherwise, the relaxed
solution only yields a lower bound on z∗. When this happens, the feasible region is divided
into non-overlapping subsets for which stronger convex relaxations can be built. An
optimal solution to the initial problem can then be obtained by selecting the best among
the globally optimal solutions of the subproblems. Since subproblems are likely to be
nonconvex problems, globally optimal solutions of these subproblems are obtained by
applying the procedure recursively. As a result, a tree of subproblems is created that is
called branch-and-bound tree. There are three cases when the branch-and-bound search of
a current node is stopped, an operation known as fathoming of a node:
1. the relaxation is infeasible,
2. the objective value of the current relaxation is larger than the value of a known
feasible solution,
3. the solution of the relaxation is globally optimal for subproblems; see Proposition 1.3.
The branch-and-bound process terminates when all nodes are fathomed (i.e., when the
lower bound zL is equal to the upper bound zU). In MILP, this process is finite (i.e.,
zL = zU occurs in a finite number of steps) and is convergent (i.e., zU − zL → 0) when
variables are bounded. In MINLP, for a given tolerance ε > 0, the search process typically
terminates when zU − zL ≤ ε. Provided that the convexification used in the tree is finitely
consistent, i.e., any unfathomed partition can be further refined at every iteration, the
branch-and-bound process can terminate after finitely many steps; see Horst and Tuy [69].
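The fathoming rules above can be made concrete with a minimal spatial branch-and-bound sketch for a univariate nonconvex problem. It uses an αBB-style convex underestimator f(x) − (α/2)(x − l)(u − x), where α is an assumed bound on |f''|; the test function, α value, and tolerances are illustrative choices, not taken from the thesis:

```python
import math

def f(x):                       # nonconvex objective to be minimized on [l, u]
    return math.sin(x) + math.sin(10.0 * x / 3.0)

ALPHA = 13.0                    # assumed upper bound on |f''| (here 1 + 100/9)

def underestimate(x, l, u):
    # alphaBB-style convex underestimator of f on [l, u]
    return f(x) - 0.5 * ALPHA * (x - l) * (u - x)

def min_convex(g, l, u, tol=1e-7):
    # ternary search: minimizes a univariate convex function on [l, u]
    while u - l > tol:
        m1, m2 = l + (u - l) / 3.0, u - (u - l) / 3.0
        if g(m1) < g(m2):
            u = m2
        else:
            l = m1
    return g(0.5 * (l + u))

def branch_and_bound(l, u, eps=1e-4):
    z_up = min(f(l), f(u))                  # incumbent (upper bound z_U)
    nodes = [(l, u)]
    while nodes:
        a, b = nodes.pop()
        z_up = min(z_up, f(0.5 * (a + b)))  # cheap feasible point
        z_lo = min_convex(lambda x: underestimate(x, a, b), a, b)
        if z_lo >= z_up - eps:              # fathom by bound
            continue
        m = 0.5 * (a + b)                   # bisection branching
        nodes += [(a, m), (m, b)]
    return z_up
```

On [2.7, 7.5] this returns the global minimum value of f to within the tolerance; partitions shrink until the underestimator gap falls below ε, matching the finite-consistency requirement above.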
Land and Doig [75] in 1960 introduced the first branch-and-bound algorithm for
pure integer linear programs. Dakin [43] and Driebeek [47] extended it to mixed-integer
linear programming problems. Since then, branch-and-bound has become a general
solution method in MILP that has been successfully implemented in commercial software
such as CPLEX [40]. In MILP problems, branch-and-bound proceeds by recursively
solving LP relaxations of the problem (see Section 1.2.1). Since LP relaxations can be
weak, new linear inequalities derived from the problem structure are typically added
to cut off fractional solutions. These additional valid inequalities are called cuts or
cutting planes. The use of cuts is known to be one of the most important ingredients in
the efficient solution of MILP with branch-and-bound. The addition of cuts inside the
branch-and-bound framework yields a family of methods called branch-and-cut ; see Martin
[84].
Falk and Soland [51] introduced nonlinear branch-and-bound for continuous
global optimization. For factorable nonconvex problems, McCormick [85] proposed
a convexification scheme for factorable problems under the assumption that tight
convex and concave envelopes are known for the underlying univariate functions.
Ryoo and Sahinidis [103] introduced a branch-and-reduce algorithm that uses domain
reduction techniques during the process. Androulakis et al. [5] developed the αBB
branch-and-bound method, which applies to twice-differentiable functions. Tawarmalani
and Sahinidis [122] introduced the idea of building and solving polyhedral-based
relaxations in branch-and-bound for global optimization and Tawarmalani and Sahinidis
[123] implemented this idea. Currently, nonlinear branch-and-bound methodologies have
been implemented in various global optimization software; see Adjiman et al. [3], Sahinidis
and Tawarmalani [105], LINDO Systems Inc. [80], and Belotti et al. [26].
Branch-and-cut is not a specific algorithm but a general framework since it relies
on four main components that can be adapted. These four components are: bounding
that obtains lower and upper bounds on the optimal value of relaxations, branching
that divides a problem into smaller subproblems, cutting that adds valid inequalities
to formulations, and domain reduction, also known as bound tightening, that reduces
the search region. A key component in the success of branch-and-cut algorithm is the
quality of bounds obtained from the relaxation. To obtain better bounds, it is necessary to
develop tighter convexifications. This is the ultimate goal of this thesis as we will discuss
more in Chapters 2 and 3. Next, we describe in more detail the branch-and-cut framework
to illustrate the setting in which our results are applied; see Figure 1-1. We discuss each of
its components in the following sections.
1.3.1 Bounding Scheme
In every branch-and-bound node, both lower and upper bounds on the optimal value
are computed and/or updated. Upper bounds are obtained from feasible solutions that are
found using some upper bounding procedures or heuristic algorithms. Lower bounds are
computed through the solution of a convexification of the problem.
1.3.2 Branching Scheme
For MILPs, dividing feasible regions into subproblems is simple. Assuming an LP
relaxation has been solved and the optimal solution x∗ is fractional, we can choose any
integer variable xi whose optimal value x∗i is fractional and then create two subproblems:
one obtained by adding the constraint xi ≤ bx∗i c and the other obtained by adding the
constraint xi ≥ bx∗i c+1. This scheme, called dichotomy branching, ensures that the current
LP solution does not survive in any of the subsequent convexifications of the subproblems
and therefore ensures that the branch-and-bound search progresses. We mention that
other branching schemes such as GUB branching or constraint branching can be used.
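The dichotomy rule can be sketched as follows, representing a subproblem only by its variable bounds (the data structure is an illustrative choice):

```python
import math

def dichotomy_branch(bounds, i, xi_star):
    # split on variable i at the fractional value xi_star: one child gets
    # x_i <= floor(xi_star), the other x_i >= floor(xi_star) + 1, so the
    # current fractional LP solution is infeasible in both children
    lo, hi = bounds[i]
    down, up = dict(bounds), dict(bounds)
    down[i] = (lo, math.floor(xi_star))
    up[i] = (math.floor(xi_star) + 1, hi)
    return down, up
```

For example, branching on x_0 = 3.5 with bounds [0, 10] yields children with bounds [0, 3] and [4, 10].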
Input: Problem (P) and set of integer variables I
Output: An optimal solution x* with optimal value z*

Initialization: L ← {P}, x* ← ∅, z* ← +∞;
while L ≠ ∅ do
    Check termination criteria and update the list L:
        if z_Pi ≥ z* for some Pi ∈ L then L ← L \ {Pi};
    Node Selection: select Pi ∈ L and let L ← L \ {Pi};
    Domain Reduction: tighten the bounds on the variables of Pi;
    Construct a convex relaxation RPi of Pi;
    while Cut Generation needs to be performed do
        Obtain z_RPi and x_RPi by solving RPi;
        Pruning: by infeasibility; by bounds if z_RPi ≥ z*; by global feasibility;
        Cut Generation: if there exists a violated cut then add it to the formulation;
    end
    Primal Heuristics;
    Branching: choose a variable x_j and a branching point x_j^b;
        create subproblems Pi− and Pi+; L ← L ∪ {Pi−, Pi+};
end
Figure 1-1. Branch-and-Cut framework
Observe that even when only applying dichotomy branching, algorithmic decisions must
be made about the selection of both the branching variable (i.e., which fractional variables
will be branched on) and the branching point; see Achterberg et al. [1], Linderoth and
Savelsbergh [79]. In MILP, while the latter is straightforward, the former is not and
different strategies might result in dramatically different trees.
Similar approaches can be used in MINLP. In the selection of branching variables,
integer variables typically take priority over continuous variables. Hence, if there are
integer variables with fractional values, then one of these variables is selected first for
branching. To select among several integer variables, standard MILP techniques are
used. Note that it could happen that x*_i is integral for all i ∈ I, but x* is not
feasible for the other relaxed constraints. Hence, a measure of infeasibility for solutions
is introduced in MINLP. To select a branching variable among continuous variables,
Tawarmalani and Sahinidis [122] propose to use violation transfer and Belotti et al. [26]
extend the reliability branching used in MILP. After the selection of branching variables,
the branching point can be chosen using several rules such as bisection rule, ω rule, or
other variants [103, 109, 116]. For bilinear programs, an alternative selection rule for the
branching point is provided in [116].
1.3.3 Cutting Scheme
Since the initial relaxation created at the root node is typically weak, it is important
to improve it by adding strong inequalities. In MILP, this can be done through the
addition of cutting planes that separate a fractional solution from the feasible region. We
will discuss strong valid inequalities for MILPs and will describe two well-known tools to
generate them in Section 2.2. Similarly, the performance of the branch-and-bound search
in MINLP can be improved if relaxations are tightened using strong inequalities. While
cuts must be linear inequalities in MILP, convex constraints can also be used in MINLP as
long as they are valid and improve bounds; see Tawarmalani and Sahinidis [123].
1.3.4 Domain Reduction
Domain reduction for a variable x is the process of reducing the interval [xl, xu]
where x is considered while guaranteeing that an optimal solution is not cut off. As the
search space is reduced through this procedure, relaxations obtained typically become
stronger. One such procedure is the optimality-based range deduction that uses the current
linearization to improve the bounds on variables; see Shectman and Sahinidis [109] and
Zamora and Grossmann [137]. It is typically used for auxiliary variables introduced in the
reformulation phase and applied only at the root node or up to a limited depth. On the
other hand, feasibility-based range deduction similar to interval propagation in Constraint
Programming is performed at all nodes of the tree; see Shectman and Sahinidis [109].
Domain reduction has also been widely used in MILP; see Savelsbergh [106]. Belotti et al.
[26] developed aggressive bounds tightening which is similar to probing techniques in MILP
[106, 120]. Reduced-cost bounds tightening, introduced for solving MILP problems [91], has
also been extended to MINLP by Ryoo and Sahinidis [103].
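Feasibility-based range reduction on a single constraint ∑_j a_j x_j ≤ b can be sketched as a minimal version of interval propagation (the helper is illustrative, not the implementation of any solver cited above):

```python
def tighten(a, b, lb, ub):
    # feasibility-based range reduction for one constraint sum_j a_j x_j <= b:
    # each term's smallest possible contribution is min(a_j*l_j, a_j*u_j);
    # the leftover budget bounds the remaining variable
    mins = [min(aj * lj, aj * uj) for aj, lj, uj in zip(a, lb, ub)]
    total = sum(mins)
    lb, ub = list(lb), list(ub)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        slack = b - (total - mins[i])   # budget left for the term a_i x_i
        if ai > 0:
            ub[i] = min(ub[i], slack / ai)
        else:
            lb[i] = max(lb[i], slack / ai)
    return lb, ub
```

For instance, propagating 2x + 3y ≤ 6 with x, y ∈ [0, 10] tightens the bounds to x ≤ 3 and y ≤ 2.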
1.4 Outline of the Dissertation
In this thesis, we introduce new tools to improve the convexifications used in MINLP.
In particular, we study nonlinear sets that appear as relaxations of MINLP problems. The
overall structure of the thesis is as follows.
In Chapter 2, we give an overview of techniques that are used in integer programming
and global optimization to produce convexifications of nonconvex sets. We focus on
factorable relaxation techniques since they are most related to our work. We also describe
how to generate strong cutting planes for general MILP problems using disjunctive
programming and lifting techniques in Sections 2.2.1 and 2.2.2.
In Chapter 3, we motivate the problems that are addressed in this thesis. Then, we
provide formal problem statements for the following chapters.
In Chapter 4, we propose a convexification tool that constructs the convex hulls of
orthogonal disjunctive sets using convex extensions and disjunctive programming; see
Chapter 2 for an introduction to these techniques. We discuss the technical assumptions
under which this convexification tool can be used. In particular, we provide sufficient
conditions for establishing the convex extension property. The convexification tool is
then applied to obtain explicit convex hulls of various bilinear covering sets over the
nonnegative orthant. It is, in general, widely applicable to problems where variables do
not have upper bounds.
In Chapter 5, we study 0−1 mixed-integer bilinear covering sets to investigate
how bounds on the variables affect the derivation of cuts. We derive large families of
facet-defining inequalities via sequence-independent lifting techniques; see Chapter 2 for
an introduction to lifting techniques. We show that these sets have polyhedral structures
that are similar to those of certain single-node flow sets. In particular, we prove that the
facet-defining inequalities we develop generalize well-known lifted flow cover inequalities
from the integer programming literature.
In Chapter 6, we present a computational study that evaluates the strength of
lifted inequalities derived in Chapter 5. We first generalize the lifted inequalities of
Chapter 5 to a more general form of bilinear covering sets that include linear terms on
variables. This extension is necessary to account for the linear terms introduced during the
branch-and-bound process. We discuss implementations details and experimental results.
In Chapter 7, we summarize the main results of this thesis and conclude with
directions for future research.
CHAPTER 2
CONVEX RELAXATIONS IN MILP AND MINLP
In this chapter, we describe methods to generate convex relaxations of MILPs and
MINLPs, focusing on the techniques that are most related to our work. In Section 2.1,
we describe how to build convex relaxations of nonconvex MINLP problems. Then, in
Section 2.2, we give an overview of how disjunctive programming and lifting techniques
can be used to generate improved formulations of MILPs. The tools described in
Sections 2.1 and 2.2 will be used in Chapters 4, 5, and 6.
2.1 Convexification Methods in MINLP
Constructing strong convex relaxations of nonconvex problems is a central problem in
developing branch-and-cut frameworks for nonconvex MINLPs. In this section, we describe
general convexification methods that are used in commercial global optimization solvers.
Note that, given a nonconvex problem of the form

    min  f(x)
    s.t. g_i(x) ≤ 0  ∀i ∈ M,

a simple convex relaxation can be obtained by relaxing each inequality into a convex
constraint and replacing f with a convex underestimator. In particular, if g_i^u(x) is a convex
underestimator of g_i(x) for each i ∈ M and f^u(x) is a convex underestimator of f(x), the
relaxation

    min  f^u(x)
    s.t. g_i^u(x) ≤ 0  ∀i ∈ M

is a convex optimization problem. Among convex underestimators, convex envelopes are the
strongest. Therefore, the ability to construct convex envelopes of nonlinear functions is
an essential ingredient in the derivation of strong convexifications of MINLPs.
2.1.1 Convex Envelopes and Convex Extensions
In the global optimization literature, convex envelopes have been developed for special
classes of functions over special polytopes. For detailed discussions, we refer the interested
reader to the books of Horst and Tuy [69] and Tawarmalani and Sahinidis [121].
First, we describe how convex envelopes of sums of functions can be obtained from
the sum of convex envelopes of the individual functions.
Theorem 2.1 (Al-Khayyal and Falk [4]). Let Q = ∏_{j=1}^r Q_j be the cartesian product of r
compact n_j-dimensional rectangles Q_j for j = 1, . . . , r satisfying ∑_{j=1}^r n_j = n. Assume that
f : Q → R is of the form f(x) = ∑_{j=1}^r f_j(x_j), where f_j : Q_j → R is lower semi-continuous
on Q_j for j = 1, . . . , r. Then, the convex envelope of f on Q is obtained as the sum of the
convex envelopes of f_j on Q_j, i.e.,

    convenv f(x) = ∑_{j=1}^r convenv f_j(x_j).
Next, we present two fundamental results developed by Falk and Hoffman [50] and
Horst [66].
Theorem 2.2. Let Q be a polytope with vertices v_1, . . . , v_k. Let f : Q → R be a concave
function on Q. Then, the convex envelope convenv(f) of f can be computed as

    convenv f(x) = min_λ { ∑_{j=1}^k λ_j f(v_j) : ∑_{j=1}^k λ_j v_j = x, ∑_{j=1}^k λ_j = 1,
                           λ_j ≥ 0, j = 1, . . . , k }.
The following result immediately follows.
Theorem 2.3. Let Q be an n-simplex generated by the vertices v_0, v_1, . . . , v_n, and let
f : Q → R be a concave function on Q. Then, the convex envelope of f is the affine
function

    φ(x) = a^T x + b,  where a ∈ R^n, b ∈ R,

that is uniquely determined by the system of linear equations

    f(v_i) = a^T v_i + b,  for i = 0, 1, . . . , n.
It follows from Theorem 2.3 that it is especially easy to construct the convex
envelopes of univariate concave functions f : R 7→ R over an interval [l, u]. This is
because the graph of the convex envelope is simply the line segment connecting the points
(l, f(l)) and (u, f(u)).
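For a univariate concave f on [l, u], this chord can be written down directly; the small helper below is added for illustration:

```python
def secant_envelope(f, l, u):
    # convex envelope of a concave function on [l, u]: the chord through
    # (l, f(l)) and (u, f(u)); it matches f at the endpoints and lies
    # below f everywhere in between
    fl, fu = f(l), f(u)
    slope = (fu - fl) / (u - l)
    return lambda x: fl + slope * (x - l)
```

For example, for the concave function f(x) = −(x − 1)^2 on [0, 3], the envelope is the line through (0, −1) and (3, −4), which underestimates f on the whole interval.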
Among the set of all multivariate functions, multilinear functions are of particular
importance as we will see in Section 2.1.2. Convex envelopes of multilinear functions were
studied by Crama [41] and Rikun [101]. We next give a formal definition of a multilinear
function.
Definition 2.1 (Multilinear). A function f(x_1, . . . , x_k) is said to be multilinear if, for each
i = 1, . . . , k, f(x_1, . . . , x_i, . . . , x_k) is a linear function of the vector x_i when the components
of the other k − 1 vectors are fixed, i.e., x_j = x̄_j for j ≠ i.
Rikun [101] studied multilinear functions f(x) of x = (x_1, . . . , x_k) defined on the
cartesian product of polytopes, where x ∈ Q = ∏_{j=1}^k Q_j and x_j ∈ Q_j ⊂ R^{n_j} for j = 1, . . . , k.

Definition 2.2 (Associated Affine Function). Let f(x) be a multilinear function defined
on ∏_{j=1}^k R^{n_j}. For the function f(x) and any given point ξ = (ξ_1, . . . , ξ_k), where ξ_j ∈ R^{n_j} for
j = 1, . . . , k, the associated affine function f_ξ(x) is defined as:

    f_ξ(x) = ∑_{j=1}^k f(ξ_1, . . . , ξ_{j−1}, x_j, ξ_{j+1}, . . . , ξ_k) − (k − 1) f(ξ).    (2–1)
Rikun [101] showed that the convex envelope of a multilinear function over the
cartesian product of polytopes is polyhedral.
Theorem 2.4 (Rikun [101]). Let f : Q → R be a multilinear function defined
on the cartesian product of polytopes Q = ∏_{j=1}^k Q_j, where x_j ∈ Q_j ⊂ R^{n_j} for j = 1, . . . , k.
Let ξ = (ξ_1, . . . , ξ_k) be a vertex of Q, i.e., ξ_j ∈ vert(Q_j), and let the associated affine function
(2–1) satisfy

    f_ξ(x) ≤ f(x),  ∀x ∈ vert(Q).

Then, the affine function f_ξ(x) is an element of the convex envelope of f(x).
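Definition 2.2 and the vertex condition of Theorem 2.4 are easy to check numerically for small examples. The sketch below (helper names are illustrative) builds f_ξ for a multilinear f with scalar blocks and verifies the vertex condition for the bilinear term x1·x2 on the unit box:

```python
from itertools import product

def associated_affine(f, xi):
    # f_xi(x) = sum_j f(xi_1,...,xi_{j-1}, x_j, xi_{j+1},...,xi_k) - (k-1) f(xi)
    k = len(xi)
    def f_xi(x):
        total = sum(f(*(xi[:j] + (x[j],) + xi[j + 1:])) for j in range(k))
        return total - (k - 1) * f(*xi)
    return f_xi

# bilinear term on [0,1]^2: at the vertex xi = (1, 1), f_xi(x) = x1 + x2 - 1,
# which underestimates x1*x2 on every vertex of the box
bilinear = lambda x1, x2: x1 * x2
f_xi = associated_affine(bilinear, (1.0, 1.0))
```

Since f_ξ here is dominated by f on vert([0, 1]^2), Theorem 2.4 says x1 + x2 − 1 is one piece of the (polyhedral) convex envelope of x1·x2.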
To facilitate the construction of convex envelopes of nonconvex functions, Tawarmalani
and Sahinidis [120] introduced the notion of convex extensions. This notion generalizes a
similar concept introduced by Crama [41].
Definition 2.3 (Convex Extensions). Let S be a convex set and X ⊆ S. A convex
extension of a function φ : X → R over S is defined as a convex function ψ : S → R such
that φ(x) = ψ(x) for all x ∈ X.
Note that convex extensions are neither always constructible nor unique. The
following result describes conditions under which a convex extension can be constructed.
Theorem 2.5 (Tawarmalani and Sahinidis [120]). A convex extension of a function
φ : X → R over a convex set S ⊇ X can be constructed if and only if

    φ(x) ≤ min { ∑_{j=1}^n λ_j φ(x^j) : ∑_{j=1}^n λ_j x^j = x, ∑_{j=1}^n λ_j = 1,
                 x^j ∈ X and λ_j ∈ [0, 1] for all j = 1, . . . , n }

for all x ∈ X.
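For a one-dimensional finite ground set X, the condition of Theorem 2.5 reduces to every point of the graph of φ lying on its lower convex hull, which the following illustrative sketch checks by comparing each point with the chord of its neighbors:

```python
def admits_convex_extension(points):
    # points: list of (x, phi(x)) pairs with distinct x-values; checks the
    # condition of Theorem 2.5 for a 1-D finite ground set: phi must not
    # exceed any convex combination of its other values, i.e. every point
    # must lie on the lower convex hull of the graph (slopes nondecreasing)
    pts = sorted(points)
    for i in range(1, len(pts) - 1):
        (x0, y0), (x1, y1), (x2, y2) = pts[i - 1], pts[i], pts[i + 1]
        lam = (x1 - x0) / (x2 - x0)
        if y1 > (1 - lam) * y0 + lam * y2 + 1e-12:
            return False
    return True
```

For example, the values of x^2 on {0, 1, 2, 3} admit a convex extension, while the values (0, 0), (1, 2), (2, 3) do not, since the middle point lies above the chord of its neighbors.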
Note that for complicated functions, finding convex envelopes might be difficult.
Next, we describe a general scheme that produces convex relaxations of factorable
functions.
2.1.2 Reformulation and Relaxation
Convexifications are often obtained in two steps: reformulation and relaxation. The
first step converts the original problem into an equivalent formulation that is easier to
study; the second step constructs a convex relaxation by relaxing nonconvex terms in the
reformulated problem.
First, we describe a general reformulation scheme for functions that are factorable; see
Definition 1.1. In fact, factorable functions can be reformulated by introducing auxiliary
variables using the recursive algorithms presented in Tawarmalani and Sahinidis [121].
To illustrate the idea, consider a factorable function f(x) given as the following sum of
products of univariate functions, i.e.,

    f(x) = ∑_{j=1}^2 ∏_{k=1}^2 h_jk(x).

In this case, we can reformulate f(x) by introducing auxiliary variables y_j to represent
each term of the summation and auxiliary variables y_jk to represent the factors of the
products respectively, i.e.,

    f(x) = y,
    y = ∑_{j=1}^2 y_j,                              (2–2)
    y_j = ∏_{k=1}^2 y_jk,   ∀j = 1, 2,              (2–3)
    y_jk = h_jk(x),         ∀j = 1, 2, ∀k = 1, 2.   (2–4)
Note that this reformulation lifts the original problem into a higher dimensional
space by introducing auxiliary variables. After the reformulation phase, we observe that
relaxation schemes are only needed for sums and products of two variables, appearing in
(2–2) and (2–3) respectively, as well as for univariate functions appearing in (2–4). For
all of the terms, convex relaxations can be constructed using factorable programming
techniques rooted in the work of McCormick [85].
Definition 2.4 (McCormick Relaxations [89]). The relaxations of a factorable function
that are formed via recursive application of rules for the relaxation of univariate composi-
tion, binary multiplication, and binary addition from convex and concave relaxations of the
univariate intrinsic functions, without the introduction of auxiliary variables, are said to be
McCormick Relaxations.
Since the sum of convex functions is convex, convex relaxations for the sum of two
functions can be easily constructed as follows.
Theorem 2.6 (Relaxation of Sums [89]). Let S ⊆ R^n be a nonempty convex set,
and let g, g_1, g_2 : S → R be such that g(x) = g_1(x) + g_2(x). Let g_1^u, g_1^o : S → R be a
convex underestimator and a concave overestimator of g_1 on S, respectively. Similarly, let
g_2^u, g_2^o : S → R be a convex underestimator and a concave overestimator of g_2 on S,
respectively. Then, g^u, g^o : S → R, defined as

    g^u(x) = g_1^u(x) + g_2^u(x),   g^o(x) = g_1^o(x) + g_2^o(x),

are a convex and a concave relaxation of g(x) on S, respectively.
However, relaxing products of two functions is not as straightforward, as
shown in the following result, which follows from the convex and concave
envelopes of a bilinear function developed by McCormick [85].

Theorem 2.7 (Relaxation of Products [89]). Let S ⊆ R^n be a nonempty convex set,
and let g, g_1, g_2 : S → R be such that g(x) = g_1(x) g_2(x). Let g_1^u, g_1^o : S → R be a convex
underestimator and a concave overestimator of g_1 on S, respectively. Similarly, let g_2^u, g_2^o :
S → R be a convex underestimator and a concave overestimator of g_2 on S, respectively.
Furthermore, let g_1^L, g_1^U, g_2^L, g_2^U ∈ R be such that

    g_1^L ≤ g_1(x) ≤ g_1^U  ∀x ∈ S   and   g_2^L ≤ g_2(x) ≤ g_2^U  ∀x ∈ S.

Consider the following intermediate functions α_1, α_2, β_1, β_2, γ_1, γ_2, δ_1, δ_2 : S → R:

    α_1(x) = min{g_2^L g_1^u(x), g_2^L g_1^o(x)},   α_2(x) = min{g_1^L g_2^u(x), g_1^L g_2^o(x)},
    β_1(x) = min{g_2^U g_1^u(x), g_2^U g_1^o(x)},   β_2(x) = min{g_1^U g_2^u(x), g_1^U g_2^o(x)},
    γ_1(x) = max{g_2^L g_1^u(x), g_2^L g_1^o(x)},   γ_2(x) = max{g_1^U g_2^u(x), g_1^U g_2^o(x)},
    δ_1(x) = max{g_2^U g_1^u(x), g_2^U g_1^o(x)},   δ_2(x) = max{g_1^L g_2^u(x), g_1^L g_2^o(x)}.

Then, α_1, α_2, β_1, and β_2 are convex on S, while γ_1, γ_2, δ_1, and δ_2 are concave on S.
Moreover, g^u, g^o : S → R, defined as

    g^u(x) = max{α_1(x) + α_2(x) − g_1^L g_2^L, β_1(x) + β_2(x) − g_1^U g_2^U},
    g^o(x) = min{γ_1(x) + γ_2(x) − g_1^U g_2^L, δ_1(x) + δ_2(x) − g_1^L g_2^U},

are convex and concave relaxations of g on S, respectively.
Al-Khayyal and Falk [4] proved that the McCormick relaxation constructs the convex and
concave envelopes of bilinear terms, as follows.

Theorem 2.8 (Al-Khayyal and Falk [4]). Consider a bilinear term y_i y_j over the hypercube
H_2 := [y_i^l, y_i^u] × [y_j^l, y_j^u]. Then,

    convenv(y_i y_j) = max{y_i^l y_j + y_j^l y_i − y_i^l y_j^l, y_i^u y_j + y_j^u y_i − y_i^u y_j^u}

and

    concenv(y_i y_j) = min{y_i^u y_j + y_j^l y_i − y_i^u y_j^l, y_i^l y_j + y_j^u y_i − y_i^l y_j^u}.
McCormick [86] showed that a tight relaxation of a composition of functions
h(g(x)) can be built using convex and concave envelopes as the underestimators and
overestimators of h(yg). Relaxation methods for multilinear functions over a hypercube
have been proposed by Rikun [101] and Ryoo and Sahinidis [104]. Different relaxation
schemes for the fractional functions are developed by Tawarmalani and Sahinidis [119]
and Tawarmalani et al. [114, 115]. For detailed specification of recursive reformulation
algorithms, we refer the interested reader to the book of Tawarmalani and Sahinidis [121].
Assuming that all variables are bounded, a univariate convex function f(x_j) with
x_j ∈ [x_j^l, x_j^u] is overestimated by the line connecting the points (x_j^l, f(x_j^l)) and
(x_j^u, f(x_j^u)), while f(x_j) is underestimated by the function itself. Hence, a convex
outer-approximation of any convex function can be constructed by combining these estimators.

If a univariate function f(x_j) is convex and differentiable over x_j ∈ [x_j^l, x_j^u], then for
any x̄ ∈ [x_j^l, x_j^u], a valid linear inequality can be obtained using the gradient. For a given
gradient ∂f/∂x_j(x̄) of f(x_j) at x̄, the gradient inequality

    y ≥ f(x̄) + ∂f/∂x_j(x̄) (x_j − x̄),    (2–5)

is valid for all x_j ∈ [x_j^l, x_j^u]. Therefore, we can build linear relaxations using outer-
approximations of differentiable univariate functions such as exp(x), log(x), sin(x), and
cos(x).
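A polyhedral underestimator built from gradient inequalities (2–5) at a few points can be sketched as follows for exp on [0, 2] (the tangent points are illustrative choices):

```python
import math

def gradient_cut(f, fprime, xbar):
    # tangent inequality (2-5): y >= f(xbar) + f'(xbar) * (x - xbar),
    # valid for a convex differentiable f
    return lambda x: f(xbar) + fprime(xbar) * (x - xbar)

# outer-approximate exp on [0, 2] with tangents at three points
# (exp is its own derivative)
cuts = [gradient_cut(math.exp, math.exp, t) for t in (0.0, 1.0, 2.0)]

def poly_under(x):
    # pointwise maximum of the tangent cuts: a piecewise-linear underestimator
    return max(cut(x) for cut in cuts)
```

The maximum of the cuts lies below exp on the whole interval and touches it at each tangent point; adding more tangent points tightens the linear relaxation.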
2.2 Cutting Plane Techniques for Mixed-Integer Linear Program (MILP)
For MILPs, we mentioned in Section 1.2.2 that LP relaxations are often used as
convexifications. In this section, we discuss techniques to improve LP relaxations of
MILPs. We consider mixed-integer linear programs of the form
(MILP )min cTx
s.t. x ∈ S
where I ⊆ {1, . . . , n} and
S :={x ∈ Z|I|
+ × Rn−|I|+
∣∣∣ Ax ≤ b}.
We first present a basic result about the convex hull of S.
Theorem 2.9 (Meyer [87]). The convex hull of S, where A ∈ Qm×n and b ∈ Qm, is a
polyhedron whose extreme points lie in S.
This result together with Theorem 1.2 implies that every MILP problem can be
reformulated as a linear program, provided that A and b are rational. This is particularly
interesting since LPs can be solved efficiently as we mentioned in Section 1.2.1. While the
linear program
    min{c^T x | x ∈ conv(S)}
always has an optimal solution that is optimal for (MILP ), it is typically difficult to
obtain a full linear description of conv(S). Nevertheless, we are interested in finding
partial descriptions of conv(S). Studying the polyhedron conv(S) requires a good
understanding of which inequalities a^i x ≤ b_i are most important in the description of
conv(S). This motivates the introduction of the following definitions.
Definition 2.5 (Valid Inequality). Let X ⊆ R^n. The inequality α^T x ≤ δ is said to be valid
for X if it is satisfied by all points of X, i.e., α^T x ≤ δ ∀x ∈ X.

Definition 2.6 (Face). If α^T x ≤ δ is a valid inequality for a polyhedron Q, then

    F = Q ∩ {x ∈ R^n | α^T x = δ}

is said to be a face of Q. We also say that α^T x ≤ δ represents or defines the face F.
In order for an inequality to be helpful in the description of a polyhedron, the face
it defines should be large. To measure the dimension of a polyhedron, we introduce the
following definitions.
Definition 2.7 (Affine Independence). Vectors x^1, . . . , x^k in R^n are said to be affinely
independent if the unique solution to the system ∑_{j=1}^k λ_j x^j = 0, ∑_{j=1}^k λ_j = 0 is λ_j = 0 for
all j = 1, . . . , k.
Definition 2.8 (Dimension). A polyhedron Q has dimension d, which we denote by
dim(Q) = d, if the maximum number of affinely independent points in Q is d+ 1.
Definition 2.9 (Facet). A face F of a polyhedron Q is said to be a facet of Q if dim(F ) =
dim(Q) − 1. A valid inequality α^T x ≤ δ that induces a facet of Q is called a facet-defining
inequality for Q, or a facet for short.
We mention that among all inequalities in the description of a full-dimensional
polyhedron, only those that define facets are necessary. We refer the interested reader to
Nemhauser and Wolsey [91] for a detailed exposition.
Proposition 2.1. Let Q be a full-dimensional polyhedron defined by (a^i)^T x ≤ b_i for
i ∈ M. Let M_F be the subset of M containing the indices of the facet-defining inequalities for
Q. Then,

    Q = {x ∈ R^n | (a^i)^T x ≤ b_i ∀i ∈ M_F}.
Therefore, when studying conv(S), it is sufficient to consider inequalities that are
facet-defining.
We will describe in Sections 2.2.1 and 2.2.2 techniques to construct valid and
facet-defining inequalities for MILPs. We note that, in practice, the question is not
only how to generate inequalities but also how to use them. In fact, the linear description
of conv(S) can have exponentially many inequalities. It is therefore typically impractical
to solve the corresponding linear programs. In order to overcome this difficulty, cutting
plane methods are typically used. The first cutting plane algorithm for solving MILPs was
described in 1958 by Gomory [57] for the case where |I| = n. This algorithm generalized
the more dedicated polyhedral approach devised by Dantzig et al. [45] for the Traveling
Salesman Problem.
In cutting plane algorithms, we solve a sequence of linear programs that differ from
each other by the addition of one or more valid inequalities. More precisely, we first solve
the LP relaxation of (MILP ) to global optimality. The corresponding optimal solution x0
is typically fractional since the LP relaxation does not impose integrality on the variables.
We obtain a tightened formulation by adding inequalities to the LP relaxation.
Definition 2.10 (Cutting Plane). An inequality α^T x ≤ δ, where α ∈ R^n and δ ∈ R, is
said to be a cutting plane for (MILP), or cut for short, if it is valid for S and there exists a
solution x^0 of the LP relaxation such that α^T x^0 > δ.
It is clear that, for a cut αTx ≤ δ to improve the current LP relaxation of an MILP,
it must cut x0 off, i.e., αTx0 > δ. Given a fractional solution x0 of the LP relaxation,
the problem of finding such a violated cut is known as the separation problem. It is typically
difficult to solve separation problems exactly since separation was shown to be as hard as
optimization; see Grötschel et al. [59]. Note that the proof relies on the ellipsoid algorithm
described earlier in Section 1.2.1. As a result, heuristics are often used for separation. If a
cut is found, it is added and the process is iterated. Otherwise, the process is terminated.
The basic structure of cutting plane algorithms is described in Figure 2-1. For detailed
textbook descriptions, we refer the interested readers to Nemhauser and Wolsey [91] and
Schrijver [108].
Input: Problem (MILP)
Output: An optimal solution x*

Initialization: i = 0, Q_0 ← LP relaxation of (MILP);
Obtain x^0 by solving the LP min{c^T x | x ∈ Q_0};
while x^i is fractional do
    Separation;
    if there exists a cutting plane α_i^T x ≤ δ_i separating x^i from S then
        Q_{i+1} ← Q_i ∩ {x | α_i^T x ≤ δ_i};
        i ← i + 1;
    else
        terminate
    end
    Obtain x^i by solving the linear program min{c^T x | x ∈ Q_i};
end
x* ← x^i;

Figure 2-1. Cutting plane algorithm
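As a concrete example of a cut family for the pure-integer case (|I| = n) treated by Gomory [57], Chvátal-Gomory rounding takes a nonnegative combination of valid inequalities for x ≥ 0 integer and rounds down both the coefficients and the right-hand side. The sketch below (the data representation is an illustrative choice) derives x_1 + x_2 ≤ 1 from 2x_1 + 2x_2 ≤ 3 with multiplier 1/2, a cut violated by the fractional point (3/4, 3/4):

```python
import math
from fractions import Fraction

def cg_cut(rows, multipliers):
    # rows: list of (a, b) pairs encoding valid inequalities a^T x <= b over
    # nonnegative integer x; multipliers: u >= 0. Returns the rounded
    # inequality floor(u^T A) x <= floor(u^T b), valid for all integer points
    # since integer x >= 0 keeps floor(u^T A) x <= u^T A x <= u^T b.
    n = len(rows[0][0])
    a = [sum(Fraction(u) * Fraction(row[0][j])
             for u, row in zip(multipliers, rows)) for j in range(n)]
    b = sum(Fraction(u) * Fraction(row[1]) for u, row in zip(multipliers, rows))
    return [math.floor(aj) for aj in a], math.floor(b)
```

Exact rational arithmetic avoids the rounding pitfalls that floating-point coefficients would introduce when taking floors.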
Although the algorithm of Figure 2-1 can terminate without finding an integer
optimal solution for (MILP ), the formulation Qi obtained after the addition of cuts
provides a strengthened formulation for which branch-and-bound is likely to be more
efficient. In practice, there are many tradeoffs to consider between the running time of a
separation procedure and the quality of the cutting planes it produces when designing a
cutting plane algorithm.
2.2.1 Disjunctive Programming
In this section, we give an overview of disjunctive programming techniques and how
they can be used to generate strong cuts for MILPs. Disjunctive programming can be
succinctly described as the study of optimization problems defined over unions of sets,
typically polyhedra. Even when the sets are convex, their union is typically not. One of
the main focuses of disjunctive programming is to study the convex hull of such unions.
The foundations of disjunctive programming were laid by Balas in a technical report in
1974. This report was published 24 years later in Balas [17].
Disjunctive programming is directly applicable to MILPs since fixing integer variables
to all values they can take transforms these MILPs into disjunctive programs. As a
result, disjunctive programming techniques have been used to derive strong relaxations
and cutting planes for various problems; see Balas [15, 16]. In particular, Balas et al.
[20] implemented disjunctive programming techniques for mixed 0−1 programs in a
branch-and-cut framework. They specialize generic disjunctive programming techniques to
show how to generate lift-and-project cuts through the solution of a cut generation linear
program (CGLP), and develop strengthened disjunctive cuts.
Stubbs and Mehrotra [113] generalized the disjunctive programming techniques of
Balas et al. [20] to 0−1 mixed convex programming problems inside of a branch-and-cut
framework. Ceria and Soares [31] also provided algebraic representations and solution
procedures for disjunctive convex programming.
Next, we describe some important results in disjunctive programming. We limit our
presentation to unions of polyhedra. We first review the basic concept of projection that
will be used to relate convex hulls of sets in the space of their original variables and their
higher dimensional representations obtained by disjunctive programming. We refer to
Balas [18] and Cornuejols [37] for more detailed discussions.
Definition 2.11 (Projection). Given a polyhedron Q ⊆ R^n × R^r, the projection of Q onto
the subspace of R^n defined by the x variables is defined as

    proj_x Q = {x ∈ R^n | (x, y) ∈ Q for some y ∈ R^r}.
The projection of a polyhedron Q can be obtained using Fourier-Motzkin Elimination;
see Fourier [54]. This method recursively eliminates the variables yi one at a time, as
presented in the following proposition.
Proposition 2.2 (Fourier-Motzkin Elimination). Given a polyhedron

    Q = {(x, y) ∈ R^n × R | ∑_{j=1}^n a_ij x_j + b_i y ≤ d_i, ∀i = 1, . . . , m},

the projection of Q onto x satisfies

    proj_x Q = {x ∈ R^n | (d_k − ∑_{j=1}^n a_kj x_j)/b_k ≤ (d_l − ∑_{j=1}^n a_lj x_j)/b_l,
                          ∀k ∈ M−, ∀l ∈ M+,
                          ∑_{j=1}^n a_ij x_j ≤ d_i, ∀i ∈ M_0},

where M+ = {i ∈ M | b_i > 0}, M− = {i ∈ M | b_i < 0}, and M_0 = {i ∈ M | b_i = 0}.

The projection can also be obtained using the concept of a projection cone, as described
below.
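Proposition 2.2 is mechanical enough to code directly: every lower bound on y (a row with b_i < 0) is paired with every upper bound (b_i > 0), and cross-multiplying avoids divisions. The sketch below is our own illustration, not part of the dissertation; it performs one elimination step with exact rational arithmetic:

```python
from fractions import Fraction

def fm_eliminate(rows):
    """One step of Fourier-Motzkin elimination.
    Each row (a, b, d) encodes sum_j a[j]*x[j] + b*y <= d.
    Returns rows (a', d') describing the projection onto x."""
    pos = [r for r in rows if r[1] > 0]             # upper bounds on y (l in M+)
    neg = [r for r in rows if r[1] < 0]             # lower bounds on y (k in M-)
    out = [(a, d) for (a, b, d) in rows if b == 0]  # rows in M0 survive as-is
    for (ak, bk, dk) in neg:
        for (al, bl, dl) in pos:
            # b_l*(a_k.x) - b_k*(a_l.x) <= b_l*d_k - b_k*d_l  (y cancels)
            a = tuple(bl * ak[j] - bk * al[j] for j in range(len(ak)))
            out.append((a, bl * dk - bk * dl))
    return out

# project Q = {(x, y) : -x + y <= 0, -y <= 0, x + y <= 4} onto x
rows = [((Fraction(-1),), Fraction(1), Fraction(0)),
        ((Fraction(0),), Fraction(-1), Fraction(0)),
        ((Fraction(1),), Fraction(1), Fraction(4))]
proj = fm_eliminate(rows)
print(proj)  # encodes -x <= 0 and x <= 4, i.e. 0 <= x <= 4
```

Repeating the step once per y_i projects out several variables; in practice redundant rows should be pruned, since the row count can grow quadratically at each step.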
Proposition 2.3 (Cornuejols [37]). Let

Q = {(x, y) ∈ R^n × R^r | Ax + By ≤ d},

where the system has m constraints. Then,

proj_x Q = {x ∈ R^n | (u^T A)x ≤ u^T d, ∀u ∈ E},

where E is the set of extreme rays of the projection cone

C := {u ∈ R^m | u^T B = 0, u ≥ 0}.
Definition 2.12 (Disjunctive sets). Given polyhedra Q_i = {x ∈ R^n | A^i x ≥ b^i} for i ∈ M,
we define the disjunctive set ∪_{i∈M} Q_i as

Q = {x ∈ R^n | ∨_{i∈M} (A^i x ≥ b^i)}. (2–6)
Expression (2–6) is known as the disjunctive normal form of the disjunctive program.
Using operations described in Balas [17], the disjunctive set Q can also be expressed as

Q = {x ∈ R^n | Ax ≥ b, ∨_{h∈M_j} (d^h x ≥ d_h0), ∀j = 1, . . . , t}, (2–7)

which is called the conjunctive normal form.
Balas [17] describes how to obtain the convex hull of a disjunctive set. We present
this result in the following theorem.

Theorem 2.10 (Balas [17]). Given polyhedra Q_i = {x ∈ R^n | A^i x ≥ b^i} ≠ ∅, ∀i ∈ M,
define

Q := {(x, (y^i, y^i_0)_{i∈M}) | x − Σ_{i∈M} y^i = 0; A^i y^i − b^i y^i_0 ≥ 0 and y^i_0 ≥ 0, ∀i ∈ M; Σ_{i∈M} y^i_0 = 1},

where (y^i, y^i_0) ∈ R^{n+1} for i ∈ M. Then,

Q_M := cl conv(∪_{i∈M} Q_i) = proj_x Q.

Further,

1. if x* is an extreme point of Q_M, then (x, (y^i, y^i_0)_{i∈M}) is an extreme point of Q, where x = x*, (y^k, y^k_0) = (x*, 1) for some k ∈ M, and (y^i, y^i_0) = (0, 0) for all i ∈ M \ {k};

2. if (x, (y^i, y^i_0)_{i∈M}) is an extreme point of Q, then y^k = x = x* and y^k_0 = 1 for some k ∈ M, and x* is an extreme point of Q_M.
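The lifting in Theorem 2.10 can be exercised numerically: a convex combination x = λx¹ + (1−λ)x² of points x¹ ∈ Q1 and x² ∈ Q2 lifts to a point of Q by setting y¹ = λx¹, y¹_0 = λ, y² = (1−λ)x², y²_0 = 1−λ. A sketch with two small polytopes of our own choosing (not from the text):

```python
import numpy as np

# Q1 = {x : x >= 0, x1 + x2 <= 1}, Q2 = {x : 2 <= x1 <= 3, 0 <= x2 <= 1},
# both written as A^i x >= b^i, as in Theorem 2.10.
A1 = np.array([[1., 0.], [0., 1.], [-1., -1.]]);           b1 = np.array([0., 0., -1.])
A2 = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]]); b2 = np.array([2., 0., -3., -1.])
V1 = np.array([[0., 0.], [1., 0.], [0., 1.]])              # vertices of Q1
V2 = np.array([[2., 0.], [3., 0.], [2., 1.], [3., 1.]])    # vertices of Q2

rng = np.random.default_rng(0)
for _ in range(100):
    x1 = rng.dirichlet(np.ones(3)) @ V1       # random point of Q1
    x2 = rng.dirichlet(np.ones(4)) @ V2       # random point of Q2
    lam = rng.uniform()
    y1, y10, y2, y20 = lam * x1, lam, (1 - lam) * x2, 1 - lam
    x = y1 + y2                               # x = lam*x1 + (1-lam)*x2
    assert np.all(A1 @ y1 - b1 * y10 >= -1e-9)   # A^1 y^1 - b^1 y^1_0 >= 0
    assert np.all(A2 @ y2 - b2 * y20 >= -1e-9)   # A^2 y^2 - b^2 y^2_0 >= 0
    assert abs(y10 + y20 - 1.0) < 1e-12          # y^1_0 + y^2_0 = 1
print("every convex combination lifts to a feasible point of Q")
```

The key point, visible in the construction, is that the nonlinear products λx¹ and (1−λ)x² become the new variables y^i, which is what linearizes the convex-combination description.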
Theorem 2.10 gives a description of the convex hull of ∪_{i∈M} Q_i in a higher dimensional
space. In order to obtain the convex hull Q_M in the original space of variables of the
Q_i's, we must project Q onto the x space. Theorem 2.11 describes how this projection is
obtained. This result follows from Proposition 2.3.
Theorem 2.11 (Balas [17]). proj_x(Q) = {x ∈ R^n | αx ≥ β, ∀(α, β) ∈ W_0}, where

W_0 = {(α, β) ∈ R^{n+1} | α = u^i A^i and β ≤ u^i b^i for some u^i ≥ 0, ∀i ∈ M}.
The higher dimensional representation also allows the derivation of facets of QM as
described in the following theorem.
Theorem 2.12 (Balas [17]). Assume that QM is full-dimensional. The inequality αx ≥ β
defines a facet of QM if and only if (α, β) is an extreme ray of the cone W0.
Given a point x̄ ∉ Q_M, it is often necessary to derive a disjunctive cut αx ≥ β valid
for Q_M that cuts off x̄. This problem is equivalent to choosing coefficients (α, β, u) in W_0
that minimize αx̄ − β. This gives rise to Problem (2–8), commonly known as the cut
generating LP. Note that in (2–8) we added the normalization constraint Σ_{i∈M} e^T u^i = 1
to make the problem bounded:

min  αx̄ − β
s.t. α = u^i A^i,  ∀i ∈ M,
     β ≤ u^i b^i,  ∀i ∈ M,          (2–8)
     u^i ≥ 0,  ∀i ∈ M,
     Σ_{i∈M} e^T u^i = 1.
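The CGLP (2–8) is an ordinary LP and can be solved with any LP solver. The sketch below is our own toy instance (the data and variable layout are assumptions, not from the dissertation): the disjunction x1 ≤ 0 ∨ x1 ≥ 1 over the triangle P = {x ∈ R²: x2 ≤ 2x1, x2 ≤ 2(1 − x1), x2 ≥ 0}, separating the fractional point x̄ = (0.5, 1):

```python
import numpy as np
from scipy.optimize import linprog

# P = {x in R^2 : x2 <= 2*x1, x2 <= 2*(1 - x1), x2 >= 0}, in A x >= b form.
A = np.array([[2., -1.], [-2., -1.], [0., 1.]]); b = np.array([0., -2., 0.])
A1 = np.vstack([A, [-1., 0.]]); b1 = np.append(b, 0.)   # side 1 adds -x1 >= 0
A2 = np.vstack([A, [ 1., 0.]]); b2 = np.append(b, 1.)   # side 2 adds  x1 >= 1
xbar = np.array([0.5, 1.0])          # fractional point to separate
m = 4                                # rows per side
# CGLP variables: alpha (2), beta (1), u1 (4), u2 (4)
cost = np.concatenate([xbar, [-1.], np.zeros(2 * m)])   # min alpha.xbar - beta
A_eq = np.zeros((5, 11)); b_eq = np.array([0., 0., 0., 0., 1.])
A_eq[0:2, 0:2] = np.eye(2); A_eq[0:2, 3:7]  = -A1.T     # alpha = u1 A1
A_eq[2:4, 0:2] = np.eye(2); A_eq[2:4, 7:11] = -A2.T     # alpha = u2 A2
A_eq[4, 3:] = 1.                                        # normalization sum(u) = 1
A_ub = np.zeros((2, 11)); A_ub[:, 2] = 1.               # beta <= u_i b_i
A_ub[0, 3:7] = -b1; A_ub[1, 7:11] = -b2
res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2), A_eq=A_eq, b_eq=b_eq,
              bounds=[(None, None)] * 3 + [(0, None)] * (2 * m))
alpha, beta = res.x[:2], res.x[2]
print(alpha @ xbar - beta)   # negative: the cut alpha.x >= beta cuts off xbar
```

By construction the returned cut is valid for both sides of the disjunction, hence for their convex hull, while being violated at x̄.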
A disjunctive set is called facial if every inequality in (2–7) defines a face of Q, the
polyhedron defined by the constraints Ax ≥ b. An interesting feature of facial disjunctive
programs is that they can be sequentially convexified as described next.
Theorem 2.13 (Balas [17]). Let

D := {x ∈ R^n | Ax ≥ b, ∨_{h∈M_j} (d^h x ≥ d_h0), j = 1, . . . , t},

where |M_j| ≥ 1 for j = 1, . . . , t and D is facial. Define

Q_0 (= Q) := {x ∈ R^n | Ax ≥ b},

and, for j = 1, . . . , t,

Q_j := conv(Q_{j−1} ∩ {x | ∨_{h∈M_j} (d^h x ≥ d_h0)}). (2–9)

Then, Q_t = cl conv(D).
Theorem 2.13 shows that, in some cases, it is sufficient to consider the disjunctions
sequentially rather than simultaneously to obtain the convex hulls.
To illustrate that disjunctive programming techniques can be helpful in creating
good convexifications in integer programming, we describe their application to 0−1 integer
programming. A thorough description, including relations to Lovasz and Schrijver [82]
and Sherali and Adams [110], is given in Balas et al. [20]. This variant of disjunctive
programming is commonly referred to as lift-and-project; see Balas et al. [20]. For each
variable x_j for j = 1, . . . , n, the current formulation is lifted into a higher dimensional
space where it is tightened. Then, this strengthened formulation is projected back onto
the original space R^{n+r}, thus defining an improved formulation for S. After the last
variable is considered, the convex hull is obtained.
More precisely, consider the problem

(BMILP)  min c^T x
         s.t. Ax ≥ b,
              x_j ∈ {0, 1}, ∀j = 1, . . . , n,
              x_j ∈ R_+, ∀j = n + 1, . . . , n + r,

where integer variables can only take the values 0 or 1. We define

Q := {x ∈ R^{n+r}_+ | Ax ≥ b}

and denote the set of feasible solutions of (BMILP) by

S := {x ∈ {0, 1}^n × R^r_+ | Ax ≥ b}.
We assume that the system Ax ≥ b includes the constraints −x_j ≥ −1 for j = 1, . . . , n,
but does not include the constraints x_j ≥ 0 for j = 1, . . . , n. Clearly, the set S can be
reformulated as

S := {x ∈ R^{n+r}_+ | Ax ≥ b, (−x_j ≥ 0) ∨ (x_j ≥ 1), ∀j = 1, . . . , n},
which shows its relation to disjunctive programming. Since this problem is facial, its
convex hull can be obtained using Theorem 2.13. In particular, in this case, the jth step
(2–9) can be obtained as

Q_j = proj_x {(x, x^0, x^1, y^0, y^1) ∈ R^{3n}_+ × R^2_+ |
      A^{j−1} x^0 ≥ b^{j−1} y^0,
      −x^0_j ≥ 0,
      A^{j−1} x^1 ≥ b^{j−1} y^1,
      x^1_j ≥ y^1,
      x^0 + x^1 = x,
      y^0 + y^1 = 1,
      x, x^0, x^1, y^0, y^1 ≥ 0},
where Q_{j−1} = {x | A^{j−1} x ≥ b^{j−1}}. Denote the jth unit vector by e_j. Using the projection
cone approach described in Proposition 2.3, we obtain that Q_j is defined by the inequalities
αx ≥ β, where (α, β, u, u_0, v, v_0) are feasible solutions to

α − uA^{j−1} + u_0 e_j ≥ 0,
α − vA^{j−1} − v_0 e_j ≥ 0,
β − ub^{j−1} ≤ 0,                  (2–10)
β − vb^{j−1} − v_0 ≤ 0,
u, u_0, v, v_0 ≥ 0,

which is an expression of Theorem 2.11. The inequality αx ≥ β is called a lift-and-project
inequality. Note that lift-and-project inequalities are a special type of split inequality [36],
derived from the split disjunction x_j ≤ 0 or x_j ≥ 1. For details, see Balas et al. [20, 21].
2.2.2 Lifting
In this section, we describe a technique known as lifting and review how it has been
used to generate strong valid inequalities for MILPs. Deriving facet-defining inequalities
for the convex hull of feasible solutions to a MILP with many variables is typically
difficult. However, when a subset of variables is fixed to some values such as their lower
or upper bounds, it might be easier to derive a strong valid inequality. We refer to a
nontrivial inequality valid for a restricted set as a seed inequality. Lifting is the process
of constructing progressively, from a seed inequality valid for a lower dimensional set, an
inequality valid for a higher dimensional set. Gomory [58] first introduced the concept of
lifting in the context of the group problem. The technique was refined by Padberg [95] and
Wolsey [130]; see also Balas [14], Hammer et al. [63], Padberg [96], Wolsey [131], Zemel
[138], and Balas and Zemel [23].
Lifting is generally performed sequentially. Crowder et al. [42] and Gu et al. [60]
successfully used sequential lifting in a branch-and-cut framework for solving 0−1 integer
programs with cover inequalities. For 0−1 integer programs, Wolsey [132] proved that, if
the lifting function is superadditive, lifting coefficients are independent of the lifting order;
see Section 2.2.2.2. Gu et al. [62] applied sequence-independent lifting to mixed-integer
programs. Marchand and Wolsey [83] also used superadditive lifting for 0−1 knapsack
problems with a single continuous variable and Richard et al. [98] developed a general
lifting theory for continuous variables. Recently, lifting has also been used to obtain
inequalities for special-purpose global optimization problems; see de Farias et al. [46],
Vandenbussche and Nemhauser [128], and Atamturk and Narayanan [10]. A general
lifting theory for nonlinear programming is described in Richard and Tawarmalani [100].
However, the application of lifting techniques in MINLPs remains limited.
2.2.2.1 Sequential lifting
Although lifting can be used for general MILPs and for nonlinear programs, we
describe it only for the 0−1 knapsack polytope

K = {x ∈ {0, 1}^n | Σ_{j∈N} a_j x_j ≤ d},

where |N| = n, since the ideas extend to more general settings. Let N′ ⊆ N and
v ∈ {0, 1}^n. To represent the restricted set where some of the variables x_j are fixed to 0 or
1, we define

K(N′, v) = {x ∈ K | x_j = v_j, ∀j ∈ N′}.
By selecting N ′ to be a larger and larger subset of N , we can change conv(K(N ′,v)) into
a polyhedron whose dimension is as small as we want. We note that one might think of
fixing the variables to some values between lower and upper bounds. In this case, however,
Atamturk [8] and Richard et al. [98] show that it is typically not possible to perform
lifting.
Often, it is easy to find a facet-defining inequality for low-dimensional polyhedra.
Assume therefore that

Σ_{j∈N\N′} α_j x_j ≤ δ   (2–11)

is a valid inequality for K(N′, v). Assume without loss of generality that N′ = {1, . . . , p},
where p ≤ n. Taking (2–11) as the seed inequality, we convert (2–11) into an inequality
globally valid for conv(K) by lifting the variables x_j that were fixed to v_j for j ∈ N′. We can
perform lifting one variable x_j at a time in some predefined order such as j = 1, . . . , p.
This approach is known as sequential lifting and is the most commonly used form of
lifting. We mention, however, that it can sometimes be beneficial to lift several variables
x_j for j ∈ N′ at the same time; see Zemel [138] and Gu et al. [60]. This variant of
lifting is called simultaneous lifting.
Assume that the variables x_1, . . . , x_{i−1} have already been lifted and

Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j ≤ δ   (2–12)

is valid for K(N′ \ {1, . . . , i − 1}, v). Lifting the variable x_i for i ∈ N′ in the inequality
(2–12) amounts to deriving a coefficient α_i for which the lifted inequality

α_i(x_i − v_i) + Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j ≤ δ   (2–13)

is valid for K(N′ \ {1, . . . , i}, v). To find α_i, we define the lifting function

Φ_i(a) = δ − max  Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j
         s.t.  Σ_{j=1}^{i−1} a_j(x_j − v_j) + Σ_{j∈N\N′} a_j x_j ≤ d − a,   (2–14)
               x_j ∈ {0, 1}, ∀j ∈ {1, . . . , i − 1} ∪ (N \ N′),

associated with the inequality (2–12).
Theorem 2.14 (Wolsey [130]). Assume that the optimization problem defining Φ_i(a_i) is
feasible. Inequality (2–13) is valid for K(N′ \ {1, . . . , i}, v) if α_i ≤ Φ_i(a_i) when v_i = 0, or
α_i ≥ −Φ_i(−a_i) when v_i = 1. Moreover, if

1. (2–12) defines a face of conv(K(N′ \ {1, . . . , i − 1}, v)) of dimension k, and

2. α_i = Φ_i(a_i) when v_i = 0 or α_i = −Φ_i(−a_i) when v_i = 1,

then (2–13) defines a face of conv(K(N′ \ {1, . . . , i}, v)) of dimension at least k + 1.
Theorem 2.14 describes how to sequentially lift binary variables inside of 0−1
knapsack constraints. Lifting for general integer variables was used in Ceria et al. [30].
Lifting for continuous variables was first used by Marchand and Wolsey [83] where the
authors lift a single continuous variable without upper bounds inside a 0−1 mixed-integer
knapsack set. Richard et al. [98] proposed a general theory for the lifting of multiple
continuous variables with bounds.
We observe in Theorem 2.14 that a different lifting function Φi(a) must be computed
to determine the lifting coefficient of each lifted variable. In general, computing the lifting
function (2–14) even at a single point can be computationally time-consuming. Some of
these difficulties disappear when the lifting function is well-structured.
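For small instances, the lifting function (2–14) can be evaluated by plain enumeration. Below is a toy instance of our own (the data are assumptions, not from the dissertation): the knapsack 5x1 + 5x2 + 5x3 + 3x4 ≤ 10 with seed cover inequality x1 + x2 + x3 ≤ 2 valid for the restriction x4 = 0; this is the v_i = 0 case with no previously lifted variables, so the (x_j − v_j) terms vanish:

```python
from itertools import product

def lifting_coefficient(a, d, free, alpha, delta, a_i):
    """Evaluate Phi_i(a_i) = delta - max{ sum_j alpha_j x_j :
    sum_j a_j x_j <= d - a_i, x binary } by enumeration, as in (2-14).
    `free` lists the indices appearing in the seed inequality."""
    best = float("-inf")
    for x in product((0, 1), repeat=len(free)):
        if sum(a[j] * x[k] for k, j in enumerate(free)) <= d - a_i:
            best = max(best, sum(alpha[j] * x[k] for k, j in enumerate(free)))
    return delta - best

# toy knapsack K = {x in {0,1}^4 : 5x1 + 5x2 + 5x3 + 3x4 <= 10};
# the cover inequality x1 + x2 + x3 <= 2 is valid when x4 is fixed to 0
a = {1: 5, 2: 5, 3: 5, 4: 3}
alpha = {1: 1, 2: 1, 3: 1}
phi = lifting_coefficient(a, 10, [1, 2, 3], alpha, 2, a[4])
print(phi)   # 1 -> lifted inequality x1 + x2 + x3 + x4 <= 2
```

The lifted inequality x1 + x2 + x3 + x4 ≤ 2 is valid for the full knapsack set, as can be confirmed by enumerating all 16 binary points; of course, such enumeration is exactly the computational burden that superadditive lifting (next section) is designed to avoid.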
2.2.2.2 Sequence-independent lifting
To improve computational efficiency of sequential lifting, Wolsey [132] introduced the
concept of sequence-independent lifting. This method reduces the computational burden
associated with lifting by identifying conditions under which the lifting function does not
change during the various stages of lifting.
Definition 2.13 (Superadditive). Let Φ : W ⊆ R → R. The function Φ is superadditive
over W if

Φ(w_1) + Φ(w_2) ≤ Φ(w_1 + w_2) for all w_1, w_2, w_1 + w_2 ∈ W.
For 0−1 integer programs, Wolsey [132] proved that, if a lifting function is superadditive,
lifting coefficients are independent of the lifting order. Gu et al. [62] generalized the
concept of sequence-independent lifting to 0−1 mixed-integer programs. Atamturk [8]
generalized these results to general mixed-integer programs.
Theorem 2.15 (Gu et al. [62]). If the lifting function Φi(w) is superadditive over R, then
Φi(w) = Φi+1(w).
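Superadditivity (Definition 2.13) can be probed numerically on a finite grid; this is only a sanity check of ours, since a grid cannot certify superadditivity over all of W. The floor function, a prototypical superadditive function on R_+, passes, while w ↦ min(w, 1) does not:

```python
import math

def is_superadditive(phi, ws, tol=1e-9):
    """Check phi(w1) + phi(w2) <= phi(w1 + w2) for all sampled pairs
    whose sum is also a sample point."""
    return all(phi(w1) + phi(w2) <= phi(w1 + w2) + tol
               for w1 in ws for w2 in ws if (w1 + w2) in ws)

ws = [k / 4 for k in range(41)]   # grid on [0, 10]; quarters are exact in binary
print(is_superadditive(math.floor, ws))              # True
print(is_superadditive(lambda w: min(w, 1.0), ws))   # False (fails at w1=w2=1)
```

A failed check immediately disproves superadditivity; a passed check only motivates an analytical proof, which is what the superadditive approximations of Gu et al. [62] provide in the cases where the exact lifting function fails.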
A superadditive lifting function is useful for deriving lifted inequalities efficiently.
Unfortunately, lifting functions are not always superadditive. For these situations, Gu
et al. [62] proposed to use superadditive approximations of the lifting function. Further,
they identify validity, dominance, and maximality to be common properties of good
superadditive approximations. Sequence-independent lifting has been used to derive strong
valid inequalities for various problems; see Marchand and Wolsey [83], Gu et al. [61],
Atamturk and Rajan [11], and Atamturk [7].
To lift multiple bounded continuous variables, Richard et al. [99] introduced the
concept of superlinear lifting that is a natural counterpart to superadditive lifting for
integer variables. We refer the interested reader to Richard et al. [99].
CHAPTER 3
MOTIVATION AND RESEARCH STATEMENTS
3.1 Motivation
When comparing state-of-the-art solvers, it can be readily observed that solving
MINLPs to global optimality requires more computational time than solving MILPs.
This is because traditional convexification methods do not always construct strong convex
relaxations. As discussed in Chapter 2, currently prevalent convexification techniques
derive convex relaxations of nonconvex MINLP problems by relaxing inequalities of the
form g(x) ≥ r into ḡ(x) ≥ r, where ḡ(x) is a concave overestimator of the function g(x).
Tawarmalani and Sahinidis [121] discuss how tight overestimators for various kinds of
functions can be constructed to produce such relaxations. However, the derived relaxation
can be weak because these methods do not use right-hand-side information during the
construction of the convex relaxations.
As an illustrative example, consider the simple set S defined as

S = {(x, y, z) ∈ R^3_+ | xy + z ≥ r},

where r > 0. It can be easily seen that S is not a convex set since both (√r, √r, 0) and
(0, 0, r) belong to S while their convex combination with a weight of 1/2 on each point does
not. The feasible region of S for r = 2 is represented in Figure 3-1 (a), where it can be
observed to be nonconvex.
First, we consider the set S where there are no upper bounds on the variables x, y,
and z. In this case, we can verify that the concave envelope of g(x, y, z) = xy + z is infinite
if both x and y have non-zero values. As a result, the convex relaxation of S obtained by
replacing g(x, y, z) ≥ r with the requirement that the concave envelope of g be at least r
is given by

R(S) = {(x, y, z) ∈ R^3_+ | x > 0, y > 0} ∪ {(x, y, z) ∈ R^3_+ | z ≥ r, xy = 0}.
This set is not closed and is therefore unlikely to be used as a relaxation. Its closure can
be observed to be R3+. Therefore, the above relaxation scheme corresponds in essence to
dropping the original constraint in the relaxed problem. We observe in Figure 3-1 (b) that
this is clearly not the best convex relaxation. In fact, we will establish in Chapter 4 that
the convex hull of S can be expressed as
conv(S) = {(x, y, z) ∈ R^3_+ | √(xy/r) + z/r ≥ 1}.

Note that, in the above expression, the right-hand-side plays a different role in each term.
It therefore cannot be naturally obtained from an inequality of the form ḡ(x, y, z) ≥ r.
Figure 3-1. Geometric illustration of S, conv(S), S1 and S2: (a) S; (b) conv(S); (c) S1 and S2.
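Both claims are easy to verify numerically for r = 2 (an illustrative check of ours, not part of the dissertation): the midpoint of (√r, √r, 0) and (0, 0, r) violates xy + z ≥ r, yet it satisfies the convex-hull inequality √(xy/r) + z/r ≥ 1:

```python
import math

r = 2.0
in_S    = lambda x, y, z: min(x, y, z) >= 0 and x * y + z >= r
in_conv = lambda x, y, z: math.sqrt(x * y / r) + z / r >= 1 - 1e-12

p, q = (math.sqrt(r), math.sqrt(r), 0.0), (0.0, 0.0, r)
mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
print(in_S(*p), in_S(*q), in_S(*mid))  # True True False: S is nonconvex
print(in_conv(*mid))                   # True: the midpoint lies in conv(S)
```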
Next, consider the set SB where the variables have upper bounds, and assume r = 2
for simplicity, i.e.,

SB = {(x, y, z) ∈ R^3_+ | xy + z ≥ 2, x ≤ 4, y ≤ 4, z ≤ 3}.
Considering bounds on variables is typically necessary when constructing convexifications
inside of the branch-and-bound tree, where the feasible region has been partitioned
into smaller subsets by branching on variables. In this case, since the concave envelope
of g(x, y, z) is polyhedral, we obtain the following convex relaxation using factorable
relaxation techniques:

FR(SB) = {(x, y, z) ∈ R^3_+ | 3(x + y) + 8z − 24 ≤ 0, 4x + z ≥ 2, 4y + z ≥ 2, x ≤ 4, y ≤ 4, z ≤ 3}.
This relaxation is not the convex hull of SB since conv(SB) is not polyhedral. Therefore,
even in the case of bounded variables, better relaxations than those currently used can be
found.
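The constraints 4x + z ≥ 2 and 4y + z ≥ 2 of FR(SB) arise from the concave (McCormick) envelope of the bilinear term xy over the box [0, 4] × [0, 4], which reduces to min(4x, 4y) when the lower bounds are zero. A quick illustrative check of ours that this envelope overestimates xy on the box:

```python
def mccormick_over(x, y, xL, xU, yL, yU):
    """Concave (McCormick) overestimator of the bilinear term x*y on a box."""
    return min(xU * y + yL * x - xU * yL, xL * y + yU * x - xL * yU)

# on [0,4] x [0,4] the envelope is min(4y, 4x); it dominates x*y everywhere
grid = [i * 0.5 for i in range(9)]                 # 0, 0.5, ..., 4
assert all(mccormick_over(x, y, 0, 4, 0, 4) >= x * y for x in grid for y in grid)
# relaxing xy + z >= 2 with min(4x, 4y) + z >= 2 yields exactly the two
# linear constraints 4x + z >= 2 and 4y + z >= 2 appearing in FR(SB)
print("min(4x, 4y) overestimates xy on [0,4]^2")
```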
It is common to derive convexifications from single constraints of the problem
rather than by considering multiple constraints simultaneously. As we discussed in
Section 2.1, multilinear and bilinear constraints are important in the derivation of tight
convexifications as they are common inside of nonlinear programs and their factorable
reformulations. Therefore, in this thesis, we study bilinear covering sets defined by a single
nonlinear inequality of the form

Σ_{j∈N} a_j x_j y_j ≥ d,   (3–1)

where a_j > 0, x ∈ X, and y ∈ Y. This set arises in many practical problems and
theoretical studies. In particular, for the case where X ⊆ Z^n and Y ⊆ Z^n, bilinear
covering constraints (3–1) can be found in Harjunkoski et al. [64], as we will discuss in
Chapter 4. For the case where X ⊆ {0, 1}^n and Y ⊆ [0, 1]^n, (3–1) is shown in Chapter 5 to
yield a relaxation of certain single-node flow models that have been studied in the integer
programming literature.
3.2 Problem Statements
The premise of this thesis is that it is possible to build tighter convex relaxations of
MINLPs by considering right-hand-side information. Constructing tight convexifications
of MINLPs is an important practical problem because these relaxations can improve the
efficiency of branch-and-bound methods for MINLPs. This would, in turn, help increase
the performance of current state-of-the-art solvers. Although considering right-hand-side
information is not common in MINLP, techniques for improving relaxations that use such
information are common in MILP, where they have been shown to significantly improve
bounds and to reduce the size of branch-and-bound trees. In this thesis, we are therefore interested
in investigating how such methods can be translated to MINLP.
3.2.1 Strong Valid Inequalities for Orthogonal Disjunctions and Bilinear Covering Sets
We can see in Figure 3-1 (c) that the restricted orthogonal subsets,
S1 := S ∩ {z = 0} = {(x, y, 0) | xy ≥ r}
and
S2 := S ∩ {x = 0, y = 0} = {(0, 0, z) | z ≥ r}
completely determine the convex hull of S. Based on this observation, we investigate how
to obtain a closed form expression for conv(S) and many other similar sets in Chapter 4.
In particular, using disjunctive programming, we develop a new convexification tool for
nonlinear sets. Our tool characterizes the convex hull of orthogonal disjunctive sets in
closed-form under some technical conditions. The results differ from current approaches
in that the resulting expressions do not contain exogenous variables. We then show that,
similar to Figure 3-1 (c), the convex hull of many nonlinear sets is completely dictated by
their restrictions over orthogonal subspaces. We provide sufficient conditions to check this
particular type of convex extension property. We conclude by illustrating how our tools
can be used to obtain the convex hulls of certain nonlinear sets.
The convexification tool we develop is useful since it provides a closed-form expression
of convex hulls of many nonlinear sets. However, it is not completely general as it typically
requires that variables are not bounded above.
3.2.2 Lifted Inequalities for 0−1 Mixed-Integer Bilinear Covering Sets with Bounded Variables
In Chapters 5 and 6, we study 0−1 mixed-integer bilinear covering sets since they
are among the simplest mixed-integer nonlinear sets that have upper bounds on both integer
and continuous variables. We investigate in Chapter 5 how to apply lifting techniques for
these sets. Using sequence-independent lifting, we derive strong valid inequalities for the
convex hull of these sets. We also show that the bilinear covering sets are similar to the
single-node flow models with respect to their polyhedral structure. As a result, we prove
that our results yield generalizations of the classical lifted flow cover inequalities in integer
programming. We then test the practical impacts of these results through a computational
study in Chapter 6.
CHAPTER 4
STRONG VALID INEQUALITIES FOR ORTHOGONAL DISJUNCTIONS AND BILINEAR COVERING SETS 1
4.1 Introduction
In Chapter 3, when considering the set

S = {(x, y, z) ∈ R^3_+ | xy + z ≥ r},

we discussed that traditional techniques for relaxing the inequality of S would simply
drop the constraint to produce R^3_+, a relaxation that does not consider right-hand-side
information. In this chapter, we propose a scheme that produces tighter convex relaxations
by considering the right-hand-side of the constraint. In particular, for the set S presented
above, our scheme produces the following convex relaxation

RS = {(x, y, z) ∈ R^3_+ | √(xy/r) + z/r ≥ 1},
which is a much tighter approximation than R3+. Considering this simple example, we
can make three observations. First, the relaxation, RS, is nonlinear. This is in contrast
to current implementations of nonlinear branch-and-bound that typically construct linear
relaxations for multivariate terms; see Tawarmalani and Sahinidis [123]. Second, the form
of the nonlinear cut is surprising as it applies different functions to the different terms
of the initial inequality. For S, the first term is modified using a square-root after being
divided by r, while the second is simply divided by r. Third, RS is not only a convex
relaxation of S, but it is in fact (as will be shown later) the convex hull of S. These
observations generalize to many polynomial covering sets; see Tawarmalani et al. [117].
Surprisingly, the convex hull for these sets can be expressed in a simple form without
1 The material of this chapter is based on [118].
introducing new variables while developing the concave envelope of the corresponding
polynomial can be much harder.
The convex hull representation for bilinear covering sets arises from a general theory
of orthogonal disjunctions that we develop in this chapter. To provide an example,
consider the set S again. We will show that the convex hull of S is determined by the
points of S that either belong to the half-plane (x, y, 0), where (x, y) ∈ R2+ or to the
half-line (0, 0, z), where z ∈ R+. In other words, the set S satisfies the convex extension
property (see Tawarmalani and Sahinidis [120]) and the important subsets of S belong
to orthogonal subspaces. Because the convex extension property holds, it is natural to
expect that one could build a higher dimensional description of the convex hull of S using
disjunctive programming arguments; see Rockafellar [102] and Balas [17]. Disjunctive
programming has been used to develop tight relaxations and cutting planes in integer,
nonlinear, and robust optimization; see [9, 13, 22, 31, 107, 111, 113, 119]. Unlike our
result, the literature on disjunctive programming formulations mostly focuses on naturally
disjunctive sets. Cutting planes based on disjunctive formulations, are typically linear
and derived by solving separation problems over extended formulations; see Cornuejols
and Lemarechal [38]. One interesting observation in this chapter is that, as long as the
disjunctive terms are orthogonal and a few technical conditions are satisfied, there is no
need to introduce additional variables. Furthermore, the convex hull of S can be easily
expressed in closed-form using the representations of the convex hull of S in each of the
two orthogonal subspaces, namely √(xy/r) ≥ 1 and z/r ≥ 1. We establish a much more
general set of conditions under which the argument evoked above is correct, allowing the
use of both right-hand-side and left-hand-side information in the derivation of convex
relaxations for nonlinear programming. Our results rely on the ability to prove that a
convex extension property holds over orthogonal disjunctions and the ability to derive
closed form expressions of convex hulls (possibly in a higher dimensional space) over
each of the subspaces. Our techniques are applicable to large families of problems and
yield convex approximations that are stronger than those currently used in nonlinear
branch-and-bound solvers; see Tawarmalani et al. [117].
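Validity of the nonlinear inequality over all of S has a one-line argument: with a = xy/r and b = z/r we have a + b ≥ 1, and √a ≥ a whenever 0 ≤ a ≤ 1, so √a + b ≥ a + b ≥ 1 (the case a ≥ 1 is immediate). A randomized sanity check of ours:

```python
import math, random

r = 2.0
rng = random.Random(0)
for _ in range(10000):
    x, y, z = (rng.uniform(0.0, 5.0) for _ in range(3))
    if x * y + z >= r:                      # (x, y, z) lies in S
        # sqrt(a) >= a on [0,1], hence sqrt(a) + b >= a + b >= 1
        assert math.sqrt(x * y / r) + z / r >= 1.0 - 1e-9
print("the cut sqrt(xy/r) + z/r >= 1 holds at every sampled point of S")
```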
In Section 4.2, we describe a tool to obtain the convex hull of orthogonal disjunctive
sets. The result can be invoked under certain technical conditions. We provide tools to
verify these assumptions. We also provide counterexamples to show the need for the
assumptions. The split cut for mixed-integer polyhedral sets is shown to be a special case
of our general convexification tool. In Section 4.3, we illustrate the application of the
tool in nonlinear integer programming by convexifying bilinear pure/mixed-integer sets.
Nonconvex inequalities in continuous variables are not naturally disjunctive. For such
inequalities, we establish sufficient conditions under which the convex extension property
holds over the non-negative orthant. We show that these sufficient conditions are satisfied
by continuous bilinear covering sets and develop their convex hulls over the non-negative
orthant. We summarize the contributions of this work in Section 4.4 and conclude with
remarks and directions for future research.
4.2 Convexification of Orthogonal Disjunctive Sets
In this section, we first introduce and prove a general result that exposes the
closed-form convex hull inequality description of the disjunctive union of a finite number
of sets defined over subspaces that are orthogonal to each other. This result also applies
to non-disjunctive sets provided that their convex hulls are entirely defined by their
restrictions over a finite number of orthogonal subspaces. We then illustrate the utility of
this result in finding convex hull descriptions. We discuss the need for certain seemingly
technical assumptions in the statement of the result. In particular, we discuss each one
of the four assumptions of the theorem and describe, with examples, situations where
they are satisfied. For some of the assumptions, we establish sufficient conditions that
are simple to verify. We then show that the cuts that yield the convex hull, under the
specified technical conditions, continue to produce valid inequalities even when some
of the conditions are not satisfied. Throughout, we demonstrate the generality and
applicability of our convexification result by deriving new convex hull descriptions of
various continuous, mixed, and pure-integer bilinear covering sets, and providing an
alternate derivation of the classic split cut in mixed-integer programming.
In the following, given a set S, we represent its convex hull by conv(S), its closure by
cl(S), and its projection onto the space of z variables by proj_z S. For a closed convex set
S, we denote the set of its recession directions by 0+(S). When we display equations, we
sometimes write min(f(z); g(z)) with stacked arguments to denote min{f(z), g(z)}.
While convexifying a given set S, we will often consider its orthogonal restrictions,
which we will denote by S_i for i ∈ {1, . . . , n} and define as S_i = {z | z = (z_1, . . . , z_n) ∈
S, z_j = 0 ∀j ≠ i}. To simplify the forthcoming discussions and proofs, we next introduce
notations that help in converting descriptions of points in the sets S_i, for i ∈ {1, . . . , n},
to descriptions of points in S and vice-versa. Let (z_1, . . . , z_n) ∈ R^{Σ_{i=1}^n d_i}, z_i ∈ R^{d_i},
N = {1, . . . , n}, and A = {i_1, . . . , i_p} ⊆ N. Then, z_A denotes (z_i)_{i∈A} ∈ R^{Σ_{i∈A} d_i}, i.e.,
A provides the index set of subspaces into which z is projected. Conversely, z_A may be
injected into the original space by setting the missing coordinates to zero. To succinctly
express this operation, we introduce the following notation. For each j ∈ {1, . . . , m}, let
z^j = (z^j_i)_{i=1}^n, where z^j_i ∈ R^{d^j_i}. Then, given a^j ∈ R^{Σ_{k=1}^p d^j_{i_k}}, we denote by L(A; a^1, . . . , a^m)
the vector (z^1, . . . , z^m), where for all j, z^j_A = a^j and z^j_{N\A} = 0. When A is a singleton {i},
we write it as i itself. For each j, the above notation injects a^j into the space of the z^j
variables by setting the coordinates indexed by A according to their corresponding values
in a^j, and the remaining coordinates to zero. For example, L({1, 3}; (z_1, z_3), (u_1, u_3))
equals (z_1, 0, z_3, 0, . . . , 0; u_1, 0, u_3, 0, . . . , 0), where the semi-colon is used to demarcate the
z and u vectors. Throughout the text, we will mostly use the following two expressions:
L(i, (z_i, u_i)) to denote the vector (0, 0, . . . , 0, 0, z_i, u_i, 0, 0, . . . , 0, 0) and L(i; z_i, u_i) to denote
the vector (0, . . . , 0, z_i, 0, . . . , 0; 0, . . . , 0, u_i, 0, . . . , 0), where the semi-colon delineates the
vector z = (z_1, . . . , z_n) from the vector u = (u_1, . . . , u_n).
We next introduce a notation to express a generic set described via inequalities.
Consider functions t_j : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for j ∈ J, v_k : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for
k ∈ K, and w_l : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for l ∈ L. Let (z, u) ∈ R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i}. Then,
we denote by A(t_J, v_K, w_L) the following set:

A(t_J, v_K, w_L) := {(z, u) | t_j(z, u) ≥ 1, ∀j ∈ J; v_k(z, u) ≥ −1, ∀k ∈ K; w_l(z, u) ≥ 0, ∀l ∈ L},

where J, K, and L are the index sets of inequalities with right-hand-sides 1, −1, and 0,
respectively. Note that there is no loss of generality in assuming that the right-hand-sides
are 1, −1, and 0 since the defining functions can be scaled by a positive multiplier to
satisfy this condition. We will often be interested in the set where the right-hand-sides of
the above inequalities are replaced with 0. This set is denoted by C(t_J, v_K, w_L) and is
defined formally as follows:

C(t_J, v_K, w_L) := {(z, u) | t_j(z, u) ≥ 0, ∀j ∈ J; v_k(z, u) ≥ 0, ∀k ∈ K; w_l(z, u) ≥ 0, ∀l ∈ L}.
Among alternative inequality descriptions of sets, we will see that it is often beneficial
to choose those that involve positively homogeneous functions. We present this concept in
the following definition; see §4 in [102] for details.
Definition 4.1. A function f : R^n → [−∞, ∞] is said to be positively homogeneous if,
for all λ > 0, f(λz) = λf(z).
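Both functions used earlier to describe the convex hull of S, namely √(xy/r) and z/r, are positively homogeneous (scaling (x, y) by λ scales √(xy/r) by λ). An illustrative numeric check of ours, for r = 2:

```python
import math, random

t = lambda x, y: math.sqrt(x * y / 2.0)   # the function sqrt(xy/r) with r = 2
rng = random.Random(1)
for _ in range(1000):
    x, y = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
    lam = rng.uniform(0.1, 10.0)
    # positive homogeneity: t(lam*x, lam*y) = lam * t(x, y)
    assert abs(t(lam * x, lam * y) - lam * t(x, y)) < 1e-8
print("sqrt(xy/r) is positively homogeneous")
```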
We now describe our main convexification result.
Theorem 4.1. Let S ⊆ R^{Σ_i d_i} and let the points z of S be written as z = (z_1, . . . , z_i, . . . , z_n) ∈
S, where z_i ∈ R^{d_i}. For i ∈ N = {1, . . . , n}, let S_i ⊆ S. Assume that:

(A1) if (z_1, . . . , z_i, . . . , z_n) ∈ S_i, then z_j = 0 for all j ≠ i,

(A2) conv(S) = conv(∪_{i=1}^n S_i),

(A3) there exist, for i ∈ N, positively-homogeneous functions t^j_i for j ∈ J_i, v^k_i for
k ∈ K_i, and w^l_i for l ∈ L_i such that conv(S_i) ⊆ proj_z A_i ⊆ cl(conv(S_i)), where²

A_i = {L(i, (z_i, u_i)) | (z_i, u_i) ∈ A(t^{J_i}_i, v^{K_i}_i, w^{L_i}_i)},   (4–1)

(A4) proj_z C_i, where C_i = {L(i, (z_i, u_i)) | (z_i, u_i) ∈ C(t^{J_i}_i, v^{K_i}_i, w^{L_i}_i)}, is a subset of the
recession cone of cl conv(∪_{i=1}^n S_i), i.e., for all i,

proj_z C_i ⊆ 0+(cl conv(∪_{i=1}^n S_i)).

Then, conv(S) ⊆ proj_z X ⊆ cl conv(S), where³

X = {(z, u) | Σ_{i∈N} t^{j_i}_i(z_i, u_i) ≥ 1, ∀(j_i)_{i∈N} ∈ ∏_{i∈N} J_i;
              Σ_{i∈I} v^{k_i}_i(z_i, u_i) ≥ −1, ∀I ⊆ N, ∀(k_i)_{i∈I} ∈ ∏_{i∈I} K_i;
              t^{j_i}_i(z_i, u_i) + v^{k_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀j_i ∈ J_i, ∀k_i ∈ K_i;
              t^{j_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀j_i ∈ J_i;
              w^{l_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀l_i ∈ L_i}.   (4–2)
Before proving Theorem 4.1, we briefly comment on its assumptions, its practical
importance, and its applicability. In Assumption (A2), we impose that any point in
S can be expressed as a convex combination of points in some of the sets Si. This
implies that only the subsets Si, for i = 1, . . . , n are needed when computing the
convex hull of S. In Assumption (A1), we require that these subsets belong to linear
subspaces that are orthogonal to each other. In Assumption (A3), we require that
an inequality description of the convex hull of each one of the sets Si is known. Note
that this inequality description might make use of an extended formulation (using the
² As defined above, L(i, (z_i, u_i)) denotes (0, 0, . . . , 0, 0, z_i, u_i, 0, 0, . . . , 0, 0).

³ Here, and onwards, ∏_{i∈N} J_i denotes J_1 × · · · × J_n.
additional variables ui). Note also that in Theorem 4.1, we require that all inequalities
are defined using positively-homogeneous functions. We will show later that weaker
assumptions are sufficient to establish the validity of the cuts derived in Theorem 4.1
and that positive-homogeneity guarantees that the inequalities produced are strong. In
Assumption (A4), we impose, in essence, that the recession directions of each one of the
sets Ai are also recession directions for the closure convex hull of the union of the sets Si.
Under these four assumptions, we show that an inequality description of the convex
hull of S can be obtained by combining in a systematic way the inequalities arising in the
convex hull descriptions of the subsets Si, for i = 1, . . . , n. Note however that, for reasons
that will be described later, this inequality description might describe a superset of the
desired convex hull. However, the superset will never be larger than the closure convex
hull of S, which is sufficient for all practical purposes. This result bears some resemblance
to the work of Balas et al. [19] where the authors derive a closed-form representation
of the convex hull of certain orthogonal bounded linear polytopes using specialized
arguments. Theorem 4.1 generalizes this result as it allows the convexification of nonlinear
and possibly unbounded orthogonal disjunctive sets and therefore extends its applicability
to global optimization.
To prove Theorem 4.1, we introduce some notation. For T ⊆ N and λ_T ∈ R_+, we define:

    R_T(λ_T) = { (z_T, u_T) |  Σ_{i∈T} t_i^{j_i}(z_i, u_i) ≥ λ_T,              ∀(j_i)_{i∈T} ∈ ∏_{i∈T} J_i
                               Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −λ_T,             ∀I ⊆ T, ∀(k_i)_{i∈I} ∈ ∏_{i∈I} K_i
                               t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0,  ∀i ∈ T, ∀j_i ∈ J_i, ∀k_i ∈ K_i
                               t_i^{j_i}(z_i, u_i) ≥ 0,                        ∀i ∈ T, ∀j_i ∈ J_i
                               w_i^{l_i}(z_i, u_i) ≥ 0,                        ∀i ∈ T, ∀l_i ∈ L_i }.
In particular, note that R_N(1) = X. Whenever T is a singleton, say {i}, we denote R_{{i}}(λ_{{i}}) as R_i(λ_i). Further, we let

    Q = { (λ, z, u) |  λ_i ≥ 0, ∀i ∈ N;  (z_i, u_i) ∈ R_i(λ_i), ∀i ∈ N;  Σ_{i=1}^n λ_i = 1 }.    (4–3)
The proof of Theorem 4.1 will be carried out in two steps: (i) Lemma 4.1, which
exploits the disjunctive structure of the convex hull of S implied by Assumption (A2) to
construct a higher-dimensional representation of conv(S), see set Q defined in (4–3); and
(ii) Lemma 4.2, which projects this higher-dimensional representation to the space of the
original variables, see set X defined in (4–2). We now carry out the first step of the proof
in Lemma 4.1. In particular, we use Assumptions (A2) and (A3) to identify a set Q whose
projection in the z space is included in cl conv(S) and includes conv(S). The subsequent
lemma will then project out the λ variables added in the definition of Q to derive X.
Lemma 4.1. For S as defined in Theorem 4.1, conv(S) ⊆ projz Q ⊆ cl conv(S).
Proof. We first show that if z ∈ conv(∪_{i=1}^n S_i), it can be extended to a point that belongs to Q by suitably defining (λ, u). By Assumption (A2), this proves the first inclusion. If z ∈ conv(∪_{i=1}^n S_i), then, by Assumption (A1), there exist λ_i and z′_i such that

    z = (z_1, . . . , z_i, . . . , z_n) = Σ_{i=1}^n λ_i L(i, z′_i),

where, for each i, λ_i ≥ 0, L(i, z′_i) ∈ conv(S_i), and Σ_{i=1}^n λ_i = 1. By Assumption (A3), the points L(i, z′_i) ∈ conv(S_i) can be extended to L(i, (z′_i, u′_i)) ∈ A_i. Let u = Σ_{i=1}^n λ_i L(i, u′_i) so that (z, u) = Σ_{i=1}^n λ_i L(i, (z′_i, u′_i)). We reindex the S_i so that the sets containing the points associated with non-zero multipliers are indexed from 1 to t. Then, λ_i > 0 for i = 1, . . . , t and Σ_{i=1}^t λ_i = 1. Observe that λ_i z′_i = z_i and λ_i u′_i = u_i. Since R_i(1) = proj_{(z_i,u_i)} A_i, it follows that (z′_i, u′_i) ∈ R_i(1) for each i ∈ {1, . . . , t}, and, therefore,

    t_i^{j_i}(z′_i, u′_i) ≥ 1                          ∀j_i ∈ J_i
    v_i^{k_i}(z′_i, u′_i) ≥ −1                         ∀k_i ∈ K_i
    t_i^{j_i}(z′_i, u′_i) + v_i^{k_i}(z′_i, u′_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    t_i^{j_i}(z′_i, u′_i) ≥ 0                          ∀j_i ∈ J_i
    w_i^{l_i}(z′_i, u′_i) ≥ 0                          ∀l_i ∈ L_i.
After substituting (z′_i, u′_i) = (z_i/λ_i, u_i/λ_i) for each i ∈ {1, . . . , t} and multiplying both sides of the inequalities by the positive value λ_i, we obtain:

    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ λ_i                                  ∀j_i ∈ J_i
    λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ −λ_i                                 ∀k_i ∈ K_i
    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) + λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ 0                                    ∀j_i ∈ J_i
    λ_i w_i^{l_i}(z_i/λ_i, u_i/λ_i) ≥ 0                                    ∀l_i ∈ L_i.
The above argument can be used to express the convex hull of any disjunctive collection of convex sets by introducing the λ variables; see Theorem 1 of Ceria and Soares [31]. However, because t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are positively-homogeneous by Assumption (A3) and λ_i > 0, the above system of inequalities can be rewritten as:

    t_i^{j_i}(z_i, u_i) ≥ λ_i                      ∀j_i ∈ J_i
    v_i^{k_i}(z_i, u_i) ≥ −λ_i                     ∀k_i ∈ K_i
    t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    t_i^{j_i}(z_i, u_i) ≥ 0                        ∀j_i ∈ J_i
    w_i^{l_i}(z_i, u_i) ≥ 0                        ∀l_i ∈ L_i,
which implies that (z_i, u_i) ∈ R_i(λ_i). Therefore, it follows that, for each i ∈ {1, . . . , t}, (λ_i, z_i, u_i) is such that λ_i > 0 and (z_i, u_i) ∈ R_i(λ_i). Additionally, we set (z_i, u_i) = 0 for t < i ≤ n. Since t_i^{j_i}(0, 0) = λ t_i^{j_i}(0/λ, 0/λ) for λ > 0, it follows that t_i^{j_i}(0, 0) = 0. Similarly, for all i, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, t_i^{j_i}(0, 0) = w_i^{l_i}(0, 0) = v_i^{k_i}(0, 0) = 0. It follows that (0, 0) ∈ R_i(0). In other words, for each i ∈ N, (λ_i, z_i, u_i) is such that λ_i ≥ 0 and (z_i, u_i) ∈ R_i(λ_i). Therefore, (λ, z, u) ∈ Q.

Now, we show that if (λ, z, u) ∈ Q then z ∈ cl conv(∪_{i=1}^n S_i). Again by Assumption (A2), this proves the second inclusion. Clearly, if (λ, z, u) ∈ Q and λ_i > 0, then by positive homogeneity of t_i^{j_i}, v_i^{k_i}, and w_i^{l_i}, it follows that (z_i/λ_i, u_i/λ_i) ∈ R_i(1). As before, then (1/λ_i) L(i, (z_i, u_i)) ∈ A_i. Assume without loss of generality, by reindexing the S_i if necessary, that λ_i > 0 for i = 1, . . . , t and λ_i = 0 for i = t + 1, . . . , n. Then, it follows easily that L({1, . . . , t}, (z_1, u_1, . . . , z_t, u_t)) ∈ conv(∪_{i=1}^n A_i) since it can be expressed as a convex combination of points in ∪_{i=1}^t A_i. Since proj_z conv(∪_{i=1}^n A_i) ⊆ conv(∪_{i=1}^n proj_z A_i) and, by Assumption (A3), proj_z A_i ⊆ cl conv(S_i), it follows that L({1, . . . , t}, (z_1, . . . , z_t)) ∈ conv(∪_{i=1}^n cl conv(S_i)) ⊆ cl conv(∪_{i=1}^n S_i). Now, since λ_{t+1} = 0, then by Assumption (A4), it follows that L(t + 1, z_{t+1}) ∈ 0⁺(cl conv(∪_{i=1}^n S_i)). Therefore, L({1, . . . , t + 1}, (z_1, . . . , z_{t+1})) ∈ cl conv(∪_{i=1}^n S_i). By induction, z ∈ cl conv(∪_{i=1}^n S_i).
Lemma 4.1 deals with disjunctive sets and is inspired by the work in disjunctive
programming. We next describe the differences in our approach, which, although subtle,
play a significant role in obtaining our results. First observe that a significant emphasis
in the disjunctive programming literature is on facial disjunctive programs, see §6 in
Balas [15], since mixed 0−1 programs can be expressed in this form. It should be noted
that the disjunctive problem defined in Theorem 4.1 is not necessarily facial. In fact, the
disjunctions Si may lie in the interior of the convex hull (see Example 4.1 and Figure
4-1(b)). Nevertheless, this first step resembles Theorem 3.3 in Balas [16] for linear
disjunctive sets or Theorem 1 in Ceria and Soares [31] for convex disjunctive sets. We
however emphasize that the first step also exploits Assumption (A3) in which we assume
that we know positively homogeneous inequality representations of the sets Si to produce
a simplified high-dimensional representation of the convex hull; see (4–3).
Now, we carry out the second step of the proof in Lemma 4.2. In particular, we
prove that the projection of Q onto the space of (z, u) variables is X, whose closed-form
expression was already provided in (4–2).
Lemma 4.2. X = proj_{z,u} Q.
Proof. The proof proceeds by induction. Given two disjoint subsets A and B of N, we consider

    W = { (λ_A, λ_B, λ_{A∪B}, z_A, u_A, z_B, u_B) |  λ_A ≥ 0, (z_A, u_A) ∈ R_A(λ_A)
                                                     λ_B ≥ 0, (z_B, u_B) ∈ R_B(λ_B)
                                                     λ_A + λ_B = λ_{A∪B} },

and

    P = { (λ_{A∪B}, z_{A∪B}, u_{A∪B}) | λ_{A∪B} ≥ 0, (z_{A∪B}, u_{A∪B}) ∈ R_{A∪B}(λ_{A∪B}) }.

We first show that if z_{A∪B} = (z_A, z_B) and u_{A∪B} = (u_A, u_B), then P is the set obtained when λ_A and λ_B are projected out from W. Note that since A and B are disjoint and z_{A∪B} ∈ R^{Σ_{i∈A} d_i + Σ_{i∈B} d_i} = R^{Σ_{i∈A} d_i} × R^{Σ_{i∈B} d_i}, the definitions of z_{A∪B} and, similarly, u_{A∪B} are dimensionally consistent. We first substitute λ_B = λ_{A∪B} − λ_A and then project λ_A out using Fourier-Motzkin elimination. Note that λ_A appears linearly in all the inequalities defining W. Therefore, we are able to use a procedure similar to Theorem 1.4 in [141]. We substitute λ_B = λ_{A∪B} − λ_A in W to obtain:

    λ_A ≥ 0
    (z_A, u_A) ∈ R_A(λ_A)
    λ_{A∪B} − λ_A ≥ 0
    (z_B, u_B) ∈ R_B(λ_{A∪B} − λ_A).
On the one hand, note that the inequalities

    t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0  ∀i ∈ A ∪ B, ∀j_i ∈ J_i, ∀k_i ∈ K_i    (4–4)
    t_i^{j_i}(z_i, u_i) ≥ 0                        ∀i ∈ A ∪ B, ∀j_i ∈ J_i                (4–5)
    w_i^{l_i}(z_i, u_i) ≥ 0                        ∀i ∈ A ∪ B, ∀l_i ∈ L_i                (4–6)

remain untouched during projection since they are independent of λ_A. On the other hand, the inequalities containing λ_A can be rewritten as:

    min{ Σ_{i∈A} t_i^{j_i}(z_i, u_i),  λ_{A∪B} + min_{B′⊆B} Σ_{i∈B′} v_i^{k_i}(z_i, u_i) }
        ≥ λ_A ≥
    max{ λ_{A∪B} − Σ_{i∈B} t_i^{j_i}(z_i, u_i),  − min_{A′⊆A} Σ_{i∈A′} v_i^{k_i}(z_i, u_i) }

so that Fourier-Motzkin elimination is simple to perform. Observe that the constraints λ_{A∪B} − λ_A ≥ 0 and λ_A ≥ 0 are represented in the above system respectively when A′ = ∅ and B′ = ∅. Projecting λ_A out of the system, we obtain:

    Σ_{i∈A∪B} t_i^{j_i}(z_i, u_i) ≥ λ_{A∪B}    ∀(j_i)_{i∈A∪B} ∈ ∏_{i∈A∪B} J_i    (4–7)
    Σ_{i∈A} t_i^{j_i}(z_i, u_i) + Σ_{i∈A′} v_i^{k_i}(z_i, u_i) ≥ 0    ∀A′ ⊆ A, ∀(j_i)_{i∈A} ∈ ∏_{i∈A} J_i, ∀(k_i)_{i∈A′} ∈ ∏_{i∈A′} K_i    (4–8)
    Σ_{i∈B} t_i^{j_i}(z_i, u_i) + Σ_{i∈B′} v_i^{k_i}(z_i, u_i) ≥ 0    ∀B′ ⊆ B, ∀(j_i)_{i∈B} ∈ ∏_{i∈B} J_i, ∀(k_i)_{i∈B′} ∈ ∏_{i∈B′} K_i    (4–9)
    Σ_{i∈A′∪B′} v_i^{k_i}(z_i, u_i) ≥ −λ_{A∪B}    ∀A′ ⊆ A, ∀B′ ⊆ B, ∀(k_i)_{i∈A′∪B′} ∈ ∏_{i∈A′∪B′} K_i.    (4–10)
Inequalities (4–4) for i ∈ A′ and (4–5) for i ∈ A \ A′ imply that (4–8) is redundant. Similarly, (4–9) can be shown to be redundant. Observe that λ_{A∪B} ≥ 0 is represented in (4–10) by selecting A′ = B′ = ∅. Therefore, the set obtained by projecting λ_A and λ_B out of W is given by (4–4), (4–5), (4–6), (4–7), and (4–10), which is exactly the definition of P. By applying this result sequentially with A = {1, . . . , i′} and B = {i′ + 1} and increasing i′ from 1 to n − 1, we obtain that proj_{z,u} Q = R_N(1) = X.
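The projection step above is ordinary Fourier-Motzkin elimination once λ_B has been substituted out. As a sanity check, the following sketch eliminates λ from the linear system W associated with a small two-block instance (S_1 = {(z_1, 0) | 1 ≤ z_1 ≤ 2}, S_2 = {(0, z_2) | z_2 ≥ 1}, with λ_{A∪B} fixed to 1); the encoding of inequalities as coefficient/right-hand-side pairs and the function name fm_eliminate are implementation choices, not part of the text.

```python
from fractions import Fraction
from itertools import product

def fm_eliminate(ineqs, k):
    """Eliminate x_k from linear inequalities a.x <= b (pairs (a, b)) by
    Fourier-Motzkin: keep inequalities with a[k] == 0 and combine every
    pair with opposite signs on x_k so that the variable cancels."""
    zero, pos, neg = [], [], []
    for a, b in ineqs:
        (zero if a[k] == 0 else pos if a[k] > 0 else neg).append((a, b))
    out = list(zero)
    for (ap, bp), (an, bn) in product(pos, neg):
        lp, ln = Fraction(1, ap[k]), Fraction(-1, an[k])  # positive scalings
        out.append(([lp * x + ln * y for x, y in zip(ap, an)],
                    lp * bp + ln * bn))
    return out

# W for the two-block instance, variables x = (lam, z1, z2), lam_{A u B} = 1:
half = Fraction(1, 2)
W = [([1, -1, 0], 0),     # lam <= z1          (t-constraint of block 1)
     ([-1, half, 0], 0),  # -z1/2 >= -lam      (v-constraint of block 1)
     ([-1, 0, -1], -1),   # z2 >= 1 - lam      (t-constraint of block 2)
     ([-1, 0, 0], 0),     # lam >= 0
     ([1, 0, 0], 1),      # lam <= 1
     ([0, -1, 0], 0),     # z1 >= 0
     ([0, 0, -1], 0)]     # z2 >= 0
proj = fm_eliminate(W, 0)
for a, b in proj:
    print([str(c) for c in a], str(b))
```

Among redundant rows, the projection contains [0, −1, −1]·x ≤ −1 (i.e., z_1 + z_2 ≥ 1) and [0, 1/2, 0]·x ≤ 1 (i.e., z_1 ≤ 2), exactly the non-trivial inequalities the theorem produces for this instance.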
Lemma 4.2 projects a higher-dimensional representation of the convex hull to the
space of the original variables. For linear systems, such a projection can be obtained
algorithmically using the wrapping procedure, see Fukuda et al. [55]; the Fourier-Motzkin
procedure, see Ziegler [141]; or the extreme-ray characterization of the projection cone, see
Sections 1, 2 and 5 of Balas [18] for a discussion of projection in the context of disjunctive
programming. However, the projected set is rarely described in closed-form. Using
Assumption (A1) in which we assume that the sets we convexify are orthogonal, we show
that the projection can be obtained in closed-form, despite the fact that S is nonlinear; see
(4–2).
The proof of Theorem 4.1 is now straightforward.
Proof of Theorem 4.1. It suffices to show that conv(S) ⊆ projz Q = projz X ⊆ cl conv(S)
where the inclusions follow from Lemma 4.1 and the equality follows from Lemma 4.2.
The proof exposes some differences between Theorem 4.1 and the results of Balas [15].
Although it is clear in Balas [15], Balas et al. [20], and Balas [18] that valid inequalities
for the convex hull of the disjunctive union of polyhedral sets can be obtained by
projecting down its high-dimensional representation onto the initial space of variables, this
projection is usually not performed explicitly. Instead, with orthogonal disjunctions and
positively-homogeneous functions, we show in Lemma 4.2 and the proof of Theorem 4.1
that Fourier-Motzkin elimination can be used to obtain a closed-form expression of the
convex hull in the space of the original variables. Further, earlier studies recommend
solving a cut generation linear program to generate valid inequalities for separating
solutions that do not belong to the convex hull of the disjunctive union. In contrast, it is
straightforward to find an inequality that separates X from a point that does not belong
to X.
Next, we illustrate the use of Theorem 4.1 in deriving the convex hulls of several simple orthogonal disjunctive sets. In particular, we describe a situation where there exists an i′ ∈ N such that J_{i′} = ∅. Then, it follows by Assumption (A3) that 0 ∈ cl conv(S). In other words, X cannot include any inequality of the form Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1. Indeed, since J_{i′} = ∅, it follows that ∏_{i=1}^n J_i = ∅.

Example 4.1. Consider first an instance where, for each i, J_i ≠ ∅. In particular, consider S ⊆ R²_+, defined as S = S_1 ∪ S_2, where S_1 = {(z_1, 0) | 1 ≤ z_1 ≤ 2} and S_2 = {(0, z_2) | z_2 ≥ 1}. It can be easily verified that

    conv(S) = { (z_1, z_2) | z_1 + z_2 ≥ 1, z_1 ≥ 0, z_1 < 2, z_2 ≥ 0 } ∪ {(2, 0)}.
We now apply the convexification tool of Theorem 4.1 to S and derive a set X that contains conv(S) but is no larger than cl conv(S). First, we verify that the set S satisfies the assumptions of Theorem 4.1. Clearly, Assumptions (A1) and (A2) hold by the definition of S. Next, it is easy to verify that conv(S_1) = { (z_1, 0) | z_1 ≥ 1, −(1/2)z_1 ≥ −1 } and conv(S_2) = { (0, z_2) | z_2 ≥ 1 }. Since z_1, −(1/2)z_1, and z_2 are linear, and therefore positively-homogeneous, Assumption (A3) clearly holds. Finally, since C_1 = {(0, 0)} ⊆ 0⁺(cl conv(S)) and C_2 = { (0, z_2) | z_2 ≥ 0 } ⊆ 0⁺(cl conv(S_2)) ⊆ 0⁺(cl conv(S)), Assumption (A4) also holds. Applying Theorem 4.1, we obtain that

    X = { (z_1, z_2) | z_1 + z_2 ≥ 1, z_1 ≤ 2, z_1 ≥ 0, z_2 ≥ 0 }.
In fact, it is apparent for this example that X = cl conv(S); see Figure 4-1(a).

Consider now instances where J_2 = ∅. In particular, let the set S′ ⊆ R² be defined as S′ = S′_1 ∪ S′_2, where S′_1 = {(z_1, 0) | z_1 ≥ 1} and S′_2 = {(0, z_2) | z_2 ≥ −1}. Then, it follows easily that conv(S′) = { (z_1, z_2) | z_1 ≥ 0, z_2 > −1 } ∪ {(0, −1)}; see Figure 4-1(b). Theorem 4.1 yields X′ = { (z_1, z_2) | z_1 ≥ 0, z_2 ≥ −1 }, which is cl conv(S′). Similarly, now consider S′′ = S′′_1 ∪ S′′_2, where S′′_1 = {(z_1, 0) | z_1 ≥ −1} and S′′_2 = {(0, z_2) | z_2 ≥ −1}. Then,

    conv(S′′) = { (z_1, z_2) | z_1 + z_2 ≥ −1, z_1 > −1, z_2 > −1 } ∪ {(−1, 0)} ∪ {(0, −1)},
see Figure 4-1(c). In this case, Theorem 4.1 yields

    X′′ = { (z_1, z_2) | z_1 ≥ −1, z_2 ≥ −1, z_1 + z_2 ≥ −1 },

which is cl conv(S′′).
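A quick numerical check (a sketch; the sampling ranges and the helper name in_Xpp are arbitrary choices) that X′′ contains every convex combination of points from S′′_1 and S′′_2, and that the inequality z_1 + z_2 ≥ −1 cuts off points that the variable bounds alone would allow:

```python
import random

def in_Xpp(z1, z2, tol=1e-9):
    """Membership in X'' = {z1 >= -1, z2 >= -1, z1 + z2 >= -1}."""
    return z1 >= -1 - tol and z2 >= -1 - tol and z1 + z2 >= -1 - tol

random.seed(1)
for _ in range(1000):
    z1 = random.uniform(-1, 100)   # (z1, 0) is a point of S''_1
    z2 = random.uniform(-1, 100)   # (0, z2) is a point of S''_2
    lam = random.random()
    # every convex combination lam*(z1,0) + (1-lam)*(0,z2) lies in X''
    assert in_Xpp(lam * z1, (1 - lam) * z2)

# the inequality z1 + z2 >= -1 cuts off points allowed by the bounds alone:
print(in_Xpp(-0.6, -0.6))  # False
```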
Figure 4-1. Illustration of Theorem 4.1 with (a) J_1 ≠ ∅, J_2 ≠ ∅, (b) J_2 = ∅, (c) J_1 = J_2 = ∅. [Each panel plots z_2 against z_1 and shows S_1, S_2, and the corresponding convex hull.]
Example 4.1 shows different instances where conv(S) ⊊ proj_z X. In Example 4.2, we illustrate that, in some cases, proj_z X might be strictly contained in cl conv(S). Together, these examples show that proj_z X can be different from conv(S) and cl conv(S) and, in that sense, the result of Theorem 4.1 is as tight as possible.
Example 4.2. Consider the set S = ∪_{i=1}^n S_i, where

    S_i = proj_z { L(i, (z_i, u_i)) ∈ R^{2n}_+ | √(z_i u_i) ≥ 1 } = { L(i, z_i) | z_i > 0 }.

Clearly, Assumptions (A1) and (A2) hold by the definition of S. Since √(z_i u_i) is positively-homogeneous, Assumption (A3) is also satisfied. Here,

    proj_z C_i = proj_z { L(i, (z_i, u_i)) ∈ R^{2n}_+ | √(z_i u_i) ≥ 0 } = { L(i, z_i) | z_i ≥ 0 } ⊆ 0⁺(cl conv(S)).
Therefore, Assumption (A4) holds. Applying Theorem 4.1, we obtain that

    X = { (z, u) ∈ R^{2n}_+ | Σ_{i=1}^n √(z_i u_i) ≥ 1 }.

If, for any i, z_i > 0, then there exists u such that (z, u) ∈ X. Further, for all u, it is easy to see that (0, u) ∉ X. Therefore, proj_z X = { z ∈ R^n_+ | Σ_{i=1}^n z_i > 0 }. This example illustrates that if proj_z A_i is not closed then proj_z X may not be closed either and that, in some cases, proj_z X ⊊ cl conv(S).
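The non-closedness claim can be checked mechanically. In the sketch below (the helper name certify is illustrative), a point z with some z_i > 0 is certified to lie in proj_z X by choosing u_i = 1/z_i, while z = 0 admits no certificate:

```python
import math

def certify(z):
    """Return a u with (z, u) in X = {sum_i sqrt(z_i u_i) >= 1}, or None
    if z = 0: picking u_i = 1/z_i for any z_i > 0 gives sqrt(z_i u_i) = 1."""
    for i, zi in enumerate(z):
        if zi > 0:
            u = [0.0] * len(z)
            u[i] = 1.0 / zi
            return u
    return None

z = (0.25, 0.0)
u = certify(z)
print(u, sum(math.sqrt(a * b) for a, b in zip(z, u)) >= 1)  # [4.0, 0.0] True
print(certify((0.0, 0.0)))  # None: z = 0 is not in proj_z X
```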
In the above example, we exploit the fact that the sets proj_z A_i are not closed to show that proj_z X may not be closed either. Instead, if, for all i, the sets proj_z A_i are closed then, as shown in the following corollary, proj_z X is closed as long as recession directions are well-behaved.

Corollary 4.1. If, in addition to the assumptions of Theorem 4.1, proj_z A_i is a closed set and proj_z C_i = 0⁺(cl conv(S_i)) for all i ∈ N, then proj_z X = cl conv(S).
Proof. Since the sets S_i are orthogonal, there do not exist vectors ψ_i = L(i, z_i) ∈ proj_z C_i, not all zero, such that Σ_{i=1}^n ψ_i = 0. Define T_i(λ_i) = λ_i cl conv(S_i) for λ_i > 0 and T_i(0) = 0⁺(cl conv(S_i)). Then, by Theorem 9.8 in [102], it follows that the union, over all λ ≥ 0 with Σ_{i=1}^n λ_i = 1, of { z | z_i ∈ T_i(λ_i) }, denoted hereafter as T, equals cl conv(S). If z ∈ T, then there exists a λ such that z_i ∈ T_i(λ_i). If λ_i > 0, then z_i/λ_i ∈ cl conv(S_i), and therefore, there exists a u_i such that (z_i/λ_i, u_i/λ_i) ∈ A_i. On the other hand, if λ_i = 0, there exists a u_i such that (z_i, u_i) ∈ C_i. Since A_i and C_i (restricted to the space of the z_i and u_i variables) are R_i(1) and R_i(0), respectively, it follows that (λ, z, u) ∈ Q and so z ∈ proj_z X and cl conv(S) ⊆ proj_z X. However, we have already shown in Theorem 4.1 that proj_z X ⊆ cl conv(S) and, therefore, proj_z X equals cl conv(S).
In Corollary 4.1, orthogonality plays a key role in identifying the closed convex hull. In fact, orthogonality implies that adding the recession directions of cl conv(S_i) to cl conv(S_j) for j ≠ i does not yield sets that are not closed; see Corollary 9.1.1 in Rockafellar [102]. This fact is exploited in the proof of the corollary. In the absence of orthogonality or polyhedrality, closedness can only be established under various technical conditions; see Ceria and Soares [31].
The definition of X as in (4–2) provides a simple and complete description of cl conv(S) in many practical situations. However, in certain cases, some of the inequalities in (4–2) may be redundant. To illustrate this observation, we consider a situation where the sets A′_i = proj_{(z_i,u_i)} A_i are completely described by a finite number of linear inequalities. We then show that when Theorem 4.1 is used to derive inequalities for X using facet-defining inequalities for the sets A′_i, the resulting inequalities are not always facet-defining for X. More precisely, let z_i ∈ R^{d_i} and u_i ∈ R^{d′_i}. Assume that the A′_i are full-dimensional sets in R^{d_i+d′_i}. If, for each i and j_i (resp. k_i), the inequalities t_i^{j_i}(z_i, u_i) ≥ 1 (resp. v_i^{k_i}(z_i, u_i) ≥ −1) are facet-defining for A′_i, then Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1 (resp. Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −1 with I = N) is facet-defining for X. Similarly, if, for some i and l_i, w_i^{l_i}(z_i, u_i) ≥ 0 is facet-defining for A′_i, then it is also facet-defining for X. However, the inequalities Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −1 for I ⊊ N, t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0, and t_i^{j_i}(z_i, u_i) ≥ 0 are not necessarily facet-defining. For example, consider

    S_1 = { (x, 0, 0) | x ≥ 0, −(1/2)x ≥ −1 }

and

    S_2 = { (0, y, z) | y + z ≥ 1, −(1/2)y ≥ −1, −(1/2)z ≥ −1, y ≥ 0, z ≥ 0 }.

Then, the inequalities y + z − (1/2)y ≥ 0 and y + z ≥ 0 are not facet-defining since they are implied by y ≥ 0 and z ≥ 0. Similarly, the inequality −(1/2)y ≥ −1 is not facet-defining since it is implied by −(1/2)x − (1/2)y ≥ −1 and x ≥ 0.
We now discuss each of the assumptions of Theorem 4.1. We first turn our attention to Assumption (A1). This assumption requires that the sets S_i belong to linear subspaces that are orthogonal to each other. A weaker assumption however suffices to prove the theorem. Consider L_i, for i ∈ {1, . . . , n}, to be linear subspaces of R^{Σ_{i=1}^n d_i}, where L_i has dimension d_i. Further, assume that a vector z_i ∈ L_i cannot be expressed as a linear combination of vectors in {L_1, . . . , L_{i−1}, L_{i+1}, . . . , L_n}. In this case, it is possible to construct a matrix B whose columns form a basis for R^{Σ_{i=1}^n d_i}, where the columns indexed from 1 + Σ_{i=1}^{j−1} d_i to Σ_{i=1}^j d_i form a basis for L_j. Then, define new variables s such that s = B^{−1}z. If z ∈ S_j ⊆ L_j, it follows that s_k ≠ 0 only if 1 + Σ_{i=1}^{j−1} d_i ≤ k ≤ Σ_{i=1}^j d_i. Therefore, Theorem 4.1 now applies in the transformed space of s variables. This observation leads to the following simple derivation of the split cut in mixed-integer programming.
Example 4.3. Consider a polyhedral cone P = {x | Ax ≤ b}, where A ∈ R^{n×n} is an invertible matrix. Let X be the set of points that satisfy the disjunction

    π^T x ≤ π_0^1  ∨  π^T x ≥ π_0^2,

where π_0^1 < π_0^2. We are interested in deriving the convex hull of P ∩ X. Observe that this setting can be used to derive all split cuts; see Balas [12]. Introducing the slack variables μ and defining γ = π^T A^{−1}, γ_0^1 = γb − π_0^2, and γ_0^2 = γb − π_0^1, we reduce the above problem into one involving the convexification of

    M = { μ | μ ≥ 0, γμ ≤ γ_0^1 ∨ γμ ≥ γ_0^2 }.

We assume without loss of generality that, for each i, γ_i ≠ 0. The reformulation of the problem in the space of the slack variables, after suitable translation, is an example of the orthogonalization discussed above. Here, μ corresponds to −s and x corresponds to z. The matrix B equals A^{−1} and its columns are the extreme rays of P. If μ = 0 is feasible to M, then conv(M) = {μ | μ ≥ 0} since μ ≥ 0 is the recession cone for M whenever M contains a feasible point. Instead, if μ = 0 is not feasible to M, then γ_0^1 < 0 and γ_0^2 > 0. Define p_i = γ_0^1/γ_i and q_i = γ_0^2/γ_i. It follows that, for each i, exactly one of p_i or q_i is greater than 0. Since μ_i ≥ 0 is a recession direction for conv(M) and the extreme points of M have at most one non-zero component, it follows that:

    conv(M) = conv( ∪_{i=1}^n { L(i, μ_i) | μ_i ≥ max{p_i, q_i} } ).

Now, applying Theorem 4.1, it follows that:

    conv(M) = { μ | Σ_{i=1}^n μ_i / max{p_i, q_i} ≥ 1, μ ≥ 0 }.

Substituting back μ, p_i, and q_i in the above, we obtain:

    conv(M) = { x | Σ_{i=1}^n (b − Ax)_i / max{ (π^T A^{−1} b − π_0^2)/(π^T A^{−1}_{·i}), (π^T A^{−1} b − π_0^1)/(π^T A^{−1}_{·i}) } ≥ 1, Ax ≤ b }.
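The closed-form split cut above can be evaluated numerically. The sketch below builds an assumed 2×2 instance (A the identity, b = 0, split x_1 + x_2 ≤ −1 ∨ x_1 + x_2 ≥ 1, chosen so that μ = 0 is infeasible to M) and checks the resulting cut Σ_i μ_i / max{p_i, q_i} ≥ 1; the instance data and the helper name split_cut_satisfied are illustrative choices.

```python
# Assumed 2x2 instance: A = I, b = 0, so P = {x <= 0} is a cone with apex 0,
# mu = b - Ax = -x, and gamma = pi^T A^{-1} = pi.  The split pi^T x <= pi01
# or pi^T x >= pi02 uses pi01 < 0 < pi02 so that mu = 0 is infeasible to M.
pi = [1.0, 1.0]
pi01, pi02 = -1.0, 1.0             # split: x1 + x2 <= -1  or  x1 + x2 >= 1

gamma = pi[:]                      # since A is the identity
gb = 0.0                           # gamma . b with b = 0
g01, g02 = gb - pi02, gb - pi01    # g01 = -1 < 0 < 1 = g02
coef = [max(g01 / gi, g02 / gi) for gi in gamma]   # max{p_i, q_i}, here (1, 1)

def split_cut_satisfied(x, tol=1e-9):
    """Check Ax <= b and the split cut sum_i mu_i / max{p_i, q_i} >= 1."""
    mu = [-xi for xi in x]         # slacks mu = b - Ax
    return (all(m >= -tol for m in mu)
            and sum(m / c for m, c in zip(mu, coef)) >= 1 - tol)

print(split_cut_satisfied([-1.0, 0.0]))     # True: lies on pi^T x = -1
print(split_cut_satisfied([-0.25, -0.25]))  # False: cut off by the split cut
```

Here the cut reduces to −x_1 − x_2 ≥ 1, i.e., x_1 + x_2 ≤ −1, which is exactly the convex hull of the cone intersected with the disjunction in this instance.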
We next discuss Assumption (A3). This assumption requires that the convex hulls of the sets S_i be known, possibly in a higher-dimensional space, and that the functions t_i^{j_i}, for all j_i ∈ J_i, v_i^{k_i}, for all k_i ∈ K_i, and w_i^{l_i}, for all l_i ∈ L_i, used in the description of the convex hulls be positively-homogeneous. In the ensuing example, we show that a simple transformation might suffice to convert the natural inequality description of conv(S_i) into one that uses positively-homogeneous functions. We also demonstrate that if the defining functions are not positively-homogeneous, then (4–2) does not necessarily contain conv(S).

Example 4.4. Let S = ∪_{i=1}^n S_i, where S_i = { L(i, x_i, y_i) ∈ R^{2n}_+ | x_i y_i ≥ r } and r > 0. Clearly, Assumptions (A1) and (A2) hold by the definition of S. Since S_i is already closed and convex, cl conv(S_i) = S_i, i.e.,

    cl conv(S_i) = { L(i, x_i, y_i) ∈ R^{2n}_+ | (1/r) x_i y_i ≥ 1 }.

The above representation of cl conv(S_i) does not directly satisfy Assumption (A3) since (1/r) x_i y_i is not a positively-homogeneous function of (x_i, y_i). However, cl conv(S_i) may be rewritten as

    cl conv(S_i) = { L(i, x_i, y_i) ∈ R^{2n}_+ | √((1/r) x_i y_i) ≥ 1 },

an expression that uses the function √((1/r) x_i y_i), which is positively-homogeneous in (x_i, y_i). With this representation, Assumption (A3) is satisfied. Since

    C_i = { L(i, x_i, y_i) ∈ R^{2n}_+ | √(x_i y_i) ≥ 0 } = 0⁺(cl conv(S_i)),

Assumption (A4) is satisfied. Therefore, Theorem 4.1 implies that

    X = cl conv(S) = { (x, y) ∈ R^{2n}_+ | Σ_{i=1}^n √(x_i y_i) ≥ √r }.

Observe finally that the transformation to positively-homogeneous functions is necessary and not an artifact of the proof technique. In fact, if we use the original definition of cl conv(S_i) when applying Theorem 4.1 and disregard the lack of positive homogeneity, the resulting set would be X′ = { (x, y) ∈ R^{2n}_+ | Σ_{i=1}^n x_i y_i ≥ r }. The set X′ is non-convex and does not even contain conv(S). To see this, let r = 1 and n = 2. Note that (x_1, y_1, x_2, y_2) = (0.5, 0.5, 0.5, 0.5) is expressible as a convex combination with equal weights of (1, 1, 0, 0) ∈ S_1 and (0, 0, 1, 1) ∈ S_2. Therefore, (0.5, 0.5, 0.5, 0.5) belongs to conv(S). However, it does not satisfy the defining inequality of X′ whereas it does satisfy the defining inequality of X.
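The final claim of Example 4.4 is easy to verify numerically; the sketch below simply evaluates the two defining inequalities at the midpoint for r = 1 and n = 2 (the variable names are illustrative):

```python
import math

# Midpoint of (1,1,0,0) in S1 and (0,0,1,1) in S2, with r = 1 and n = 2
x1, y1, x2, y2 = 0.5, 0.5, 0.5, 0.5
in_X = math.sqrt(x1 * y1) + math.sqrt(x2 * y2) >= 1   # homogeneous description
in_Xprime = x1 * y1 + x2 * y2 >= 1                    # non-homogeneous description
print(in_X, in_Xprime)  # True False
```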
If λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≤ t_i^{j_i}(z_i, u_i) for all λ_i ∈ (0, 1], then X still outer-approximates cl conv(S). An intuitive explanation for this result is that, when performing Fourier-Motzkin elimination, λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≤ t_i^{j_i}(z_i, u_i) ensures that X contains the closure convex hull of the disjunctive union of the S_i, whereas λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ t_i^{j_i}(z_i, u_i) guarantees that X is contained in cl conv(∪_{i=1}^n S_i). Similar statements can be made about v_i^{k_i}(z_i, u_i) and w_i^{l_i}(z_i, u_i). The latter of these conditions will be explored further in Proposition 4.8 to derive sufficient conditions that help verify a relaxed version of Assumption (A2).
We now turn our attention to Assumption (A4). At first glance, this assumption might appear technical and difficult to verify in practice. However, this is not the case. We show next that by simply requiring that the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave, in addition to being positively-homogeneous, Assumption (A4) is automatically satisfied.

Proposition 4.1. If, for all i, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i}, as defined in Theorem 4.1, are concave in addition to being positively-homogeneous, and the sets S_i are not empty, then proj_z C_i ⊆ 0⁺(cl conv(∪_{i=1}^n S_i)), i.e., Assumption (A4) is satisfied.
Proof. Let L(i, z_i) ∈ S_i. By Assumption (A3), there exists u_i such that L(i, (z_i, u_i)) ∈ A_i. Consider L(i, (z′_i, u′_i)) ∈ C_i and α > 0. Then, by positive homogeneity and concavity of t_i^{j_i}, it follows that

    t_i^{j_i}(z_i + αz′_i, u_i + αu′_i) ≥ t_i^{j_i}(z_i, u_i) + t_i^{j_i}(αz′_i, αu′_i) = t_i^{j_i}(z_i, u_i) + α t_i^{j_i}(z′_i, u′_i) ≥ t_i^{j_i}(z_i, u_i) ≥ 1.

The first inequality holds because of Theorem 4.7 in [102], the equality because the t_i^{j_i} are positively-homogeneous, the second inequality because L(i, (z′_i, u′_i)) ∈ C_i and α > 0, and the last inequality because L(i, (z_i, u_i)) ∈ A_i. Similarly, v_i^{k_i}(z_i + αz′_i, u_i + αu′_i) ≥ −1 and w_i^{l_i}(z_i + αz′_i, u_i + αu′_i) ≥ 0. Therefore, (z_i + αz′_i, u_i + αu′_i) ∈ A_i and so, for all α > 0, L(i, z_i + αz′_i) ∈ cl conv(S_i) ⊆ cl conv(∪_{i=1}^n S_i). Since L(i, z_i) ∈ cl conv(∪_{i=1}^n S_i), it follows by Theorem 8.3 in [102] that (0, z′_i, 0) ∈ 0⁺(cl conv(∪_{i=1}^n S_i)).
The assumption that S_i is not empty plays an important role in Proposition 4.1. Consider for example

    S_1 = { (z_1, z_2, 0) | z_1 − z_2 ≥ 1, −z_1 + z_2 ≥ 1, z_1 ≥ 0, z_2 ≥ 0 }

and

    S_2 = { (0, 0, z_3) | z_3 ≥ 1 }.

Then, S_1 is empty but C_1 = { (z_1, z_2, 0) | z_1 = z_2, z_1 ≥ 0, z_2 ≥ 0 } ≠ ∅ = 0⁺(cl conv(S_1)). Clearly, in this case, Theorem 4.1 does not apply since Assumption (A4) does not hold. Here,

    X = { (z_1, z_2, z_3) | z_1 − z_2 + z_3 ≥ 1, −z_1 + z_2 + z_3 ≥ 1, z_1 = z_2, z_1 ≥ 0, z_2 ≥ 0, z_3 ≥ 0 },

which contains the ray (a, a, 1), where a > 0, whereas (a, a, 1) ∉ S_2 = cl conv(S_1 ∪ S_2).
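A direct check (a sketch; the helper names are illustrative) that every point of the ray (a, a, 1), a > 0, satisfies all constraints defining X while lying outside S_2 = cl conv(S_1 ∪ S_2):

```python
def in_X(z1, z2, z3):
    """Constraints of X from the empty-S1 example above."""
    return (z1 - z2 + z3 >= 1 and -z1 + z2 + z3 >= 1
            and z1 == z2 and z1 >= 0 and z2 >= 0 and z3 >= 0)

def in_S2(z1, z2, z3):
    return z1 == 0 and z2 == 0 and z3 >= 1

for a in (1.0, 10.0, 100.0):
    print(in_X(a, a, 1.0), in_S2(a, a, 1.0))  # True False for every a > 0
```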
We now show that concavity of t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} is not a severe restriction since the convexity of a positively-homogeneous function's upper-level set implies concavity over the region of interest.

Proposition 4.2. If the upper-level set of a positively-homogeneous function is convex, then the function is concave wherever it is positive. More precisely, if W = { (z, u) | t(z, u) ≥ 1 } is convex and t(z, u) is positively-homogeneous, then D = { (z, u) | t(z, u) > 0 } is convex and t(z, u) is concave over D. If, in addition, cl(D) is locally simplicial or, more specifically, polyhedral, and t(z, u) is continuous, then t(z, u) is concave over cl(D).

Proof. If W is convex, then

    W_K = { (λ, x) | λ > 0, x = λ(z, u), t(z, u) ≥ 1 }

is the smallest convex cone containing { (1, x) | x ∈ W }. Exploiting the positive homogeneity of t, we may rewrite W_K as:

    W_K = { (λ, x) | λ > 0, t(x) ≥ λ }.

Now, D is the projection of W_K in the space of x and is therefore convex. Further, the hypograph of t(z, u) over D is { (r, x) | r ≤ t(x), x ∈ D } = { (r, x) | r ≤ λ ≤ t(x), λ > 0 }, which is convex if W_K is convex. The last statement of the proposition follows from Theorems 10.3 and 20.5 in [102].
Even when some of the technical assumptions of Theorem 4.1 are not satisfied, it is often the case that X yields an outer-approximation of conv(S). To see this, observe that Proposition 4.2 shows that the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave if they are positively-homogeneous, as is assumed in Theorem 4.1, and their upper-level sets are convex. However, if concavity of these functions is known, then the outer-approximation of conv(S) by proj_z X can be shown under relatively mild assumptions.

Proposition 4.3. Let S ⊆ R^{Σ_{i=1}^n d_i} and, for all i ∈ N, let S_i ⊆ S. Also, suppose that Assumption (A1) of Theorem 4.1 holds. Further, assume that proj_z A_i, where A_i is as defined in (4–1), yields an outer-approximation of conv(S_i) and that, for all i ∈ N, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, t_i^{j_i}(0, 0), v_i^{k_i}(0, 0), and w_i^{l_i}(0, 0) are non-negative. Then, proj_z X, where X is as defined in (4–2), outer-approximates ∪_{i=1}^n S_i. If, in addition, Assumption (A2) of Theorem 4.1 holds and X is convex (for example, if the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave), then proj_z X ⊇ conv(S).

Proof. If Assumption (A1) is satisfied, then the sets S_i, for i ∈ N, are orthogonal. It can be easily verified that, if t_i^{j_i}(0, 0), v_i^{k_i}(0, 0), and w_i^{l_i}(0, 0) are non-negative, then every constraint defining X is valid for each S_i, where i ∈ N. Therefore, proj_z X ⊇ ∪_{i=1}^n S_i. If (A2) is satisfied, conv(S) = conv(∪_{i=1}^n S_i). Further, if X is convex, so is proj_z X. Since proj_z X ⊇ ∪_{i=1}^n S_i, it follows that proj_z X ⊇ conv(∪_{i=1}^n S_i) = conv(S).
When the constituent functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave, the result of Proposition 4.3 could also be derived using disjunctive programming. We verify Proposition 4.3 using this approach, since it more clearly reveals the source of the difference between the outer-approximation of Proposition 4.3 and the convex hull identified in Theorem 4.1. For example, one can assert that Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1 by simply noticing that if λ_i > 0 for i ∈ {1, . . . , t} then:

    1 = Σ_{i=1}^t λ_i
      ≤ Σ_{i=1}^t λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) + Σ_{i=t+1}^n t_i^{j_i}(z_i, u_i)
      ≤ Σ_{i=1}^t λ_i ( t_i^{j_i}(z_i/λ_i, u_i/λ_i) + Σ_{i′∈N, i′≠i} t_{i′}^{j_{i′}}(0/λ_i, 0/λ_i) ) + Σ_{i=t+1}^n t_i^{j_i}(z_i, u_i)
      ≤ Σ_{i=1}^n t_i^{j_i}(z_i, u_i),    (4–11)

where the first inequality follows by summing the inequalities λ_i ≤ λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) for i ∈ {1, . . . , t} and t_i^{j_i}(z_i, u_i) ≥ 0 for i ∈ {t + 1, . . . , n}, the second inequality follows since t_{i′}^{j_{i′}}(0, 0) ≥ 0, and the third inequality follows from the concavity of Σ_{i=1}^t t_i^{j_i}(z_i, u_i). Similarly, it can be shown that Σ_{i=1}^t v_i^{k_i}(z_i, u_i) ≥ −1 since −Σ_{i=1}^t λ_i ≥ −1.
Proposition 4.3 provides a simple proof of the validity of the constraints defining X
for conv(S). The proof of Proposition 4.3 is similar to that of Theorem 3.1 and Remark
3.1.1 in Balas [15], although it is applied here to nonlinear inequalities. The main idea in
either case is that one can establish validity of a cut by establishing its validity for each
of the disjunctions. In fact, if the primary purpose of deriving X is to develop a convex
outer-approximation, then Proposition 4.3 can often replace Theorem 4.1. For example,
the convex hull description for the bilinear covering sets (derived in Proposition 4.9) can
be shown to yield a convex outer-approximation, if Proposition 4.3 is invoked instead of
Theorem 4.1 in the proof of the result. Nevertheless, the insights gained from Theorem 4.1
are very useful. For example, we illustrate next that the search for a representation of
conv(Si) using positively-homogeneous functions can substantially improve the relaxation.
This insight will play an important role in deriving strong relaxations for the bilinear
covering set.
Example 4.5. Consider S =⋃n
i=1 Si, where, for each i ∈ {1, . . . , n}, let Si ={L(i, zi) ∈ Rn
+
∣∣ √zi ≥ 1}. Proposition 4.3 shows that
X ′ =
{(z1, . . . , zn) ∈ Rn
+
∣∣∣∣∣n∑
i=1
√zi ≥ 1
}
is a convex outer-approximation of conv(S). Note that the square-root function used in
expressing Si is concave, but not positively-homogeneous. Instead, if Sis are represented
equivalently as
Si ={L(i, zi) ∈ Rn
+
∣∣ zi ≥ 1},
79
then Theorem 4.1 yields the convex hull of S, which is
X =
{(z1, . . . , zn) ∈ Rn
+
∣∣∣∣∣n∑
i=1
zi ≥ 1
}.
Clearly, by construction, X = conv(S) ⊆ X ′. Further, it can be seen that X ( X ′ when
n > 1 as the point(
1n2 , . . . ,
1n2
)belongs to X ′ but not to X. In this particular example,
the inclusion of X in X ′ can also be verified using the subadditivity of the square-root
function for non-negative variables. This example illustrates that it often helps to find
representations of convex hulls of Si using positively-homogeneous functions, even when
equivalent representations exist using concave functions.
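The gap between X and X′ can be spot-checked numerically. The following sketch (with n = 3 as an arbitrary choice, and a small tolerance for floating-point comparisons) confirms that the point (1/n², . . . , 1/n²) satisfies the square-root cut but violates the convex-hull cut:

```python
import math

def in_X_prime(z):
    # Outer-approximation from the concave square-root representation.
    return sum(math.sqrt(zi) for zi in z) >= 1 - 1e-9

def in_X(z):
    # Convex hull from the positively-homogeneous representation.
    return sum(z) >= 1 - 1e-9

n = 3
p = [1.0 / n**2] * n           # the point (1/n^2, ..., 1/n^2)
print(in_X_prime(p), in_X(p))  # True False
```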
As discussed in Example 4.5, if one can find a description of conv(Si) that uses
positively-homogeneous functions then one can apply Theorem 4.1 to identify the convex
hull of the orthogonal disjunctions, thus deriving a superior relaxation. Although natural
formulations of convex hulls for the orthogonal disjunctions might not use positively
homogeneous functions, the associated functions can often be transformed to satisfy this
property. Consider, for example, the case where an inequality describing the convex hull of
a restriction uses a function that is positively-homogeneous of pth order, i.e., an inequality
of the form t_i^{j_i}(z_i, u_i) ≥ 1 where, for any λ_i > 0, t_i^{j_i}(λ_i z_i, λ_i u_i) = λ_i^p t_i^{j_i}(z_i, u_i). Such an inequality can be rewritten as sign(t_i^{j_i}(z_i, u_i)) |t_i^{j_i}(z_i, u_i)|^{1/p} ≥ 1, where sign(x) is 1 if x > 0,
is 0 if x = 0, and is −1 otherwise. Note that, in the modified form, the inequality only
uses a positively-homogeneous function and can therefore be used in the construction of
the convex hull.
More generally, a positively-homogeneous description can be obtained by adding one homogenizing variable for each orthogonal disjunction and expressing A_i using the inequalities

λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ 1, ∀j_i ∈ J_i,
λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ −1, ∀k_i ∈ K_i,
λ_i w_i^{l_i}(z_i/λ_i, u_i/λ_i) ≥ 1, ∀l_i ∈ L_i
along with the inequalities, λi ≥ 1 and −λi ≥ −1. However, this process suffers from the
drawback that it introduces new variables in the relaxation. Instead, it is possible to find a
separating inequality without increasing the problem dimension and, thereby, circumvent
the need to introduce new variables. First, we show that a separating inequality of X
can be found easily. Consider, for simplicity, the case of Theorem 4.1 where Ai is not an
extended formulation, i.e., it does not use the additional ui variables. The case where
Ai contains ui variables can be handled similarly. Now, consider a point z′ that does not
belong to cl conv(S). If it is possible to find, for all i, a function t_i(z_i) such that, for all z_i, t_i(z_i) ≥ inf{t_i^j(z_i) | j ∈ J_i} but t_i(z′_i) = inf{t_i^j(z′_i) | j ∈ J_i}, a function v_i(z_i) such that, for all z_i, v_i(z_i) ≥ inf{v_i^k(z_i) | k ∈ K_i} but v_i(z′_i) = inf{v_i^k(z′_i) | k ∈ K_i}, and a function w_i(z_i) such that, for all z_i, w_i(z_i) ≥ inf{w_i^l(z_i) | l ∈ L_i} but w_i(z′_i) = inf{w_i^l(z′_i) | l ∈ L_i}, then using the closed-form expression of X in (4–2), one can identify an inequality that separates z′ from
X. Observe that, when J_i, K_i and L_i are finitely sized, the functions t_i, v_i and w_i can be found by choosing an index j′_i ∈ J_i such that t_i(z_i) = t_i^{j′_i}(z_i), an index k′_i ∈ K_i such that v_i(z_i) = v_i^{k′_i}(z_i), and an index l′_i ∈ L_i such that w_i(z_i) = w_i^{l′_i}(z_i). Then, if an inequality of the form ∑_{i∈N} t_i^{j_i}(z_i) ≥ 1 is violated by z′, i.e., ∑_{i∈N} t_i^{j_i}(z′_i) < 1, then ∑_{i∈N} t_i(z′_i) < 1 as well, since, by the definition of t_i, t_i(z′_i) ≤ t_i^{j_i}(z′_i) for all i. The inequality ∑_{i∈N} t_i(z_i) ≥ 1 is valid since t_i(z_i) ≥ 1 is valid for each S_i. This is because if t_i(z_i) < 1 for some z_i then there exists a j_i ∈ J_i such that t_i^{j_i}(z_i) < 1. Now, observe that as long as a representation of each
Si uses positively-homogeneous functions (even if this representation requires infinitely
many inequalities), then the separation procedure described above can be used to develop
cl conv(⋃_{i=1}^{n} S_i).
Consider Example 4.5 for a concrete demonstration of these ideas and, in particular, the point (1/n², . . . , 1/n²). If n > 1, this point does not belong to the convex hull, which as shown in Example 4.5 is given by ∑_{i=1}^{n} z_i ≥ 1. Assume that we are not aware that each orthogonal
disjunction can be defined equivalently using zi ≥ 1, a representation that allows us to
easily identify the convex hull of the set. In the absence of such knowledge, we construct
the inequality ∑_{i=1}^{n} √z_i ≥ 1. Now, consider the linearization f(z_i) = 1/n + (n/2)(z_i − 1/n²) ≥ 1 of √z_i ≥ 1 at z_i = 1/n². Since f(z_i) ≥ 1 is a relaxation of √z_i ≥ 1, we can use this inequality to construct the outer-approximation using Theorem 4.1. The primary difference here is that f(z_i) ≥ 1, being linear, can be easily rewritten as (n/2)z_i ≥ 1 − 1/(2n), where the left-hand side is a positively-homogeneous function. Then, Theorem 4.1 constructs the inequality ∑_{i=1}^{n} (n/2)z_i ≥ 1 − 1/(2n). If n > 1, even though ∑_{i=1}^{n} √z_i ≥ 1 does not chop off (1/n², . . . , 1/n²),
this point is cut off by the linearized inequality. Therefore, the first step of relaxation
helps tighten the inequality by exploiting positive homogeneity in the convexification step.
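The aggregated tangent cut ∑_i (n/2) z_i ≥ 1 − 1/(2n) can be checked numerically at the point (1/n², . . . , 1/n²); the sketch below uses n = 4 as an arbitrary choice:

```python
n = 4
p = [1.0 / n**2] * n
# Aggregated tangent cut from Theorem 4.1: sum_i (n/2) z_i >= 1 - 1/(2n).
lhs = sum((n / 2) * zi for zi in p)
rhs = 1 - 1 / (2 * n)
print(lhs, rhs, lhs < rhs)  # 0.5 0.875 True
```

The left-hand side equals 1/2 for every n, while the right-hand side exceeds 1/2 whenever n > 1, so the point is separated.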
Observe that, in the preceding discussion, we did not find the separating inequality by minimizing t_i^j(z_i) among all linearizations t_i^j(z_i) ≥ 1 of √z_i ≥ 1. We now carry out this procedure. Writing z̄_i for the linearization point, we can rewrite {z_i | √z_i ≥ 1} as

{ z_i | (1/(2√z̄_i − z̄_i)) z_i ≥ 1, ∀z̄_i ∈ (0, 4),
        (1/(z̄_i − 2√z̄_i)) z_i ≥ −1, ∀z̄_i ∈ (4, +∞),
        z_i ≥ 0 },

where (1/(2√z̄_i − z̄_i)) z_i ≥ 1 is the linearization of √z_i ≥ 1 at z̄_i for z̄_i ∈ (0, 4) and (1/(z̄_i − 2√z̄_i)) z_i ≥ −1 is the linearization at z̄_i for z̄_i ∈ (4, ∞). Then, the tightest inequality is found by noticing that 1/(2√z̄_i − z̄_i) is minimized at z̄_i = 1 for z̄_i ∈ (0, 4) and that 1/(z̄_i − 2√z̄_i) ↓ 0 (is non-negative and approaches 0) as z̄_i approaches +∞. Therefore, we can set t_i(z_i) = z_i and v_i(z_i) = 0. Then, we apply Theorem 2.2 to recover the convex hull of S, i.e.,

{ z | ∑_{i=1}^{n} z_i ≥ 1, z_i ≥ 0 ∀i }.
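The claim that the tangent coefficient is minimized at the linearization point z̄ = 1 can be verified with a crude grid search (a sketch; the grid spacing is an arbitrary choice):

```python
import math

def coeff(zb):
    # Coefficient of z in the tangent cut z / (2*sqrt(zb) - zb) >= 1, for zb in (0, 4).
    return 1.0 / (2 * math.sqrt(zb) - zb)

grid = [0.05 * k for k in range(1, 80)]  # sample linearization points in (0, 4)
best = min(grid, key=coeff)
print(round(best, 2), round(coeff(best), 6))  # minimizer near zb = 1, coefficient near 1
```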
Now, we discuss another technique that can be used to find representations of the convex hull of each S_i that use positively-homogeneous functions but do not require additional variables. The main idea is that one can homogenize the inequality using an
extra variable and then maximize the resulting function over the introduced variable to
derive a positively-homogeneous function describing the set. We illustrate this idea by
deriving a positively-homogeneous function that describes the following bilinear covering
set:
Q = {(x, y) ∈ R²_+ | axy + bx + cy ≥ r}, (4–12)

where a, b, and c are assumed to be non-negative. We assume without loss of generality that r > 0. Otherwise, Q = R²_+. We may also assume without loss of generality that c ≥ b and, consequently, assume that at least one of a and c is strictly positive. Then, for any feasible (x, y), it follows that ax + c > 0. Therefore, Q = {(x, y) ∈ R²_+ | y ≥ (r − bx)/(ax + c)}. First, we verify that the inequality is convex. Let f(x) = (r − bx)/(ax + c). Since

∂²f/∂x² = 2a(bc + ar)/(ax + c)³
is nonnegative if x ≥ 0, Q is expressed as the intersection of the epigraph of a convex
function with the non-negative orthant. Therefore, Q is convex. Also, note that the
defining inequality of Q is not positively-homogeneous. We show how the above inequality
can be homogenized without introducing new variables in the formulation. To carry out
this transformation, we first homogenize the defining inequality, axy+bx+cy ≥ r, using an
additional variable h, that is restricted to be positive. This is accomplished by rewriting
the defining inequality of Q as axy/h + bx + cy ≥ rh. Since h is positive, we can multiply throughout by h, and express the above inequality as axy + bxh + cyh ≥ rh². Consider
Q′ = {(x, y, 1) | (x, y) ∈ Q}. The above positively-homogeneous inequality defines the
smallest closed convex cone that contains Q′ if h is restricted to be non-negative. Further,
if (x, y, h) satisfies the above inequality for some h ≥ 0, then for any h′ ∈ [0, h], (x, y, h′)
satisfies it as well. Therefore, Q can be described by the projection of the following set

{(x, y, h) | axy + bxh + cyh ≥ rh² and h ≥ 1}

in the space of the (x, y) variables. In order for (x, y, h) to satisfy the first inequality above, h must be such that:

(bx + cy − √((bx + cy)² + 4arxy))/(2r) ≤ h ≤ (bx + cy + √((bx + cy)² + 4arxy))/(2r).
It can be easily verified that the functions bounding h are positively-homogeneous. In fact,
since the bounding functions on h are obtained from a positively-homogeneous constraint,
these functions must be positively-homogeneous. This follows because for each (x, y, h)
that satisfies a positively-homogeneous constraint and an arbitrary λ > 0, it must be that
(λx, λy, λh) satisfies the constraint as well. The lower bounding function is nonpositive.
Therefore, the set Q can be rewritten as:

η(x, y) = (1/2)(bx + cy + √((bx + cy)² + 4arxy)) ≥ r. (4–13)
We have thus expressed Q as the upper-level set of a positively-homogeneous function
without introducing new variables. In fact, since Proposition 4.2 asserts that a positively-homogeneous function whose upper-level sets are convex is concave, it follows from the
convexity of Q that η(x, y) must be concave over the non-negative quadrant. In other
words, we have established the following result.
Proposition 4.4. Let Q = {(x, y) ∈ R²_+ | axy + bx + cy ≥ r}, where a, b, c are
non-negative, and r is strictly positive. Then, Q has a convex description (upper level set
of a concave function) that uses positively-homogeneous functions. In particular,
Q = {(x, y) ∈ R²_+ | η(x, y) ≥ r},
where η(x, y) is as defined in (4–13).
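The two properties of η used here, degree-one positive homogeneity and agreement with the original covering inequality on the non-negative quadrant, can be spot-checked numerically; the values a = 1, b = 2, c = 3, r = 6 below are arbitrary illustrative choices:

```python
import math

def eta(x, y, a, b, c, r):
    # Positively-homogeneous description of Q = {(x, y) >= 0 : a*x*y + b*x + c*y >= r}.
    s = b * x + c * y
    return 0.5 * (s + math.sqrt(s * s + 4 * a * r * x * y))

a, b, c, r = 1.0, 2.0, 3.0, 6.0
x, y, lam = 1.5, 0.7, 3.0
# Degree-1 homogeneity: eta(lam*x, lam*y) = lam * eta(x, y) for lam > 0.
assert abs(eta(lam * x, lam * y, a, b, c, r) - lam * eta(x, y, a, b, c, r)) < 1e-9
# Membership agreement with the defining inequality of Q.
for px, py in [(0.5, 0.5), (2.0, 1.0), (0.1, 0.2), (0.0, 2.0)]:
    assert (a*px*py + b*px + c*py >= r) == (eta(px, py, a, b, c, r) >= r)
print("checks passed")
```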
4.3 Convex Extension Property
In this section, we study in more detail the convex extension property which forms
the basis for Assumption (A2) in Theorem 4.1. The convex extension property of interest
here is that the convex hull of S is determined by its restriction to certain orthogonal
spaces. This property clearly holds when S is defined as the union of orthogonal sets, Si,
for i ∈ {1, . . . , n}. But, it can often be established for sets that are not already defined
as such. In this section, we explore the occurrence of the convex extensions property in
these more surprising situations. For example, we show that the convex extension property
holds for mixed, pure, and continuous bilinear sets. In fact, we derive a fairly general set
of conditions that are sufficient to establish the convex extensions property. We show in
[117] that these conditions apply to large classes of polynomial covering sets and can be
used to better exploit variable bounds while constructing relaxations. We first formally
define the notion of a convex extension for orthogonal disjunctive sets. This definition is
adapted from Tawarmalani and Sahinidis [120].
Definition 4.2. Let Si ⊆ S for i ∈ N = {1, . . . , n}. We say that S has the convex
extension property for disjunctive sets Si if Assumption (A1) holds and if every point
z in S can be expressed as a convex combination of points χi in cl conv(Si) and a conic
combination of rays ψi in 0+(cl conv(Si)), i.e., for i ∈ I ⊆ N , there exist λi ≥ 0 and
µ_i ≥ 0, that satisfy ∑_{i∈I} λ_i = 1, such that

z = ∑_{i∈I} λ_i χ_i + ∑_{i∈I} µ_i ψ_i. (4–14)
The convex extension property in Definition 4.2 is more general than Assumption
(A2) in Theorem 4.1, in that it allows the use of non-negative multiples of recession
directions in the expression of z. Since χ_i + (µ_i/λ_i)ψ_i ∈ cl conv(S_i), it may seem that
the recession directions in (4–14) are not necessary. However, this is not true since λi
may be zero even when µi is not. This technicality is often important in applying our
result. Fortunately, it can be observed that even if Assumption (A2) is replaced with
(4–14), Theorem 4.1 holds with only slight modifications, as discussed below. Instead of conv(S) = conv(⋃_{i=1}^{n} S_i), we can only establish that (4–14) implies

cl conv(S) = cl conv(⋃_{i=1}^{n} S_i). (4–15)
In fact, (4–15) is equivalent to (4–14). On the one hand, since, for each i ∈ {1, . . . , n}, S_i ⊆ S, it follows that cl conv(⋃_{i=1}^{n} S_i) ⊆ cl conv(S). On the other hand, since the sets S_i are orthogonal, by Theorem 9.8 in [102],

cl conv(⋃_{i=1}^{n} S_i) = ⋃ {λ_1 cl conv(S_1) + · · · + λ_n cl conv(S_n) | λ_i ≥ 0+, ∑_{i=1}^{n} λ_i = 1}, (4–16)

where the notation λ_i ≥ 0+ means that λ_i cl conv(S_i) is taken to be 0+(cl conv(S_i)) rather than {0} when λ_i = 0. Observe that (4–14) is another way to represent the set on the right-hand-side of (4–16) since if λ_i > 0 then χ_i + (µ_i/λ_i)ψ_i ∈ cl conv(S_i).
Otherwise, ψi ∈ 0+ (cl conv(Si)). Now, if we assume (4–14), or equivalently, (4–15),
the proof of Theorem 4.1 shows that cl proj_z X = cl conv(⋃_{i=1}^{n} S_i), and, therefore, by (4–15), cl proj_z X = cl conv(S). In this case, Corollary 4.1 can often be used to establish closedness of proj_z X. Note that proj_z A_i is closed whenever conv(S_i) is closed. Therefore, if conv(S_i) is closed and proj_z C_i = 0+(cl conv S_i), it follows that proj_z X = cl conv(S). Since most practical situations only require the derivation of
cl conv(S), it suffices to establish (4–14) instead of Assumption (A2) in Theorem 4.1.
Similarly, if Assumption (A2) is replaced with (4–14) in Proposition 4.3, it can be easily
established that cl conv(S) ⊆ cl proj_z X. This is because cl conv(S) = cl conv(⋃_{i=1}^{n} S_i) ⊆ cl conv(proj_z X) = cl proj_z X, where the first equality follows from the equivalence of (4–14) and (4–15), the first containment since ⋃_{i=1}^{n} S_i ⊆ proj_z X, and the last equality
since projz X is convex.
We next present a nontrivial set for which it can be proven from first principles that
the convex extension property holds for orthogonal disjunctive sets. This set appears in
a nonconvex formulation of the trim-loss problem proposed by Harjunkoski et al. [64].
The model is designed to determine the best way to cut a finite number of large rolls of
a raw material into smaller products using a certain number of cutting patterns. Let I
be the index set of products and J be the index set of the cutting patterns that are to be
chosen. The demand for a product i is known a priori and is denoted by ni,order. For each
(i, j) ∈ I × J , let nij ∈ Z+ be the decision variable that specifies the number of products
of type i produced in cutting pattern j and, for each j ∈ J , let mj ∈ Z+ be the number
of times the cutting pattern j is used. The following bilinear constraints model that the
demand for each product is met:
∑_{j∈J} m_j n_{ij} ≥ n_{i,order}, for i ∈ I. (4–17)
In Proposition 4.5, we show that the bilinear integer sets defined by the constraint (4–17)
satisfy the convex extension property for orthogonal disjunctive sets. We use this result
along with Theorem 4.1 to obtain the convex hull of integer bilinear covering sets in
Proposition 4.6.
Proposition 4.5. Consider a bilinear integer covering set
B^I = {(x_1, y_1, x_2, y_2) ∈ Z²_+ × Z²_+ | x_1y_1 + x_2y_2 ≥ r},
where r > 0. Then, BI has the convex extension property (4–14) with respect to the
orthogonal disjunctive sets
B^I_1 = {(x_1, y_1, 0, 0) ∈ Z²_+ × Z²_+ | x_1y_1 ≥ r},
B^I_2 = {(0, 0, x_2, y_2) ∈ Z²_+ × Z²_+ | x_2y_2 ≥ r}.
Proof. Let (x_1, y_1, x_2, y_2) ∈ B^I. We show that there exist (i) certain subsets I and I′ of {1, 2}, (ii) for each i ∈ I, a finite j_i, (iii) for each i ∈ I and j ∈ {1, . . . , j_i}, a point χ_{i,j} ∈ B^I_i, and (iv) for each i ∈ I′, a ray ψ_i of B^I_i, such that

(x_1, y_1, x_2, y_2) = ∑_{i∈I} ∑_{j=1}^{j_i} λ_{i,j} χ_{i,j} + ∑_{i∈I′} µ_i ψ_i, (4–18)

where the multipliers are such that (a) ∑_{i∈I} ∑_{j=1}^{j_i} λ_{i,j} = 1, (b) for each i ∈ I and j ∈ {1, . . . , j_i}, λ_{i,j} ≥ 0, and (c) for each i ∈ I′, µ_i ≥ 0.
We assume without loss of generality that x1 ≤ y1 ≤ y2 and x2 ≤ y2 since the
variables x1, y1, x2, and y2 can be renamed such that the largest variable is called y2 and
the largest variable in the other pair is called y1. Note first that if x1 = 0, it suffices to
choose I = {2}, I ′ = {1}, j2 = 1 with χ2,1 = (0, 0, x2, y2) and ψ1 = (0, 1, 0, 0) to show
that (4–14) holds. Therefore, we assume in the remainder of this proof that x1 ≥ 1 and,
consequently, x1y1 ≥ 1. We consider two cases.
Case 1: x_2 ≥ x_1y_1. In this case, we choose I = {1, 2}, I′ = {2}, and j_1 = j_2 = 1. Consider the points χ_{1,1} = ((y_2 + 1)x_1, (y_2 + 1)y_1, 0, 0) and χ_{2,1} = (0, 0, x_2, y_2 + 1), and the ray ψ_2 = (0, 0, 1, 0). Clearly, χ_{1,1} ∈ B^I_1, since (y_2 + 1)²x_1y_1 ≥ x_1y_1 + y_2²x_1y_1 ≥ x_1y_1 + y_2² ≥ x_1y_1 + x_2y_2 ≥ r. Similarly, χ_{2,1} ∈ B^I_2, since x_2(y_2 + 1) ≥ x_2y_2 + x_2 ≥ x_2y_2 + x_1y_1 ≥ r. It is easily verified that

(x_1, y_1, x_2, y_2) = (1/(y_2 + 1)) χ_{1,1} + (y_2/(y_2 + 1)) χ_{2,1} + (x_2/(y_2 + 1)) ψ_2,

which shows that (4–18) is satisfied.
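The Case 1 decomposition can be verified exactly with rational arithmetic; the data below are hypothetical values satisfying the case assumptions (x_1 ≤ y_1 ≤ y_2, x_2 ≤ y_2, and x_2 ≥ x_1y_1):

```python
from fractions import Fraction as F

r = 10
x1, y1, x2, y2 = 1, 2, 3, 4   # x1*y1 = 2 <= x2 = 3 and x1*y1 + x2*y2 = 14 >= r
chi11 = [(y2 + 1) * x1, (y2 + 1) * y1, 0, 0]
chi21 = [0, 0, x2, y2 + 1]
psi2 = [0, 0, 1, 0]
lam1, lam2, mu2 = F(1, y2 + 1), F(y2, y2 + 1), F(x2, y2 + 1)
z = [lam1 * a + lam2 * b + mu2 * c for a, b, c in zip(chi11, chi21, psi2)]
print(z == [x1, y1, x2, y2])  # True
# Both chi points satisfy their one-term covering inequality:
assert chi11[0] * chi11[1] >= r and chi21[2] * chi21[3] >= r
```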
Case 2: x_2 ≤ x_1y_1 − 1. In this case, we choose I = {1, 2}, I′ = {1, 2}, j_1 = 2, and j_2 = 1. Consider the points χ_{1,1} = (x_1 + α, y_1, 0, 0), χ_{1,2} = (x_1, y_1 + β, 0, 0), χ_{2,1} = (0, 0, x_2, y_2 + δ), and the rays ψ_1 = (0, 1, 0, 0), ψ_2 = (0, 0, 1, 0), where α = ⌈x_2y_2/y_1⌉, β = ⌈x_2y_2/x_1⌉, and δ = ⌈x_1y_1/x_2⌉. It follows from the way α, β, and δ are defined that χ_{1,1} and χ_{1,2} belong to B^I_1 whereas χ_{2,1} belongs to B^I_2. Further, define

λ_{1,1} = x_1y_2/(α(y_2 + δ)),  λ_{1,2} = (αδ − x_1y_2)/(α(y_2 + δ)),  λ_{2,1} = y_2/(y_2 + δ),
µ_1 = (αy_1y_2 + βx_1y_2 − αβδ)/((y_2 + δ)α),  µ_2 = δx_2/(y_2 + δ).
The above multipliers were computed by projecting out λ1,2, λ2,1, µ1, and µ2 from
(4–18) and the inequalities that these multipliers must satisfy. Then, λ1,1 was set
to its lowest admissible value. Instead of following this approach we show, as is
sufficient, that setting the multipliers at the above values establishes the convex
extensions property. It is easy to verify that (4–18) is satisfied. Since it is clear that
λ_{1,1} + λ_{1,2} + λ_{2,1} = 1, we only need to check that all the multipliers are non-negative. Clearly, λ_{1,1} ≥ 0, λ_{2,1} ≥ 0 and µ_2 ≥ 0. Since αδ = ⌈x_2y_2/y_1⌉⌈x_1y_1/x_2⌉ ≥ x_1y_2, it follows that λ_{1,2} ≥ 0. Observe that µ_1 ≥ 0 if and only if αβδ ≤ αy_1y_2 + βx_1y_2. Hence, it suffices to prove that αβδ ≤ αy_1y_2 + βx_1y_2.
We consider two cases:
Case 2.1: x_2 = 1. In this case, α = ⌈y_2/y_1⌉, β = ⌈y_2/x_1⌉, and δ = x_1y_1. There exist f_α, f_β ∈ [0, 1) such that α = y_2/y_1 + f_α and β = y_2/x_1 + f_β. We observe that

αβδ = (y_2/y_1 + f_α)(y_2/x_1 + f_β) x_1y_1
    = y_1y_2 (y_2/y_1 + f_α) + x_1y_2 ((y_1/y_2) f_α f_β + f_β)
    ≤ y_1y_2 (y_2/y_1 + f_α) + x_1y_2 (y_2/x_1 + f_β)
    = αy_1y_2 + βx_1y_2,

where the inequality holds because x_1 ≤ y_1 ≤ y_2 implies x_1y_1 f_α f_β ≤ x_1y_1 ≤ y_2².
Case 2.2: x_2 ≥ 2. For (u, v) ∈ Z²_+, we define l(u, v) = v − l where l is the only integer in the interval {0, . . . , v − 1} that is such that u = qv + l for some q ∈ Z_+, i.e., l is the remainder when u is divided by v. Using this notation, it is easy to verify that α = (x_2y_2 + l(x_2y_2, y_1))/y_1, β = (x_2y_2 + l(x_2y_2, x_1))/x_1, and δ = (x_1y_1 + l(x_1y_1, x_2))/x_2.
Now observe that:

δ/y_2 = (x_1y_1 + l(x_1y_1, x_2))/(x_2y_2)
     ≤ (x_1y_1 + x_2 − 1)/(x_2y_2)
     = (x_1y_1/(x_2y_2)) (1 + (x_2 − 1)/(x_1y_1))
     ≤ (x_1y_1/(x_2y_2)) (1 + (x_2 − 1)/(x_2 + 1))
     = (1/(x_2y_2)) ( x_1y_1/(1 + 1/x_2) + x_1y_1/(1 + 1/x_2) )
     ≤ (1/(x_2y_2)) ( x_1y_1/(1 + (y_1 − 1)/(x_2y_2)) + x_1y_1/(1 + (x_1 − 1)/(x_2y_2)) )
     ≤ x_1y_1/(x_2y_2 + l(x_2y_2, y_1)) + x_1y_1/(x_2y_2 + l(x_2y_2, x_1))
     = x_1/α + y_1/β,

where the first inequality holds because l(x_1y_1, x_2) ≤ x_2 − 1, the second inequality because x_2 ≤ x_1y_1 − 1, the third inequality holds since y_1 ≤ y_2 implies (y_1 − 1)/y_2 ≤ 1 and x_1 ≤ y_2 implies that (x_1 − 1)/y_2 ≤ 1, and the fourth inequality holds since y_1 − 1 ≥ l(x_2y_2, y_1) and x_1 − 1 ≥ l(x_2y_2, x_1). Therefore, αβδ ≤ αy_1y_2 + βx_1y_2.
For (x1, y1, x2, y2) ∈ BI , (4–18) is satisfied, and, therefore, (4–14) holds for BI .
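The Case 2 multipliers can likewise be verified exactly; the data below are hypothetical values with x_1 ≤ y_1 ≤ y_2, x_2 ≤ y_2, and x_2 ≤ x_1y_1 − 1:

```python
from fractions import Fraction as F

def ceil_div(a, b):
    # Integer ceiling of a / b for positive integers.
    return -(-a // b)

r = 10
x1, y1, x2, y2 = 2, 3, 4, 5   # x1*y1 = 6 >= x2 + 1 = 5, and 6 + 20 >= r
alpha, beta, delta = ceil_div(x2*y2, y1), ceil_div(x2*y2, x1), ceil_div(x1*y1, x2)
chi11, chi12 = [x1 + alpha, y1, 0, 0], [x1, y1 + beta, 0, 0]
chi21 = [0, 0, x2, y2 + delta]
psi1, psi2 = [0, 1, 0, 0], [0, 0, 1, 0]
l11 = F(x1*y2, alpha*(y2 + delta))
l12 = F(alpha*delta - x1*y2, alpha*(y2 + delta))
l21 = F(y2, y2 + delta)
m1 = F(alpha*y1*y2 + beta*x1*y2 - alpha*beta*delta, (y2 + delta)*alpha)
m2 = F(delta*x2, y2 + delta)
assert min(l11, l12, l21, m1, m2) >= 0 and l11 + l12 + l21 == 1
z = [l11*a + l12*b + l21*c + m1*d + m2*e
     for a, b, c, d, e in zip(chi11, chi12, chi21, psi1, psi2)]
print(z == [x1, y1, x2, y2])  # True
```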
We now apply the result of Proposition 4.5 in conjunction with Theorem 4.1 to obtain
the following result that describes the convex hull of (4–17).
Proposition 4.6. Let

B^I = {(x, y) ∈ Z^n_+ × Z^n_+ | ∑_{i=1}^{n} x_iy_i ≥ r}, (4–19)

where r > 0 and, for each i ∈ {1, . . . , n}, define:

B^I_i = {(x, y) ∈ B^I | (x_j, y_j) = (0, 0), ∀j ≠ i}.

Let the convex hull of B^I_i be represented by:

conv(B^I_i) = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 1, ∀j ∈ J},

where l^j(x_i, y_i) is a linear function of (x_i, y_i). Then,

conv(B^I) = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J}. (4–20)
Proof. We prove this result by applying Theorem 4.1. Let zi = (xi, yi). Assumption
(A1) holds by the definition of BIi . The convex extension property, (4–14), follows from a
sequential application of Proposition 4.5. Assumption (A3) is satisfied since the functions
l^j(x_i, y_i) are positively-homogeneous. Further, since 0+(cl conv(B^I_i)) = R^n_+ × R^n_+, it follows that

C_i = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 0, ∀j ∈ J} ⊆ 0+(cl conv(B^I_i)).
Therefore, Assumption (A4) holds. Now, by Theorem 4.1 and the discussion following Definition 4.2, it follows that

cl conv(B^I) = X = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J},

where the closure operation is not needed on X since it is a closed set, being an intersection of closed half-spaces. In fact, X is polyhedral, since there are only finitely many half-spaces in its expression. Now, consider the closed sets

B^{I′}_i = {(x, y) ∈ Z^{2n}_+ | x_iy_i ≥ r}.
Observe that B^I_i ⊆ B^{I′}_i ⊆ B^I. Now, by Corollary 9.8.1 in [102], conv(⋃_{i=1}^{n} B^{I′}_i) is closed. Since

conv(B^I) ⊆ cl conv(B^I) ⊆ cl conv(⋃_{i=1}^{n} B^{I′}_i) = conv(⋃_{i=1}^{n} B^{I′}_i) ⊆ conv(B^I),

where the second containment holds since B^I_i ⊆ B^{I′}_i and because the discussion following Definition 4.2 argues that cl conv(B^I) = cl conv(⋃_{i=1}^{n} B^I_i), the first equality since conv(⋃_{i=1}^{n} B^{I′}_i) is closed, and the third containment since B^{I′}_i ⊆ B^I, the equality holds throughout, and the result follows.
Observe that, even though conv(B^I) is closed, conv(⋃_{i=1}^{n} B^I_i) is not closed. Observe also that, if each inequality l^j(x_i, y_i) ≥ 1 is facet-defining for conv(B^I_i), then all the inequalities of the form ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1 are facet-defining for conv(B^I). This is because, if L(i′, x′_{i′}, y′_{i′}) is tight for l^{j_{i′}}(x_{i′}, y_{i′}) ≥ 1, then it is also tight for ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1. In other words, the inequality ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1 has two tight points for each i′ ∈ {1, . . . , n}, yielding a total of 2n tight points. Since these points belong to orthogonal subspaces and the origin does not satisfy l^{j_i}(x_i, y_i) ≥ 1, they are affinely independent points, showing that the inequality is facet-defining. Consequently, Proposition 4.6 shows that conv(B^I) has exponentially many facets. In particular, if conv(B^I_i) has |J| facets, there are |J|^n inequalities in the description of conv(B^I). We note, however, that separation is not difficult to perform as the coefficients of each pair of variables can be determined independently. Since there is an obvious pseudo-polynomial algorithm to compute the facets of conv(B^I_i), it is clearly possible to separate the facets of conv(B^I) in pseudo-polynomial time.
Example 4.6. Consider the set

B^I = {(x, y) ∈ Z²_+ × Z²_+ | x_1y_1 + x_2y_2 ≥ 10}. (4–21)

It is easily verified that for both i ∈ {1, 2}

conv(B^I_i) = {L(i, x_i, y_i) ∈ R⁴_+ | y_i ≥ 1, 10x_i + 2y_i ≥ 30, x_i + y_i ≥ 7, 2x_i + 10y_i ≥ 30, x_i ≥ 1},
as presented in Figure 4-2.
It follows from Proposition 4.6 and the ensuing discussion that the convex hull of B^I has 25 nontrivial facet-defining inequalities and is represented by

conv(B^I) = {(x, y) ∈ R²_+ × R²_+ | l¹(x_1, y_1) + l²(x_2, y_2) ≥ 1, for every choice of

l¹(x_1, y_1) ∈ { y_1, (5/15)x_1 + (1/15)y_1, (1/7)x_1 + (1/7)y_1, (1/15)x_1 + (5/15)y_1, x_1 },
l²(x_2, y_2) ∈ { y_2, (5/15)x_2 + (1/15)y_2, (1/7)x_2 + (1/7)y_2, (1/15)x_2 + (5/15)y_2, x_2 }}, (4–22)

where each pair of coefficients for (x_1, y_1) is matched with each pair of coefficients for (x_2, y_2).
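A sketch that enumerates all 25 aggregated inequalities of this example and tests membership exactly (the helper name in_hull is ours, not from the text):

```python
from fractions import Fraction as F

# Normalized facet coefficients (a, b) of conv(B^I_i): a*x_i + b*y_i >= 1.
coeffs = [(F(0), F(1)), (F(5, 15), F(1, 15)), (F(1, 7), F(1, 7)),
          (F(1, 15), F(5, 15)), (F(1), F(0))]

def in_hull(x1, y1, x2, y2):
    # A point lies in conv(B^I) iff it satisfies all 25 combined inequalities (4-22).
    return all(a1*x1 + b1*y1 + a2*x2 + b2*y2 >= 1
               for a1, b1 in coeffs for a2, b2 in coeffs)

print(in_hull(2, 5, 0, 0))  # True: (2, 5, 0, 0) lies in B^I_1
print(in_hull(1, 1, 1, 1))  # False: cut off, e.g. by (5/15)x1+(1/15)y1+(5/15)x2+(1/15)y2 >= 1
```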
[Figure: the region x_iy_i ≥ 10 and the facets of conv(B^I_i) in the (x_i, y_i) plane.]

Figure 4-2. Facet-defining inequalities for conv(B^I_i)
Similarly, the convex hull characterization for a variety of bilinear sets can be
obtained using the result of Theorem 4.1. In particular, we study now the mixed integer
variant. We will study the continuous version in Proposition 4.9.
Proposition 4.7. Let

B^M = {(x, y) ∈ Z^n_+ × R^n_+ | ∑_{i=1}^{n} a_ix_iy_i ≥ r}, (4–23)

where r > 0, and, for each i ∈ N, a_i > 0. Define, for each i ∈ N,

B^M_i = {(x, y) ∈ B^M | (x_j, y_j) = (0, 0), ∀j ≠ i}.

Let the convex hull of B^M_i be represented by:

conv(B^M_i) = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 1, ∀j ∈ J_i},

where l^j(x_i, y_i) is a linear function of (x_i, y_i). Then,

conv(B^M) = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J_i}. (4–24)
Proof. Because the verification of the convex extension property is the only technical part
of the proof that is significantly different from that of BI , we only discuss the proof of this
property next. Because induction can be used, it suffices to prove the result when n = 2.
Let (x_1, y_1, x_2, y_2) ∈ B^M. We show that there exist (i) subsets I and I′ of {1, 2}, (ii) for each i ∈ I, a point χ_i ∈ B^M_i, and (iii) for each i ∈ I′, a ray ψ_i of B^M_i, such that

(x_1, y_1, x_2, y_2) = ∑_{i∈I} λ_i χ_i + ∑_{i∈I′} µ_i ψ_i, (4–25)

where the multipliers satisfy the following conditions: (a) ∑_{i∈I} λ_i = 1, (b) for all i ∈ I, λ_i ≥ 0, and (c) for all i ∈ I′, µ_i ≥ 0.
We assume without loss of generality that x1y1 ≥ x2y2 since the pair of variables
(x1, y1) and (x2, y2) can be interchanged along with their respective coefficients a1 and
a2. Note that, if x2 = 0, it suffices to choose I = {1}, I ′ = {2}, χ1 = (x1, y1, 0, 0),
and ψ2 = (0, 0, 0, 1) to show that (4–14) holds. Similarly, if y2 = 0, it suffices to choose
I = {1}, I ′ = {2}, χ1 = (x1, y1, 0, 0), and ψ2 = (0, 0, 1, 0) to show that (4–14) holds.
When x1y1 ≥ x2y2 > 0, in addition to the positivity of x2 and y2, we may also assume
that x_1 ≥ 1 and y_1 > 0. Define χ_1 = (x_1, y_1 + a_2x_2y_2/(a_1x_1), 0, 0), χ_2 = (0, 0, x_2, y_2 + a_1x_1y_1/(a_2x_2)), ψ_1 = (x_1, 0, 0, 0), and ψ_2 = (0, 0, x_2, 0). It can be easily verified that

(x_1, y_1, x_2, y_2) = (a_1x_1y_1/(a_1x_1y_1 + a_2x_2y_2)) (χ_1 + ψ_2) + (a_2x_2y_2/(a_1x_1y_1 + a_2x_2y_2)) (χ_2 + ψ_1),
which shows that the convex extension property (4–14) holds.
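This decomposition can also be verified with exact arithmetic; the values below are hypothetical and satisfy a_1x_1y_1 ≥ a_2x_2y_2 > 0 with x_1, x_2 integer:

```python
from fractions import Fraction as F

a1, a2, r = F(2), F(3), F(12)
x1, y1, x2, y2 = F(3), F(4), F(1), F(2)   # a1*x1*y1 = 24 >= a2*x2*y2 = 6, total 30 >= r
A, B = a1*x1*y1, a2*x2*y2
chi1 = [x1, y1 + B/(a1*x1), F(0), F(0)]
chi2 = [F(0), F(0), x2, y2 + A/(a2*x2)]
psi1, psi2 = [x1, F(0), F(0), F(0)], [F(0), F(0), x2, F(0)]
lam, mu = A/(A + B), B/(A + B)
z = [lam*(c1 + p2) + mu*(c2 + p1)
     for c1, p2, c2, p1 in zip(chi1, psi2, chi2, psi1)]
print(z == [x1, y1, x2, y2])  # True
# Each chi point satisfies its one-term covering inequality:
assert a1*chi1[0]*chi1[1] >= r and a2*chi2[2]*chi2[3] >= r
```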
Propositions 4.6 and 4.7 illustrate both the fact that the convex extension property
(4–14) holds in surprising settings and that this property might not always be trivial to
verify. We next present in Theorem 4.2 and Proposition 4.8 conditions under which the
convex extension property over orthogonal disjunctive sets can be shown to hold. These
conditions are satisfied by many polynomial covering inequalities [117] and, in particular,
the bilinear covering sets that are discussed in this section.
Theorem 4.2. Consider a function g(z_1, . . . , z_n) : R^{∑_{i=1}^{n} d_i}_+ → R, where z_i ∈ R^{d_i}_+, and the set G = {z ∈ R^{∑_{i=1}^{n} d_i}_+ | g(z_1, . . . , z_n) ≥ r}, where r > 0. Let G_i = G ∩ {L(i, z_i) | z_i ∈ R^{d_i}_+} and g_i(z_i) = g(L(i, z_i)). If there exist functions h_i : R^{d_i}_+ → R and f : R^n → R such that:

(S1) g(z) ≤ f(h_1(z_1), . . . , h_n(z_n)), where f is a convex function,
(S2) f(y¹) > f(y²) whenever y¹ ≥ y² and at least one component of y¹ is larger than the corresponding component of y²,
(S3) g_i(z_i) = f(L(i, h_i(z_i))),
(S4) For all i, h_i(0) = 0 and, for λ ∈ (0, 1], λh_i(z_i/λ) ≥ h_i(z_i), and
(S5) For all i, h_i(z_i) ≤ 0 implies that L(i, z_i) ∈ 0+(cl conv G_i),

are satisfied over R^{∑_{i=1}^{n} d_i}_+ then the convex extension property, (4–14), holds for the set G. Assume that, for each i ∈ {1, . . . , n}, conv(G_i) is closed. Define

G′_i = conv(G_i) + ∑_{i′ ≠ i} 0+(conv G_{i′}).

If, for all i, G′_i ⊆ conv(G) then conv(G) is closed.
Proof. Let z ∈ G and y(z) = (h_1(z_1), . . . , h_n(z_n)). In the following, we sometimes denote h_i(z_i) as y_i(z) to emphasize that it is the ith component of y(z). Let T = {i | h_i(z_i) ≤ 0}. Then, by Assumption (S5), for each i ∈ T, L(i, z_i) ∈ 0+(cl conv G_i). If z − ∑_{i∈T} L(i, z_i) ∈ cl conv(⋃_{i=1}^{n} G_i), then so does z. We now show that z′ = z − ∑_{i∈T} L(i, z_i) ∈ cl conv(⋃_{i=1}^{n} G_i). Let δ be a subgradient of f at y(z′). Then, Assumption (S2) implies that δ > 0. Otherwise, suppose that δ_i ≤ 0. Let e_i denote the ith unit vector and choose ε > 0. Observe that

f(y(z′) − εe_i) ≥ f(y(z′)) − ε⟨δ, e_i⟩ = f(y(z′)) − εδ_i ≥ f(y(z′)),

a contradiction to Assumption (S2). Clearly, for each i ∉ T, h_i(z′_i) = h_i(z_i). By construction, for each i ∈ T, z′_i = 0 and, therefore, h_i(z′_i) = 0 ≥ h_i(z_i). In other words, y_i(z′) = max{y_i(z), 0}. Observe that Assumptions (S1) and (S2) together imply that f(y(z′)) ≥ f(y(z)) ≥ g(z) ≥ r.
First, consider the case where ⟨δ, y(z′)⟩ = 0. Then, y(z′) = 0 since we have just proven that y(z′) ≥ 0 and δ > 0. Observe now that y(z′) = 0 implies that z′ = 0. This is because if h_i(z′_i) = 0, then h_i(z_i) ≤ 0. Therefore, i ∈ T and so z′_i = 0. In other words, g(z′) = g(0) = f(y(z′)) ≥ r, where the second equality follows from Assumption (S3). We have thus shown that z′ = 0 ∈ G_i for each i. Clearly, z′ ∈ cl conv(⋃_{i=1}^{n} G_i).
Now, consider the case when ⟨δ, y(z′)⟩ > 0. For i = 1, . . . , n, define λ_i = δ_iy_i(z′)/⟨δ, y(z′)⟩. Since δ_i and y_i(z′) are non-negative, it follows that λ_i ≥ 0. Further, ∑_{i=1}^{n} λ_i = 1. Define I = {i | λ_i > 0} and observe that |I| ≥ 1. The following chain of implications holds

i ∉ I ⟹ y_i(z′) = 0 ⟹ i ∈ T ⟹ z′_i = 0,

where the first implication follows since δ_i > 0; the second because, for each i ∉ T, y_i(z′) > 0; and the third by the construction of z′. Therefore, z′ = ∑_{i∈I} z″_i, where z″_i = L(i, z′_i). For each i ∈ I, let χ_i = z″_i/λ_i. Observe that z′ = ∑_{i∈I} λ_iχ_i, i.e., z′ can be expressed as a convex combination of χ_i for i ∈ I. The following shows that, for all i ∈ I, χ_i ∈ G_i:

g(χ_i) = g_i(z′_i/λ_i) = f(y(χ_i)) ≥ f((1/λ_i) y(z″_i))
       ≥ f(y(z′)) + δ_i (⟨δ, y(z′)⟩/(δ_iy_i(z′))) y_i(z″_i) − ∑_{j=1}^{n} δ_jy_j(z′)
       = f(y(z′)) + δ_i (⟨δ, y(z′)⟩/(δ_iy_i(z′))) y_i(z′) − ∑_{j=1}^{n} δ_jy_j(z′)
       = f(y(z′)) ≥ r.
The first equality follows from the definition of g_i, the second equality from Assumption (S3), the first inequality follows since f is non-decreasing by Assumption (S2) and h_i(z′_i/λ_i) ≥ (1/λ_i)h_i(z′_i), the second inequality because δ is a subgradient of f at y(z′), and the third equality because y_i(z″_i) = h_i(z′_i) = y_i(z′). Since z = z′ + ∑_{i∈T} L(i, z_i), where, for each i ∈ T, L(i, z_i) ∈ 0+(cl conv(G_i)), it follows that (4–14) holds for G.
We now prove the last statement of the theorem. Consider an arbitrary i ∈ N. Clearly, G′_i, as defined in the statement of the theorem, is convex. We argue that it is also closed. By Corollary 9.1.1 in [102], G′_i is closed if there do not exist L(i, z_i) ∈ 0+(conv G_i) and, for i′ ∈ N∖{i}, L(i′, z_{i′}) ∈ 0+(conv G_{i′}), not all zero, such that L(i, z_i) + ∑_{i′∈N∖{i}} L(i′, z_{i′}) = 0. But, the vectors L(i, z_i) and L(i′, z_{i′}) for i′ ∈ N∖{i} are orthogonal. Therefore, they sum to zero if and only if each of the vectors is zero. It follows that G′_i is closed. Again by Corollary 9.1.1 in [102], 0+(G′_i) = ∑_{i=1}^{n} 0+(conv G_i). Since the recession directions of G′_i are independent of i, it follows by Corollary 9.8.1 in [102] that conv(⋃_{i=1}^{n} G′_i) is closed. Now,
conv(G) ⊆ cl conv(G) = cl conv(⋃_{i=1}^{n} G_i) ⊆ cl conv(⋃_{i=1}^{n} G′_i) = conv(⋃_{i=1}^{n} G′_i) ⊆ conv(G),

where the first equality follows from the equivalence of (4–14) and (4–15), the second containment follows since G_i ⊆ G′_i, the second equality follows since conv(⋃_{i=1}^{n} G′_i) is closed and the third containment follows since G′_i ⊆ conv(G).
The main challenge in applying Theorem 4.2 in practical situations is verifying
Assumption (S4). However, when hi(zi) is derived from other functions using operations
such as summations, minimizations, or maximizations, then Assumption (S4) can often be
established easily by studying the same properties for the functions used in the derivation
of hi(zi). To see this, first note that the assumption is satisfied trivially by any linear
function or, more generally, by any positively-homogeneous function of rth order, where r ≥ 1. For a more elaborate illustration, consider the case where h(z) = w(p_1(z), . . . , p_K(z)), where, for all k ∈ {1, . . . , K}, p_k(z) satisfies Assumption (S4), w satisfies Assumption (S4), w is isotonic, i.e., w(y¹) ≥ w(y²) if y¹ ≥ y², and w(0, . . . , 0) = 0. We claim that h(z) satisfies
Assumption (S4). Clearly, h(0) = w(p1(0), . . . , pK(0)) = w(0, . . . , 0) = 0 and
λh(z/λ) = λw(p_1(z/λ), . . . , p_K(z/λ)) ≥ λw((1/λ)p_1(z), . . . , (1/λ)p_K(z)) ≥ w(p_1(z), . . . , p_K(z)) = h(z),
where the first inequality follows since w is isotonic and pk(z) obeys Assumption (S4);
and the second inequality because w obeys Assumption (S4). If w satisfies Assumption
(S4) only over the non-negative orthant, then pk(z) must be non-negative as well. In
particular, ∑_{k=1}^{K} p_k(z) satisfies the assumption as long as, for all k, p_k(z) satisfies the
assumption. For another illustration, consider now h(z) = opy p(y, z), where op is an
operator such as min or max that satisfies opy f1(y) ≥ opy f2(y) if, for all y, f1(y) ≥ f2(y)
and λ opy f(y) ≥ opy λf(y) for λ ∈ (0, 1]. In addition, assume that λp(y, z
λ
) ≥ p(y, z) for
λ ∈ (0, 1]. Then,
λh( zλ
)= λ op
yp(y,
z
λ
)≥ λ op
y
1
λp(y, z) ≥ op
yp(y, z) = h(z),
for λ ∈ (0, 1]. In particular, if h(z) = min(p1(z), . . . , pK(z)) and, for all λ ∈ (0, 1] and
k ∈ {1, . . . , K}, pk(z) ≤ λpk(zλ
)then h(z) ≤ λh
(zλ
).
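The closure of this property under sums and minimizations can be checked numerically. The sketch below uses two hypothetical functions p1 and p2 on the non-negative orthant (a bilinear term and a linear term, both of which satisfy the condition λf(z/λ) ≥ f(z) for λ ∈ (0, 1]) and verifies that their sum and pointwise minimum inherit the property, as argued above.

```python
import random

random.seed(4)

# Two hypothetical functions on R^2_+ satisfying the (S4)-type condition
# lambda * f(z/lambda) >= f(z) for lambda in (0, 1]:
p1 = lambda z: 2.0 * z[0] * z[1]      # positively homogeneous of order 2
p2 = lambda z: z[0] + 3.0 * z[1]      # linear (order 1, holds with equality)

h = lambda z: min(p1(z), p2(z))       # pointwise minimum
s = lambda z: p1(z) + p2(z)           # sum

for _ in range(1000):
    z = (random.uniform(0, 5), random.uniform(0, 5))
    lam = random.uniform(1e-3, 1.0)
    zl = (z[0] / lam, z[1] / lam)
    # the property is preserved by both constructions
    for f in (p1, p2, h, s):
        assert lam * f(zl) >= f(z) - 1e-9
```

This is only a random sampling check of the algebraic argument in the text, not a proof.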
The following corollary of Theorem 4.2 discusses the case where f is the summation
operator and hi(zi) = gi(zi). Subsequently, we will use Corollary 4.2 to show that the
convex extensions property holds for bilinear covering sets. In [117], we use this result to
show that the property also holds for more general polynomial covering sets. Moreover,
Corollary 4.2 shows that conv(G) is closed if the function g(·) eventually increases in each
one of the principal directions of the non-negative orthant.
Corollary 4.2. Consider a function g(z1, . . . , zn) : R_+^{∑_{i=1}^n di} → R, where zi ∈ R_+^{di}, and the set G = {z ∈ R_+^{∑_{i=1}^n di} | g(z1, . . . , zn) ≥ r}, where r > 0. Let Gi = G ∩ {L(i, zi) | zi ∈ R_+^{di}} and gi(zi) = g(L(i, zi)). If

(B1) g(z) ≤ ∑_{i=1}^n gi(zi),
(B2) for all i, gi(0) = 0 and, for λ ∈ (0, 1], λgi(zi/λ) ≥ gi(zi), and
(B3) for all i, gi(zi) ≤ 0 implies that L(i, zi) ∈ 0⁺(cl conv Gi),

are satisfied over R_+^{∑_{i=1}^n di}, then the convex extension property, (4–14), holds for the set G. Let e_i^d ∈ R^{∑_{j=1}^n dj} be the vector whose (d + ∑_{j<i} dj)-th component is one and whose remaining components are zero. Assume that, for all i, conv(Gi) is closed. Assume further that there exists γ such that, for all γ′ ≥ γ, i ∈ N, d ∈ {1, . . . , di}, and z ≥ 0, it holds that g(z + γ′e_i^d) ≥ g(z). Then, conv(G) is closed.
Proof. Choose f to be the summation operator and hi(zi) = gi(zi). Then, the first part of the result follows from Theorem 4.2. The rest of the result follows if G′i, as defined in the statement of Theorem 4.2, is contained in conv(G). Consider a z that can be expressed as zi + ∑_{i′≠i} L(i′, z_{i′}), where zi ∈ conv(Gi) and, for all i′ ≠ i, z_{i′} ≥ 0. By Carathéodory's theorem, there exist, for d ∈ {1, . . . , di + 1}, points z^d and multipliers λ_d ≥ 0 such that ∑_{d=1}^{di+1} λ_d z^d = zi, ∑_{d=1}^{di+1} λ_d = 1, and z^d ∈ Gi for all d. Let D = ∑_{i′≠i} d_{i′}. Then, define m = min{z_{i′d}D | i′ ≠ i, d = 1, . . . , d_{i′}, z_{i′d} > 0} and m′ = max{1, γ/m}. For each i′ ≠ i and d′ ∈ {1, . . . , d_{i′}}, define z_{i′}^{dd′} = z^d + Dm′z_{i′d′}e_{i′}^{d′}. On the one hand, for all (i′, d′) with z_{i′d′} > 0, it follows that Dm′z_{i′d′} ≥ γ. Therefore, g(z_{i′}^{dd′}) ≥ g(z^d) ≥ r and, so, z_{i′}^{dd′} ∈ G. On the other hand, if z_{i′d′} = 0, then z_{i′}^{dd′} = z^d ∈ G. It follows that z_{i′}^{dd′} ∈ G for all (i′, d, d′). Now, z can be written as a convex combination of points in G as follows:

∑_{d=1}^{di+1} [ λ_d(1 − 1/m′)z^d + λ_d (1/(Dm′)) ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} z_{i′}^{dd′} ] = ∑_{d=1}^{di+1} λ_d z^d + ∑_{d=1}^{di+1} λ_d ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} z_{i′d′}e_{i′}^{d′} = zi + ∑_{i′≠i} L(i′, z_{i′}).

Observe that the multipliers are non-negative since m′ ≥ 1 and

∑_{d=1}^{di+1} λ_d [ (1 − 1/m′) + (1/(Dm′)) ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} 1 ] = 1.

Therefore, the result follows.
Theorem 4.1 also points to an interesting set of sufficient conditions that can be
used to verify the convex extension property. The primary difference from the conditions
in Theorem 4.2 is that Proposition 4.8 does not impose a structure on the original set
S. Instead, it constructs a set X whose projection in the z-space is contained within
cl conv(⋃_{i=1}^n Si), using a construction similar to Theorem 4.1, and then leaves it to the
user to verify that X outer-approximates S. This technique may be useful when S is
defined by more than one inequality. Also, note that the special case of Theorem 4.2,
discussed in Corollary 4.2, also follows from Proposition 4.8.
Proposition 4.8. For a set S and its subsets Si ⊆ S for i ∈ N = {1, . . . , n}, let zi ∈ R^{di} and z = (z1, . . . , zi, . . . , zn) ∈ S ⊆ R^{∑_i di}. Assume that Assumptions (A1) and (A4) are satisfied as in Theorem 4.1 and the sets Ai and X are as defined in (4–1) and (4–2), respectively. If, in addition, the following assumptions are satisfied:

(N1) Si ⊆ proj_z Ai ⊆ cl conv(Si),
(N2) t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are such that, for all 0 < λ ≤ 1,

λt_i^{j_i}(zi/λ, ui/λ) ≥ t_i^{j_i}(zi, ui), λv_i^{k_i}(zi/λ, ui/λ) ≥ v_i^{k_i}(zi, ui), λw_i^{l_i}(zi/λ, ui/λ) ≥ w_i^{l_i}(zi, ui),

(N3) S ⊆ cl conv(proj_z X),

then (4–14) holds for S.
Proof. Lemma 4.2 shows that X = proj_{(z,u)} Q. We now show that proj_z X = proj_z Q ⊆ cl conv(⋃_{i=1}^n Si). The proof is again similar to that for Lemma 4.1 except that the positive homogeneity is replaced by the weaker inequalities assumed in Assumption (N2). Even then, if (λ, z, u) ∈ Q and 0 < λi ≤ 1, it follows that (zi/λi, ui/λi) ∈ Ri(1) since the defining inequalities are satisfied as follows:

t_i^{j_i}(zi, ui) ≥ λi and λit_i^{j_i}(zi/λi, ui/λi) ≥ t_i^{j_i}(zi, ui)
⇒ λit_i^{j_i}(zi/λi, ui/λi) ≥ t_i^{j_i}(zi, ui) ≥ λi
⇒ t_i^{j_i}(zi/λi, ui/λi) ≥ 1.

Clearly, cl conv(⋃_{i=1}^n Si) ⊆ cl conv(S) and we have assumed that S ⊆ cl conv(proj_z X). Observe that cl conv(S) ⊆ cl conv(proj_z X) ⊆ cl conv(⋃_{i=1}^n Si) ⊆ cl conv(S) and, therefore, equality holds throughout.
Observe that Assumptions (N1) and (N2) are less restrictive than Assumption (A3) in Theorem 4.1 since proj_z Ai may be a nonconvex subset of conv(Si) and the positive homogeneity is relaxed. Here, it is not necessary to use t_i^{j_i}(zi, ui), v_i^{k_i}(zi, ui), and w_i^{l_i}(zi, ui) as the underestimators in Assumption (N2). Rather, any function of (zi, ui) that underestimates λit_i^{j_i}(zi/λi, ui/λi), λiv_i^{k_i}(zi/λi, ui/λi), and λiw_i^{l_i}(zi/λi, ui/λi) for all λi ∈ (0, 1] suffices. As long as the set Ci defined using these functions inner-approximates the recession cone of cl conv(S), a suitable set X can be derived by projecting out the λ variables, and Assumption (N3) can be posed in terms of this set.
We now discuss the application of Corollary 4.2 to convexifying bilinear covering sets.
The bilinear covering sets that we shall now consider generalize the bilinear set discussed
in Proposition 4.4. In fact, the bilinear covering set reduces to Q, as defined in (4–12)
when restricted to any one of n orthogonal subspaces. As long as the convex extension
property holds, since Proposition 4.4 provides the defining inequality for the convex hull
in each of the orthogonal subspaces, we can use Theorem 4.1 to find the convex hull
description of the bilinear covering set over the non-negative orthant. We formalize this
argument in the following proposition.
Proposition 4.9. Consider the bilinear covering set:

BR = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^n (aixiyi + bixi + ciyi) ≥ r},

where, for each i ∈ {1, . . . , n}, ai, bi, and ci are non-negative and r is strictly positive. Let

ηi(xi, yi) = (1/2)(bixi + ciyi + √((bixi + ciyi)² + 4airxiyi)).

Then,

conv(BR) = X = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^n ηi(xi, yi) ≥ r}. (4–26)
Proof. We may assume without loss of generality that, for each i, at least one of ai, bi, or ci is positive. First, we use Corollary 4.2 to show that the convex extension property (4–14) holds for BR. Let zi = (xi, yi) and gi(zi) = aixiyi + bixi + ciyi. Clearly, gi(0) = 0 and, for 0 < λ ≤ 1,

λgi(zi/λ) = aixiyi/λ + bixi + ciyi ≥ gi(zi).

Therefore, Assumption (B2) is satisfied. Let

BRi = {L(i, xi, yi) ∈ R^{2n}_+ | gi(xi, yi) ≥ 0}.

Observe that, if (x′i, y′i) ≥ 0, then gi(xi + x′i, yi + y′i) ≥ gi(xi, yi). Therefore, if z′i = (x′i, y′i) ≥ 0, then (0, z′i, 0) ∈ 0⁺(cl conv BRi) and, consequently, Assumption (B3) is satisfied. It follows that the convex extension property holds for BR. In fact, since g is non-decreasing and cl conv(BRi) = BRi, it follows from the last statement of Corollary 4.2 that conv(BR) is closed as well. By Proposition 4.4, it follows that the convex hull of BRi is defined by ηi(xi, yi) ≥ r. Observe that ηi(xi, yi) is a positively-homogeneous function. Therefore, Assumption (A3) is satisfied. Finally, ηi(xi, yi) is concave by Proposition 4.2 and, since, for sufficiently large zi, gi(xi, yi) ≥ r, it follows that BRi ≠ ∅ and, therefore, by Proposition 4.1, that Assumption (A4) is satisfied. Then, by Theorem 4.1 and the discussion following Definition 4.2, the set X in (4–26) is cl conv(BR). But, as argued earlier, cl conv(BR) = conv(BR), and the result follows.
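As a quick numerical sanity check (not part of the proof), the sketch below samples points of BR for hypothetical data a, b, c, and r, verifies that each feasible point satisfies the convex hull inequality in (4–26), and checks the positive homogeneity of ηi used above.

```python
import math
import random

def eta(a, b, c, r, x, y):
    """The function eta_i of Proposition 4.9."""
    s = b * x + c * y
    return 0.5 * (s + math.sqrt(s * s + 4.0 * a * r * x * y))

random.seed(0)
# hypothetical instance data with n = 2
a, b, c, r = [2.0, 3.0], [1.0, 0.0], [0.0, 4.0], 5.0

# every sampled point of BR satisfies sum_i eta_i(x_i, y_i) >= r
for _ in range(1000):
    x = [random.uniform(0, 5) for _ in range(2)]
    y = [random.uniform(0, 5) for _ in range(2)]
    lhs = sum(a[i]*x[i]*y[i] + b[i]*x[i] + c[i]*y[i] for i in range(2))
    if lhs >= r:  # (x, y) lies in BR
        assert sum(eta(a[i], b[i], c[i], r, x[i], y[i]) for i in range(2)) >= r - 1e-9

# positive homogeneity: eta(lam*x, lam*y) = lam * eta(x, y)
for lam in (0.25, 0.5, 2.0):
    v1 = eta(2.0, 1.0, 3.0, 5.0, lam * 0.7, lam * 1.3)
    v2 = lam * eta(2.0, 1.0, 3.0, 5.0, 0.7, 1.3)
    assert abs(v1 - v2) < 1e-9
```

The sampling only tests validity of the inequality on BR (that is, BR ⊆ X); it does not, of course, certify the convex hull claim itself.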
Consider the special case of Proposition 4.9 where bi = ci = 0. In this case, the convex hull inequality takes the following simple form: ∑_{i=1}^n √(aixiyi) ≥ √r. First, the validity of the inequality can be verified using the following argument:

∑_{i=1}^n √(aixiyi) ≥ √(∑_{i=1}^n aixiyi) ≥ √r,

where the first inequality follows from the subadditivity of the square root over the non-negative real numbers. Second, by Example 4.4, the above inequality defines the closure of the convex hull of the disjunctive union of {(xi, yi) | aixiyi ≥ r} over the non-negative orthant and, therefore, it must also be the closure of the convex hull of ∑_{i=1}^n aixiyi ≥ r over the same set. Note that we did not employ Theorem 4.2 in the argument. Instead, we replaced it with a proof that the convex hull of the disjunctive union of orthogonal restrictions of the set includes the original set. This illustrates a different technique, similar to the proof technique of Proposition 4.8, that may sometimes be useful in establishing the convex extension property.
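The subadditivity step in the validity argument above is easy to confirm numerically. The sketch below checks √u + √v ≥ √(u + v) by sampling and then evaluates the resulting cut at one point of a small hypothetical instance (a = [2, 3], r = 4).

```python
import math
import random

random.seed(5)

# subadditivity of the square root on the non-negative reals
for _ in range(1000):
    u, v = random.uniform(0, 10), random.uniform(0, 10)
    assert math.sqrt(u) + math.sqrt(v) >= math.sqrt(u + v) - 1e-12

# consequence: a point of {sum_i a_i x_i y_i >= r} satisfies the hull inequality
a, r = [2.0, 3.0], 4.0            # hypothetical data
x, y = [1.0, 0.5], [1.5, 2.0]
assert sum(ai*xi*yi for ai, xi, yi in zip(a, x, y)) >= r
assert sum(math.sqrt(ai*xi*yi) for ai, xi, yi in zip(a, x, y)) >= math.sqrt(r)
```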
However, the above technique for establishing validity fails for another special case of Proposition 4.9, where the defining inequality is ax1y1 + bx2 ≥ r with a > 0, b > 0, and r > 0. A simpler variant of this set was mentioned in the introduction of this section. By Proposition 4.9, its convex hull over the non-negative orthant is defined by

√(ax1y1/r) + bx2/r ≥ 1. (4–27)

Note that the right-hand-side r participates differently with different subsets of variables in this convex hull inequality. One could use subadditivity of the square-root function to instead derive the following valid inequality:

√(ax1y1/r) + √(bx2/r) ≥ 1. (4–28)

However, as expected, (4–28) is not as tight as (4–27). This can be seen by considering a point (x1, y1, x2) that is feasible to (4–27). If bx2/r ≥ 1, it follows that √(bx2/r) ≥ 1. Otherwise, bx2/r < 1, in which case

√(ax1y1/r) + √(bx2/r) > √(ax1y1/r) + bx2/r ≥ 1.
Therefore, (x1, y1, x2) is feasible to (4–28) as well. Observe that the subadditivity of
the square-root function is not sufficient to prove the convex extension property for this
bilinear covering set, and, thus, cannot replace Theorem 4.2. Without realizing the convex
extension property a priori, even the form of the inequality (4–27) is not obvious. The key
to deriving this convex hull is thus to realize that the convex hull is formed by restricting
attention to orthogonal subspaces. The first subspace spans the (x1, y1) variables and the
second subspace spans x2. Then, Theorem 4.1 quickly reveals the structure of the convex
hull. Here, √(bx2/r) ≥ 1 as well as bx2/r ≥ 1 define the convex hull of the set restricted to (0, 0, x2). However, as the insight from Theorem 4.1 suggests, it is preferable to choose the latter representation since it uses a positively-homogeneous function.
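The relationship between (4–27) and (4–28) can be illustrated numerically. The sketch below uses the hypothetical data a = b = r = 1; it samples points to confirm that every point feasible to (4–27) is feasible to (4–28), and exhibits a point that separates the two inequalities.

```python
import math
import random

a = b = r = 1.0   # hypothetical data

def ineq_27(x1, y1, x2):
    """Convex hull inequality (4-27)."""
    return math.sqrt(a*x1*y1/r) + b*x2/r >= 1 - 1e-12

def ineq_28(x1, y1, x2):
    """Weaker valid inequality (4-28)."""
    return math.sqrt(a*x1*y1/r) + math.sqrt(b*x2/r) >= 1 - 1e-12

random.seed(6)
# every point feasible to (4-27) is feasible to (4-28) ...
for _ in range(20000):
    x1, y1, x2 = (random.uniform(0, 2) for _ in range(3))
    if ineq_27(x1, y1, x2):
        assert ineq_28(x1, y1, x2)

# ... but not conversely: here sqrt(0.09) + sqrt(0.49) = 1 while 0.3 + 0.49 < 1
assert ineq_28(0.09, 1.0, 0.49) and not ineq_27(0.09, 1.0, 0.49)
```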
The construction of Proposition 4.9 can be carried out as long as it is possible to
invoke Theorem 4.2 to establish the convex extension property and Theorem 4.1 to
convexify the orthogonal disjunctions. This idea can be exploited to develop tighter
relaxations when the variables are restricted to belong to the hypercube by suitably
altering the inequality outside the hypercube so that Theorem 4.2 can still be used.
This technique for deriving relaxations is pursued in greater detail in [117]. Also, since
the relaxations developed arise from orthogonal disjunctions, their geometry is easy
to understand by studying the orthogonal subspaces. This idea is exploited in [117] to
show that the resulting relaxation for bilinear covering sets is tighter than the standard
factorable relaxation used in current nonlinear branch-and-bound solvers. A preliminary
computational study is also reported in [117] that shows that not only is the relaxation
guaranteed to be at least as tight as the factorable relaxation, but that the improvement is
substantial.
4.4 Concluding Remarks
In this chapter, we developed a convexification tool for orthogonal disjunctive sets.
The convexification is obtained in the space of original variables. As an application,
we provided a simple derivation of split cuts for mixed-integer polyhedral sets. We also
showed that the convexification tool is useful in deriving cuts for a variety of nonconvex
constraints; those that satisfy a key convex extension property. We provided a general set
of conditions that are sufficient to establish the convex extension property. We illustrated
the techniques by finding the convex hull representation for bilinear covering sets. The
convex extension property holds for a variety of polynomial covering sets, and our convexification tool thereby provides convex hull representations of many sets; see [117]. If the
variables are restricted to be in a hypercube, the results of this section motivate strategies
for exploiting this information that are often superior to their counterparts currently used
in most nonlinear branch-and-bound solvers; see [117].
CHAPTER 5
LIFTED INEQUALITIES FOR 0-1 MIXED-INTEGER BILINEAR COVERING SETS¹
5.1 Introduction
In Chapter 4, we showed that for certain bilinear covering sets, relaxations stronger
than McCormick’s can be obtained by considering the right-hand-side. In particular, we
have derived closed-form expressions for the convex hull of ∑_{j∈N} ajxjyj ≥ d over the nonnegative orthant. However, these results are obtained under the assumption that the variables in (3–1) do not have upper bounds.
In this chapter, we study the convex hull of these sets when the variables are bounded. In particular, we consider 0−1 mixed-integer bilinear covering sets of the form

B = {(x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^n ajxjyj ≥ d},

where n ∈ Z_+, aj > 0 for all j ∈ N := {1, . . . , n}, and d > 0. In order to guarantee that B is not empty, we impose the following assumption.

Assumption 5.1. ∑_{j=1}^n aj ≥ d.
On the theoretical side, we are interested in studying relaxation techniques for B that
will take both the right-hand-side d and upper bounds on the variables into account. On
the practical side, we are also interested in studying B because of its relations with some
important mixed-integer linear sets. In particular, since the set B is a relaxation of the
single-node flow set without outflows
F = {(x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^n ajyj ≥ d, xj ≥ yj ∀j ∈ N},
valid inequalities for B will also be valid for F . Further, we will show that the derivation
of strong linear inequalities valid for B yields new families of facet-defining inequalities for
the convex hull of F , thereby providing new approaches to study these sets.
1 The material of this chapter is based on [34].
We next show that it will typically be difficult to find globally optimal solutions to
problems containing B as a constraint by showing that it is NP-hard to optimize a linear
function over B. This also suggests that finding a closed-form expression for the convex
hull of B is likely to be difficult. To this end, consider the following optimization problem
(P ) that seeks to minimize a linear objective function over the bilinear set B:
(P)  min ∑_{j=1}^n bjxj + ∑_{j=1}^n cjyj
     s.t. (x, y) ∈ B,
where b ∈ Rn and c ∈ Rn. We show next that (P ) is NP-hard.
Proposition 5.1. Problem (P ) is NP-hard.
Proof. The proof is by reduction from the 0−1 knapsack problem. Consider the problem (K) defined as:

min { ∑_{j=1}^n bjxj | ∑_{j=1}^n ajxj ≥ d, xj ∈ {0, 1} ∀j ∈ N }.

Problem (K) is an instance of the 0−1 knapsack problem, which is proven to be NP-hard in [56]. Now, we show that (K) is polynomially reducible to (P). Consider an instance of (K). We define a corresponding instance of (P) by setting cj = 0 for all j ∈ N. Then, (P) can be rewritten as:

min { ∑_{j=1}^n bjxj | ∑_{j=1}^n ajxjyj ≥ d, xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ N }.

Clearly, there exists an optimal solution (x∗, y∗) of (P) with y∗j = 1 for all j ∈ N since the corresponding objective coefficients are zero. Using the fact that ∑_{j=1}^n ajx∗j ≥ ∑_{j=1}^n ajx∗jy∗j ≥ d, it can easily be verified that x∗ is an optimal solution for (K). Therefore, (K) is polynomially reducible to (P), which proves the result.
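The reduction can be exercised on a tiny hypothetical instance by brute-force enumeration. The sketch below solves a small knapsack instance (K) directly and through the reduced instance of (P) with c = 0, where an optimal solution may take y = 1, and confirms that the optimal values coincide.

```python
from itertools import product

# tiny hypothetical knapsack instance (K): a, b, d chosen for illustration
a, b, d = [3, 4, 5], [2, 3, 4], 6
n = len(a)

# optimal value of (K) by enumeration over x in {0,1}^n
k_opt = min(sum(b[j]*x[j] for j in range(n))
            for x in product((0, 1), repeat=n)
            if sum(a[j]*x[j] for j in range(n)) >= d)

# reduced instance of (P): c = 0, so y = 1 is optimal and (P) collapses
# to the same enumeration over x with the constraint sum a_j x_j y_j >= d
p_opt = min(sum(b[j]*x[j] for j in range(n))
            for x in product((0, 1), repeat=n)
            if sum(a[j]*x[j]*1.0 for j in range(n)) >= d)

assert k_opt == p_opt == 5   # attained by x = (1, 1, 0)
```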
In this chapter, we focus on constructing strong cutting planes for optimization
problems containing the constraints of B by studying the convex hull of B. Throughout
the chapter, we will denote conv(B) by PB. Because B can be expressed as a finite union
of polytopes, PB is polyhedral.
Proposition 5.2. PB is a polytope.
Therefore, when studying PB, it is sufficient to consider linear inequalities. To
construct these inequalities, we will use lifting. Lifting is a well-known integer programming
technique that generates strong inequalities by transforming an inequality valid for
a restricted subset of the feasible region into a globally valid constraint. Early work
on lifting in integer programming can be found in Wolsey [131, 132]. A generalization
to nonlinear programming can be found in Richard and Tawarmalani [100]. Using
sequence-independent lifting techniques, we derive large families of facet-defining
inequalities, which can be used as strong cutting planes in a branch-and-cut framework.
This illustrates a new way of using the bounds on variables in the generation of cuts
in MINLP. Further, the results have implications for flow models in mixed-integer
programming, a family of problems that are important both theoretically and practically.
This chapter is structured as follows. In Section 5.2, we derive basic polyhedral results
about PB. We provide necessary and sufficient conditions for some trivial inequalities
to be facet-defining. Then, we derive a linear description of PB for the special case
where n = 2. This result is used to identify the seed inequalities that will be used in
lifting procedures. In Section 5.3, we derive three families of closed-form facet-defining
inequalities for PB using sequence-independent lifting techniques. One requires the use
of a subadditive approximation of the lifting function. In Section 5.4, we show that the
lifted inequalities developed for PB generalize known families of cuts and yield new
facet-defining inequalities for the single-node flow set F without outflows. We summarize
the contributions of our work and conclude with remarks on future research directions in
Section 5.5.
5.2 Basic Polyhedral Results
In this section, we develop basic results about the polyhedral structure of PB. First,
we provide necessary and sufficient conditions for PB to be full-dimensional.
Proposition 5.3. PB is a full-dimensional polytope if and only if ∑_{j=1}^n aj ≥ d + ai for all i ∈ N.

Proof. First, we show that if ∑_{j=1}^n aj ≥ d + ai for all i ∈ N, then PB is a full-dimensional polytope. For all i ∈ N, construct p^i = (1 − ei, 1) and q^i = (1, 1 − ei). Further, select r = (1, 1). It follows that p^i, q^i, and r belong to B. Further, these points are affinely independent because r − p^i and r − q^i for all i ∈ N are linearly independent. Since we have described 2n + 1 affinely independent points in PB, we have shown that PB is full-dimensional. Next, we prove that if PB is a full-dimensional polytope, then ∑_{j=1}^n aj ≥ d + ai for all i ∈ N. Assume by contradiction that ∑_{j=1}^n aj < d + ai for some i ∈ N. Since ∑_{j=1}^n aj ≥ d from Assumption 5.1, we must have that xi = 1 in every feasible solution of B, showing that PB is not full-dimensional. This is the desired contradiction.
In the remainder of this chapter, we will assume that PB is full-dimensional.

Assumption 5.2. ∑_{j=1}^n aj ≥ d + ai for all i ∈ N.
Observe that Assumption 5.2 strictly dominates Assumption 5.1. We next identify
some basic properties that all facets of PB must satisfy.
Proposition 5.4. Let

∑_{j=1}^n αjxj + ∑_{j=1}^n βjyj ≥ δ (5–1)

be a facet-defining inequality for PB that is not a scalar multiple of xj ≤ 1 for j ∈ N or yj ≤ 1 for j ∈ N. Then, αj ≥ 0 ∀j ∈ N, βj ≥ 0 ∀j ∈ N, and δ ≥ 0.

Proof. Select an arbitrary element i ∈ N. Since (5–1) is facet-defining for PB, there exists (x∗, y∗) ∈ B such that

∑_{j=1}^n αjx∗j + ∑_{j=1}^n βjy∗j = δ. (5–2)

Since (5–1) is not a scalar multiple of xi ≤ 1, it is clear that x∗i < 1. Consider now (x̄, ȳ) = (x∗, y∗) + (1 − x∗i)ei. This point belongs to B and, therefore, satisfies (5–1), i.e.,

∑_{j=1}^n αjx̄j + ∑_{j=1}^n βjȳj ≥ δ. (5–3)

Subtracting (5–2) from (5–3), we obtain that αi ≥ 0. The proof that βi ≥ 0 for all i ∈ N is similar. The fact that δ ≥ 0 follows from (5–2) after noting that all terms in the left-hand-side are nonnegative.
The following proposition further studies facet-defining inequalities whose right-hand-sides are zero.

Proposition 5.5. Let

∑_{j=1}^n αjxj + ∑_{j=1}^n βjyj ≥ 0 (5–4)

be a facet-defining inequality for PB. Then, (5–4) is a scalar multiple of xj ≥ 0 for some j ∈ N or of yj ≥ 0 for some j ∈ N.

Proof. Assume for a contradiction that (5–4) is not a scalar multiple of xi ≥ 0 for i ∈ N or of yk ≥ 0 for k ∈ N. Select i ∈ N. Then, there exists (x^i, y^i) ∈ B such that x^i_i > 0 and for which

∑_{j=1}^n αjx^i_j + ∑_{j=1}^n βjy^i_j = 0. (5–5)

Because its right-hand-side is equal to 0, (5–4) is not a scalar multiple of xj ≤ 1 or of yj ≤ 1 for j ∈ N and, therefore, it follows from Proposition 5.4 that αj ≥ 0 and βj ≥ 0 for all j ∈ N. We obtain that

0 = ∑_{j=1}^n αjx^i_j + ∑_{j=1}^n βjy^i_j ≥ αix^i_i ≥ 0. (5–6)

We conclude that αi = 0 since x^i_i > 0. Similarly, we can establish that βk = 0 ∀k ∈ N. This is the desired contradiction to the fact that (5–4) is facet-defining for PB.
Now, we characterize some simple facets of PB that play an important role in
Propositions 5.4 and 5.5.
Proposition 5.6. The upper bound inequalities xi ≤ 1 and yi ≤ 1 are facet-defining for PB for all i ∈ N. Further, for i ∈ N, the lower bound inequalities xi ≥ 0 and yi ≥ 0 are facet-defining for PB if and only if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d, where l(i) ∈ argmax{aj | j ∈ N \ {i}}.

Proof. The validity of all these inequalities is trivial since they belong to the description of B. To prove that xi ≤ 1 is facet-defining, we exhibit 2n affinely independent points in B satisfying xi = 1. First, we construct the n points p^k = (1, 1 − ek) for k ∈ N. Next, we build the n − 1 points q^k = (1 − ek, 1) for k ∈ N \ {i}. Finally, we select r = (1, 1). These points are affinely independent since r − p^k and r − q^k are linearly independent. Therefore, xi ≤ 1 is facet-defining for PB. The proof that yi ≤ 1 is facet-defining for PB is similar.

Now, we show that xi ≥ 0 is facet-defining if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d by constructing 2n affinely independent points in B satisfying xi = 0. For k ∈ N \ {i}, we construct the 2(n − 1) points p^k = (1 − ei − ek, 1 − ei − ek) and q^k = (1 − ei − ek, 1 − ei). Finally, we add the two points r^1 = (1 − ei, 1 − ei) and r^2 = (1 − ei, 1). Clearly, for any k ∈ N \ {i}, the points p^k, q^k, r^1, and r^2 are feasible since ∑_{j=1}^n aj − ai − ak ≥ ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d for k ∈ N \ {i}. These points are affinely independent since q^k − p^k, r^1 − q^k, and r^2 − r^1 are linearly independent. To prove the reverse direction, assume now that xi ≥ 0 is facet-defining for PB. We claim that ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d. Assume for a contradiction that ∑_{j=1}^n aj − ai − a_{l(i)} < d. It follows that xi + x_{l(i)} ≥ 1 is valid for PB. Since −x_{l(i)} ≥ −1 is also valid, we obtain that xi ≥ 0 is a nonnegative combination of two other valid inequalities, showing that it is not facet-defining. This is a contradiction to the fact that xi ≥ 0 is facet-defining. Similarly, it can be proven that yi ≥ 0 is facet-defining for PB if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d.
Observe that the above proofs remain valid when yj ∈ {0, 1} instead of yj ∈ [0, 1] for j in some subset J ⊆ N. We next study another simple facet-defining inequality for PB.
Proposition 5.7. The inequality ∑_{j=1}^n ajyj ≥ d is facet-defining for PB.

Proof. Validity is easily verified since ∑_{j=1}^n ajyj ≥ ∑_{j=1}^n ajxjyj ≥ d. To prove that ∑_{j=1}^n ajyj ≥ d is facet-defining, we present 2n points (x^i, y^i) in B that satisfy ∑_{j=1}^n ajyj ≥ d at equality and such that the system αx^i + βy^i = δ for i = 1, . . . , 2n only has solutions (α, β, δ) that are scalar multiples of (0, a, d). Consider the 2n points p^k = (1, Δk(1 − ek)) and q^k = (1 − ek, Δk(1 − ek)), where Δk = d/(∑_{j=1}^n aj − ak) for k ∈ N. Note that, because of Assumption 5.2, 0 < Δk ≤ 1 for all k ∈ N. Clearly, p^k and q^k belong to B and satisfy ∑_{j=1}^n ajyj ≥ d at equality. These 2n points yield the system:

∑_{j=1}^n αj + Δk(∑_{j=1}^n βj − βk) = δ  ∀k ∈ N, (5–7)
∑_{j=1}^n αj − αk + Δk(∑_{j=1}^n βj − βk) = δ  ∀k ∈ N. (5–8)

By subtracting (5–7) from (5–8), we obtain that αk = 0 for k ∈ N. From (5–7), we then conclude that, for all k, l ∈ N,

∑_{j=1}^n βj − βk = (δ/d)(∑_{j=1}^n aj − ak)  and  ∑_{j=1}^n βj − βl = (δ/d)(∑_{j=1}^n aj − al).

This implies that βk − (δ/d)ak = βl − (δ/d)al. After letting βk − (δ/d)ak = θ and plugging these values into (5–7), we obtain that θ = 0, which implies that βk = (δ/d)ak for k ∈ N. Therefore, we conclude that all solutions (α, β, δ) to (5–7) and (5–8) are scalar multiples of (0, a, d).
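The tight points p^k used in this proof are easy to reproduce numerically. The sketch below, for a small hypothetical instance (a = [19, 17, 15, 10], d = 20, which satisfies Assumption 5.2), computes Δk and checks that each p^k = (1, Δk(1 − ek)) lies in B and satisfies ∑ ajyj ≥ d at equality.

```python
a, d = [19, 17, 15, 10], 20   # hypothetical data satisfying Assumption 5.2
n = len(a)

for k in range(n):
    Dk = d / (sum(a) - a[k])
    assert 0 < Dk <= 1                      # guaranteed by Assumption 5.2
    # p^k = (1, Delta_k (1 - e_k)): x = 1, y_j = Delta_k for j != k, y_k = 0
    y = [Dk if j != k else 0.0 for j in range(n)]
    lhs = sum(a[j] * 1 * y[j] for j in range(n))   # sum a_j x_j y_j
    assert lhs >= d - 1e-9                  # p^k lies in B
    assert abs(sum(a[j] * y[j] for j in range(n)) - d) < 1e-9   # tightness
```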
In the remainder of the text, we will often use the term facet to refer to a facet-defining
inequality. We will also refer to inequalities xi ≤ 1, yi ≤ 1, and∑n
j=1 ajyj ≥ d as trivial
facets of PB. To illustrate the richness of the polyhedral structure of PB, we present a
simple example next. The linear inequalities describing the convex hull of this set were
obtained using PORTA; see Christof and Lobel [32].
Example 5.1. Consider the 0−1 mixed-integer bilinear covering set
B = {(x, y) ∈ {0, 1}⁴ × [0, 1]⁴ | 19x1y1 + 17x2y2 + 15x3y3 + 10x4y4 ≥ 20}.
The linear description of PB has 58 inequalities that are presented in the Appendix. A
subset of the inequalities are:
50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135 (5–9)
70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117 (5–10)
19x1 +17x2 +15y3 +10y4 ≥ 20 (5–11)
17x2 +15x3 +19y1 +10y4 ≥ 20 (5–12)
19y1 +17y2 +15y3 +10y4 ≥ 20 (5–13)
14x1 +10x3 +5x4 +17y2 ≥ 15 (5–14)
12x2 +10x3 +5x4 +19y1 ≥ 15 (5–15)
10x3 +5x4 +19y1 +17y2 ≥ 15 (5–16)
x1 +x2 +x3 +10y4 ≥ 2 (5–17)
x1 +x2 +x3 +x4 ≥ 2 (5–18)
x1 ≥ 0 (5–19)
y1 ≥ 0 (5–20)
x1 ≤ 1 (5–21)
y1 ≤ 1 (5–22)
Among the inequalities in Example 5.1, we observe the upper bound inequalities
(5–21) and (5–22) that we know are facet-defining for PB because of Proposition 5.6. In
this example, the lower bound inequalities (5–19) and (5–20) are also facet-defining, as can
be established from Proposition 5.6. Finally, (5–13) is the trivial facet-defining inequality,
studied in Proposition 5.7. Our goal is now to discover families of valid inequalities for PB
that would explain (5–9)-(5–12) and (5–14)-(5–18).
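The validity of a few of the listed inequalities can be spot-checked by sampling, as in the sketch below. This is only a statistical sanity check of validity over B; it says nothing about facetness, which was established by PORTA.

```python
import random

random.seed(1)
a, d = [19, 17, 15, 10], 20

# (x-coefficients, y-coefficients, right-hand side) for (5-11), (5-14), (5-17)
cuts = [
    ([19, 17, 0, 0], [0, 0, 15, 10], 20),
    ([14, 0, 10, 5], [0, 17, 0, 0], 15),
    ([1, 1, 1, 0],   [0, 0, 0, 10], 2),
]

for _ in range(20000):
    x = [random.randint(0, 1) for _ in range(4)]
    y = [random.random() for _ in range(4)]
    if sum(a[j]*x[j]*y[j] for j in range(4)) >= d:   # (x, y) lies in B
        for ax, ay, rhs in cuts:
            lhs = sum(ax[j]*x[j] + ay[j]*y[j] for j in range(4))
            assert lhs >= rhs - 1e-9
```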
To derive these nontrivial facet-defining inequalities, we first study the convex hull
of B when n = 2 with the goal of identifying seed inequalities for the subsequent lifting procedures. We next show in Proposition 5.8 that PB has three nontrivial facets when
n = 2. We assume in this study that a1 ≥ d and a2 ≥ d since otherwise PB is not
full-dimensional and its polyhedral structure is trivial.
Proposition 5.8. Let

B2 = {(x, y) ∈ {0, 1}² × [0, 1]² | a1x1y1 + a2x2y2 ≥ d},

where a1 ≥ d, a2 ≥ d, and d > 0. Then,

conv(B2) = X := {(x, y) ∈ [0, 1]² × [0, 1]² | x1 + x2 ≥ 1, dx1 + a2y2 ≥ d, a1y1 + dx2 ≥ d, a1y1 + a2y2 ≥ d}.
Proof. We prove the result using disjunctive programming techniques; see [17]. We define

X10 := B2 ∩ {x1 = 1, x2 = 0} = {(1, y1, 0, y2) | d/a1 ≤ y1 ≤ 1, 0 ≤ y2 ≤ 1},
X01 := B2 ∩ {x1 = 0, x2 = 1} = {(0, y1, 1, y2) | 0 ≤ y1 ≤ 1, d/a2 ≤ y2 ≤ 1},
X11 := B2 ∩ {x1 = 1, x2 = 1} = {(1, y1, 1, y2) | a1y1 + a2y2 ≥ d, 0 ≤ y1 ≤ 1, 0 ≤ y2 ≤ 1}.

It is easily verified that conv(B2) = conv(X10 ∪ X01 ∪ X11) = conv(X2 ∪ X11), where X2 := conv(X10 ∪ X01). We first use disjunctive programming techniques to obtain a linear description of X2 and then compute conv(B2) as conv(X2 ∪ X11). Using Theorem 2.1 in Balas [17], we write

X2 = proj_{(x,y)} { (x1, y1, x2, y2, z1, z2, z̄1, z̄2, λ) | (x1, y1, x2, y2) = (λ, z1 + z̄1, 1 − λ, z2 + z̄2), (d/a1)λ ≤ z1 ≤ λ, 0 ≤ z2 ≤ λ, 0 ≤ z̄1 ≤ 1 − λ, (d/a2)(1 − λ) ≤ z̄2 ≤ 1 − λ, 0 ≤ λ ≤ 1 }.

We then use Fourier–Motzkin elimination [141] to compute the projection. We first eliminate the variables λ, z̄1, and z̄2 using the equations λ = x1, z̄1 = y1 − z1, and z̄2 = y2 − z2. We then project the variables z1 and z2 from the system

x1 + x2 = 1, x1 ≥ 0,
(d/a1)x1 ≤ z1 ≤ x1,
x1 + y1 − 1 ≤ z1 ≤ y1,
0 ≤ z2 ≤ 1 − x2,
y2 − x2 ≤ z2 ≤ y2 − (d/a2)x2,

to obtain

X2 = conv(X10 ∪ X01) = { (x1, y1, x2, y2) | x1 + x2 = 1, x1 ≥ 0, x2 ≥ 0, (d/a1)x1 ≤ y1 ≤ 1, (d/a2)x2 ≤ y2 ≤ 1 }.

Now, compute conv(X2 ∪ X11) as

proj_{(x,y)} { (x1, y1, x2, y2, u1, u2, v1, v2, v̄1, v̄2, λ) | (x1, y1, x2, y2) = (u1 + (1 − λ), v1 + v̄1, u2 + (1 − λ), v2 + v̄2), u1 + u2 = λ, u1 ≥ 0, u2 ≥ 0, (d/a1)u1 ≤ v1 ≤ λ, (d/a2)u2 ≤ v2 ≤ λ, 0 ≤ v̄1 ≤ 1 − λ, 0 ≤ v̄2 ≤ 1 − λ, a1v̄1 + a2v̄2 ≥ d(1 − λ), 0 ≤ λ ≤ 1 }.

We again obtain the projection using Fourier–Motzkin elimination. Using the equations x1 = u1 + 1 − λ, x2 = u2 + 1 − λ, and u1 + u2 = λ, we obtain that λ = 2 − (x1 + x2), u1 = 1 − x2, and u2 = 1 − x1. Using these relations together with v1 = y1 − v̄1 and v2 = y2 − v̄2 to eliminate the corresponding variables, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2 ≤* 2,

and

x1 + x2 − 2 ≤* v̄1 ≤ y1 − (d/a1)(1 − x2),
0 ≤ v̄1 ≤ x1 + x2 − 1,
−(a2/a1)v̄2 + (d/a1)(x1 + x2 − 1) ≤ v̄1,
x1 + x2 − 2 ≤* v̄2 ≤ y2 − (d/a2)(1 − x1),
0 ≤ v̄2 ≤ x1 + x2 − 1,

where the inequalities marked ≤* can be verified to be redundant. Projecting v̄1, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2, (d/a1)(1 − x2) ≤ y1, x1 + x2 ≥* 1,

and

(d/a2)x1 − (a1/a2)y1 ≤ v̄2,
((d − a1)/a2)(x1 + x2 − 1) ≤* v̄2,
v̄2 ≤ y2 − (d/a2)(1 − x1),
0 ≤ v̄2 ≤ x1 + x2 − 1,

where, again, the inequalities marked ≥* and ≤* are clearly redundant since x1 + x2 ≥ 1 and a1 ≥ d. Projecting v̄2, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2,
a1y1 + dx2 ≥ d,
dx1 + a2y2 ≥ d,
a1y1 + a2y2 ≥ d,
x1 + x2 ≥* 1,
(a2 − d)x1 + a2x2 + a1y1 ≥ a2. (R)

Clearly, the inequality marked ≥* is repeated. (R) is also redundant since

(a2 − d)x1 + a2x2 + a1y1 − a2 = a2(x1 + x2 − 1) − dx1 + a1y1
≥ a2(x1 + x2 − 1) − dx1 + d(1 − x2)
= (a2 − d)(x1 + x2 − 1) ≥ 0,

where the first inequality holds since a1y1 + dx2 ≥ d. Therefore, conv(X2 ∪ X11) is described by the four inequalities of X, concluding the proof.
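The easy direction of this description, B2 ⊆ X, can be checked numerically, as can the fact that X, being convex, contains all convex combinations of points of B2. The sketch below uses hypothetical data a1, a2, and d with a1 ≥ d and a2 ≥ d.

```python
import random

a1, a2, d = 3.0, 2.5, 2.0   # hypothetical data with a1 >= d and a2 >= d

def in_B2(x1, y1, x2, y2):
    return a1*x1*y1 + a2*x2*y2 >= d - 1e-12

def in_X(x1, y1, x2, y2):
    """The four nontrivial inequalities of Proposition 5.8."""
    return (x1 + x2 >= 1 - 1e-9 and
            d*x1 + a2*y2 >= d - 1e-9 and
            a1*y1 + d*x2 >= d - 1e-9 and
            a1*y1 + a2*y2 >= d - 1e-9)

random.seed(2)
# B2 is contained in X: every feasible point satisfies the four inequalities
for _ in range(20000):
    x1, x2 = random.randint(0, 1), random.randint(0, 1)
    y1, y2 = random.random(), random.random()
    if in_B2(x1, y1, x2, y2):
        assert in_X(x1, y1, x2, y2)

# convex combinations of points of B2 stay in X, since X is convex
pts = [(1, d/a1, 0, 0.0), (0, 0.0, 1, d/a2), (1, 1.0, 1, 1.0)]
lam = [0.2, 0.5, 0.3]
comb = [sum(l*p[i] for l, p in zip(lam, pts)) for i in range(4)]
assert in_X(*comb)
```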
Next, we give generalizations of the nontrivial facets of conv(B2) that we prove are
facet-defining for more general instances of conv(B). In particular, we generalize inequality
dx1 + a2y2 ≥ d in Proposition 5.9 and inequality x1 + x2 ≥ 1 in Proposition 5.11. We will
use these generalizations as seed inequalities for lifting procedures in Section 5.3.
Proposition 5.9. Let L ⊆ {j ∈ N | aj > d}. Assume that ∑_{j∈N\L} aj > d. Then,

∑_{j∈L} dxj + ∑_{j∈N\L} ajyj ≥ d (5–23)

is facet-defining for PB.
Proof. If L = ∅, the result follows from Proposition 5.7. Hence, we assume that L 6= ∅.First, we show that (5–23) is valid for B. Assume for a contradiction that there exists
(x′, y′) ∈ B such that∑
j∈L dx′j +
∑j∈N\L ajy
′j < d. Then, x′
j = 0 for all j ∈ L. It follows
that d >∑
j∈N\L ajy′j ≥
∑j∈N\L ajx
′jy
′j =
∑j∈N ajx
′jy
′j, which is a contradiction to the fact
that (x′, y′) ∈ B.
Next, we prove that (5–23) is facet-defining for PB by providing 2n points $(x^i, y^i)$ in B that satisfy (5–23) at equality such that all solutions (α, β, δ) to $\alpha x^i + \beta y^i = \delta$ for i = 1, …, 2n yield inequalities $\alpha x + \beta y \ge \delta$ that are scalar multiples of (5–23). Consider the 2|L| points $p_l = (e_l, e_l)$ and $\bar p_l = (e_l, (1-\varepsilon)e_l)$ for l ∈ L, where ε > 0 is sufficiently small. It is clear that $p_l$ and $\bar p_l$ belong to B for all l ∈ L, and that they satisfy (5–23) at equality. From $p_l$ and $\bar p_l$, we obtain that $\alpha_l + \beta_l = \delta$ and $\alpha_l + (1-\varepsilon)\beta_l = \delta$, which implies that $\alpha_l = \delta$ and $\beta_l = 0$ for all l ∈ L. Next, we select an arbitrary element l ∈ L ≠ ∅. For $k \in N\setminus L = \{k_1, \ldots, k_{n-|L|}\}$, construct the n − |L| points $q_k = (e_l + e_k, e_l)$. Finally, define $\bar d = d/\sum_{j\in N\setminus L} a_j$ and construct the n − |L| points
$$\bar q_{k_1} = \Big(\sum_{j\in N\setminus L} e_j,\ \bar d \sum_{j\in N\setminus L} e_j\Big) \quad\text{and}\quad \bar q_{k_i} = \Big(\sum_{j\in N\setminus L} e_j,\ \bar d \sum_{j\in N\setminus L} e_j + \varepsilon\Big(\tfrac{1}{a_{k_{i-1}}} e_{k_{i-1}} - \tfrac{1}{a_{k_i}} e_{k_i}\Big)\Big)$$
for i = 2, …, n − |L|, where ε is sufficiently small. It can be verified that $q_k$ and $\bar q_k$ belong to B for k ∈ N \ L since $0 < \bar d < 1$ and ε is small. Further, these points satisfy (5–23) at equality. From $q_k$ for k ∈ N \ L, we obtain that $\alpha_l + \alpha_k + \beta_l = \delta$, which implies that $\alpha_k = 0$ for all k ∈ N \ L since $\alpha_l = \delta$ and $\beta_l = 0$. Further, using the points $\bar q_k$ for k ∈ N \ L, we obtain the system of equations:
$$\sum_{k\in N\setminus L} \alpha_k + \bar d \sum_{k\in N\setminus L} \beta_k = \delta, \tag{5–24}$$
$$\sum_{k\in N\setminus L} \alpha_k + \bar d \sum_{k\in N\setminus L} \beta_k + \varepsilon\Big(\frac{\beta_{k_{i-1}}}{a_{k_{i-1}}} - \frac{\beta_{k_i}}{a_{k_i}}\Big) = \delta, \quad i = 2, \ldots, n-|L|, \tag{5–25}$$
which implies that there exists θ such that $\frac{\beta_{k_{i-1}}}{a_{k_{i-1}}} = \frac{\beta_{k_i}}{a_{k_i}} = \theta$ for i = 2, …, n − |L|. Plugging these expressions into (5–24), we obtain that $\beta_k = \frac{\delta}{d}\, a_k$ for all k ∈ N \ L. Therefore, we conclude that $\alpha_l = \frac{\delta}{d}\, d$ for l ∈ L and $\beta_k = \frac{\delta}{d}\, a_k$ for k ∈ N \ L, which proves that (5–23) is facet-defining.
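The validity argument above can also be checked numerically on a toy instance. The sketch below is hypothetical (the data a, d, and L are invented, not from the chapter): for a fixed 0−1 vector x, each unit of $y_j$ with j ∈ N \ L costs $a_j$ in the left-hand side of a (5–23)-type inequality and contributes $a_j$ to the bilinear covering constraint, so the inner minimization over y is solvable in closed form.

```python
from itertools import product

# Hypothetical instance (invented data): inequality of the form (5-23),
#   sum_{j in L} d*x_j + sum_{j in N\L} a_j*y_j >= d.
a = {1: 6, 2: 5, 3: 4, 4: 3}
d = 7.0
L = {1}
N = set(a)

def min_lhs(x):
    """Minimum of the (5-23)-type left-hand side over feasible y for a fixed
    0/1 vector x; returns None when no feasible y exists."""
    # y_j for j in L is free in the LHS, so set y_j = x_j to help cover d.
    resid = d - sum(a[j] for j in L if x[j] == 1)
    cap = sum(a[j] for j in N - L if x[j] == 1)
    if cap < resid:
        return None                      # no feasible y for this x
    # Each unit of y_j (j in N\L with x_j = 1) costs a_j and covers a_j,
    # so covering the residual demand costs exactly max(resid, 0).
    return d * sum(x[j] for j in L) + max(resid, 0.0)

violated = [x for x in (dict(zip(sorted(N), bits))
                        for bits in product((0, 1), repeat=len(N)))
            if (m := min_lhs(x)) is not None and m < d - 1e-9]
print("violated x-patterns:", violated)   # expect none: the inequality is valid
```

On this instance the enumeration confirms validity: whenever some x_j with j ∈ L is set to 1, the term d x_j alone covers the right-hand side, and otherwise the residual demand forces at least d units of cost.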
Note that in Example 5.1, the inequality (5–13) can be obtained using Proposition 5.9 with L = ∅. In the remainder of the chapter, we use the following notation extensively. For $N_0, N_1 \subseteq N$ such that $N_0 \cap N_1 = \emptyset$ and $\bar N_0, \bar N_1 \subseteq N$ such that $\bar N_0 \cap \bar N_1 = \emptyset$, we let
$$B(N_0, N_1, \bar N_0, \bar N_1) := \left\{ (x, y) \in B \;\middle|\; \begin{array}{l} x_j = 0 \text{ for } j \in N_0,\ x_j = 1 \text{ for } j \in N_1, \\ y_j = 0 \text{ for } j \in \bar N_0,\ y_j = 1 \text{ for } j \in \bar N_1 \end{array} \right\}.$$
We also define $P_B(N_0, N_1, \bar N_0, \bar N_1) := \mathrm{conv}(B(N_0, N_1, \bar N_0, \bar N_1))$. In particular, B(∅, ∅, ∅, N) is the 0−1 knapsack set
$$B(\emptyset, \emptyset, \emptyset, N) = \Big\{ x \in \{0,1\}^n \;\Big|\; \sum_{j=1}^n a_j x_j \ge d \Big\},$$
whose polyhedral structure was first studied by Balas [14], Wolsey [130], and Hammer et al. [63]. We next show some relations between the bilinear set B and the 0−1 knapsack set B(∅, ∅, ∅, N).
Proposition 5.10. Let
$$\sum_{j\in N} \alpha_j x_j + \sum_{j\in I} \beta_j y_j \ge \delta \tag{5–26}$$
be an inequality that is not a bound. Then, (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I) if and only if (5–26) is facet-defining for PB.
Proof. We first prove that if (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I), then (5–26) is facet-defining for PB. To show that (5–26) is valid for B, we assume for a contradiction that there exists a point (x′, y′) ∈ B with $\sum_{j\in N} \alpha_j x'_j + \sum_{j\in I} \beta_j y'_j < \delta$. Since (x′, y′) ∈ B, we have that $\sum_{j\in N} a_j x'_j y'_j \ge d$. Next, we define $(\bar x, \bar y)$ as $\bar x = x'$, $\bar y_j = y'_j$ for j ∈ I, and $\bar y_j = 1$ for j ∈ N \ I. Observe that $(\bar x, \bar y) \in B(\emptyset, \emptyset, \emptyset, N \setminus I)$ as $\sum_{j\in I} a_j \bar x_j \bar y_j + \sum_{j\in N\setminus I} a_j \bar x_j \ge \sum_{j\in N} a_j x'_j y'_j \ge d$. Since (5–26) is valid for B(∅, ∅, ∅, N \ I), $(\bar x, \bar y)$ satisfies $\sum_{j\in N} \alpha_j x'_j + \sum_{j\in I} \beta_j y'_j = \sum_{j\in N} \alpha_j \bar x_j + \sum_{j\in I} \beta_j \bar y_j \ge \delta$. This is the desired contradiction.

Next, we show that (5–26) is facet-defining for PB. Since (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I) and δ ≠ 0 as (5–26) is not a bound, there exist n + |I| linearly independent points in PB(∅, ∅, ∅, N \ I) that satisfy (5–26) at equality. Let $(x^k, y^k)$ be these points. Clearly, $(x^k, y^k)$ for k = 1, …, n + |I| belong to B and satisfy (5–26) at equality. Now, for each j ∈ N \ I, we construct one new point in B \ B(∅, ∅, ∅, N \ I) that satisfies (5–26) at equality. Since (5–26) is not a bound, there exists $k_j$ for all j ∈ N \ I such that $x^{k_j}_j = 0$, but $x^k_j = 1$ for some $k \ne k_j$. For each j ∈ N \ I, pick $(x^{k_j}, y^{k_j})$ and define a new point $(\bar x^{k_j}, \bar y^{k_j})$ such that $\bar x^{k_j}_i = x^{k_j}_i$ for all i ∈ N, $\bar y^{k_j}_i = y^{k_j}_i$ for all i ∈ N \ {j}, and $\bar y^{k_j}_j = 0$. Clearly, $(\bar x^{k_j}, \bar y^{k_j})$ belongs to B and satisfies (5–26) at equality. Further, it is easily seen that, together with $(x^k, y^k)$, all $(\bar x^{k_j}, \bar y^{k_j})$ are linearly independent and therefore show that (5–26) is facet-defining for PB.
To prove the reverse implication, we assume that (5–26) is a nontrivial facet-defining inequality for PB. Validity is trivial since B(∅, ∅, ∅, N \ I) ⊆ B. Now, we show that (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I). Since δ ≠ 0 as (5–26) is not a bound, the set of 2n affinely independent points $(x^k, y^k)$ for k = 1, …, 2n in B that satisfy (5–26) at equality are also linearly independent. Therefore,
$$\begin{vmatrix} x^1_1 & \cdots & x^1_n & y^1_1 & \cdots & y^1_n \\ x^2_1 & \cdots & x^2_n & y^2_1 & \cdots & y^2_n \\ \vdots & & \vdots & \vdots & & \vdots \\ x^{2n}_1 & \cdots & x^{2n}_n & y^{2n}_1 & \cdots & y^{2n}_n \end{vmatrix} \ne 0.$$
It can be verified that there exist n + |I| rows $i_1, \ldots, i_{n+|I|}$, where $I = \{j_1, \ldots, j_{|I|}\}$, such that
$$\begin{vmatrix} x^{i_1}_1 & \cdots & x^{i_1}_n & y^{i_1}_{j_1} & \cdots & y^{i_1}_{j_{|I|}} \\ \vdots & & \vdots & \vdots & & \vdots \\ x^{i_{n+|I|}}_1 & \cdots & x^{i_{n+|I|}}_n & y^{i_{n+|I|}}_{j_1} & \cdots & y^{i_{n+|I|}}_{j_{|I|}} \end{vmatrix} \ne 0.$$
Hence, the n + |I| points $(x^{i_k}_1, \ldots, x^{i_k}_n;\ y^{i_k}_{j_1}, \ldots, y^{i_k}_{j_{|I|}})$ for k = 1, …, n + |I| are linearly independent. Now, define the points $(\bar x^{i_k}, \bar y^{i_k})$ for k = 1, …, n + |I| such that $\bar x^{i_k} = x^{i_k}$, $\bar y^{i_k}_j = y^{i_k}_j$ for j ∈ I, and $\bar y^{i_k}_j = 1$ for j ∈ N \ I. The points $(\bar x^{i_k}, \bar y^{i_k})$ are feasible for B(∅, ∅, ∅, N \ I) and satisfy (5–26) at equality. Therefore, we conclude that (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I).
Observe that Proposition 5.10 implies that all nontrivial facets of the 0−1 knapsack
polytope can be found in B and that it is sufficient to study the facets of B to know
the facets of the 0−1 knapsack polytope. Next, we use Proposition 5.10 to generalize
inequality x1 + x2 ≥ 1 in Proposition 5.8 into an inequality that we will use as a seed for
lifting procedures in Section 5.3.3.
Proposition 5.11. Assume that $\sum_{j\in N} a_j - a_k - a_m < d$ for all k, m ∈ N with k ≠ m. The cover inequality
$$\sum_{j\in N} x_j \ge |N| - 1 \tag{5–27}$$
is facet-defining for PB.
Proof. Because of Proposition 5.10, it is sufficient to prove that (5–27) is facet-defining for PB(∅, ∅, ∅, N). To prove validity, assume for a contradiction that there exists x′ ∈ B(∅, ∅, ∅, N) such that $\sum_{j\in N} a_j x'_j \ge d$ and $\sum_{j\in N} x'_j \le |N| - 2$. Since $\sum_{j\in N} x'_j \le |N| - 2$, there exist k, m ∈ N with k ≠ m such that $x'_k = 0$ and $x'_m = 0$. Therefore, $\sum_{j\in N} a_j - a_k - a_m \ge \sum_{j\in N} a_j x'_j \ge d$. This contradicts the assumption that $\sum_{j\in N} a_j - a_k - a_m < d$ for all k, m ∈ N with k ≠ m. We next show that (5–27) is facet-defining for PB(∅, ∅, ∅, N). It can be easily verified using Assumption 5.2 that the points $p^k = (\mathbf{1} - e_k, \mathbf{1})$ for k ∈ N belong to B(∅, ∅, ∅, N). Since these points are linearly independent and satisfy (5–27) at equality, we conclude that (5–27) is facet-defining for PB(∅, ∅, ∅, N).
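Proposition 5.11 is easy to sanity-check by enumeration. The sketch below uses invented data (a and d are hypothetical) chosen so that removing any two items drops the total weight below d, as the hypothesis requires:

```python
from itertools import combinations, product

a = [6, 5, 5, 4]     # hypothetical knapsack coefficients
d = 12
n = len(a)

# Hypothesis of Proposition 5.11: sum(a) - a_k - a_m < d for all k != m.
assert all(sum(a) - a[k] - a[m] < d for k, m in combinations(range(n), 2))

# Conclusion (validity part): every feasible point of the 0-1 knapsack set
# {x in {0,1}^n : sum_j a_j x_j >= d} satisfies the cover inequality.
for x in product((0, 1), repeat=n):
    if sum(aj * xj for aj, xj in zip(a, x)) >= d:
        assert sum(x) >= n - 1, f"cover inequality violated at {x}"
print("cover inequality sum_j x_j >= n - 1 is valid on this instance")
```

Intuitively, the hypothesis forces every feasible solution to select all items except possibly one, which is exactly what the cover inequality expresses.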
5.3 Lifted Inequalities
In this section, we derive three families of strong valid inequalities for PB via lifting.
The first two families are obtained using sequence-independent lifting from (5–23) and are
facet-defining for PB. In this case, lifting is simple since the lifting function is subadditive.
The third inequality is obtained by lifting (5–27). Although the lifting function associated
with this seed inequality is not subadditive, we obtain strong lifted inequalities using
approximate lifting. We also identify conditions under which these lifted inequalities are
facet-defining for PB.
5.3.1 Sequence-Independent Lifting for Bilinear Covering Sets
Sequence-independent lifting is a well-known technique to construct strong valid
inequalities for mixed-integer linear programs; see Wolsey [132] and Gu et al. [62]. We
next give a brief description of how the technique can be used to derive strong valid
inequalities for PB. A more general treatment of lifting in nonlinear programming is given
in Richard and Tawarmalani [100].
Given $\emptyset \ne S \subsetneq N$, consider B(S, ∅, S, ∅), which is the restriction of B obtained when all variables (x_j, y_j) for j ∈ S are fixed to (0, 0). Let S = {s, …, n} for some s ≥ 2 and define $S_i = \{i + 1, \ldots, n\}$ for i ∈ S. Assume that the inequality
$$\sum_{j=1}^{s-1} \alpha_j x_j + \sum_{j=1}^{s-1} \beta_j y_j \ge \delta \tag{5–28}$$
is facet-defining for PB(S, ∅, S, ∅). In sequential lifting, we reintroduce the variables (x_j, y_j) for j ∈ S one at a time in (5–28). Assuming that variables (x_j, y_j) have already been lifted in the order j = s, …, i − 1, we next review how to lift variables (x_i, y_i) in the inequality
$$\sum_{j=1}^{i-1} \alpha_j x_j + \sum_{j=1}^{i-1} \beta_j y_j \ge \delta, \tag{5–29}$$
which is assumed to be facet-defining for $P_B(S_{i-1}, \emptyset, S_{i-1}, \emptyset)$. To perform this lifting, we first compute the lifting function
$$\begin{aligned} P^i(w) = \max\ & \delta - \Big\{\sum_{j=1}^{i-1} \alpha_j x_j + \sum_{j=1}^{i-1} \beta_j y_j\Big\} \\ \text{s.t.}\ & \sum_{j=1}^{i-1} a_j x_j y_j \ge d - w, \\ & x_j \in \{0,1\},\ y_j \in [0,1], \quad j = 1, \ldots, i-1. \end{aligned}$$
Once the lifting function $P^i(w)$ is computed, the lifting coefficients (α_i, β_i) can then be obtained from $P^i(w)$ as follows.
Proposition 5.12 (Richard and Tawarmalani [100]). Let (5–29) be a valid inequality for the set $B(S_{i-1}, \emptyset, S_{i-1}, \emptyset)$. Assume that there exist (α_i, β_i) such that
$$\alpha_i x_i + \beta_i y_i \ge P^i(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–30}$$
Then, the inequality
$$\sum_{j=1}^{i} \alpha_j x_j + \sum_{j=1}^{i} \beta_j y_j \ge \delta \tag{5–31}$$
is valid for $B(S_i, \emptyset, S_i, \emptyset)$.

The result of Proposition 5.12 can be applied recursively to construct a valid
inequality for PB from (5–28). Note that, at each step, the lifting function $P^i(w)$ must be recomputed to account for the changes in the lifted inequality. Further, if B(S, ∅, S, ∅) is full-dimensional, the seed inequality (5–28) is facet-defining for B(S, ∅, S, ∅), and, for each i ∈ S, the lifting coefficients (α_i, β_i) of the variables (x_i, y_i) are chosen so that (5–30) is satisfied at equality by two points (x_i, y_i), then the lifted inequality will be facet-defining for PB. Computing the lifting functions $P^i(w)$ for each i ∈ S might be computationally undesirable. However, such computation is unnecessary when the lifting function $P^s(w)$ is subadditive, as described in Proposition 5.13. This observation, first made by Wolsey [132], leads to the following result.
Proposition 5.13 (Richard and Tawarmalani [100]). Assume that (5–28) is valid for B(S, ∅, S, ∅). Assume also that (i) $P^s(w)$ is subadditive, i.e., $P^s(w_1) + P^s(w_2) \ge P^s(w_1 + w_2)$ for all $w_1, w_2 \in \mathbb{R}_+$, and (ii) there exist (α_i, β_i) for all i ∈ S such that
$$\alpha_i x_i + \beta_i y_i \ge P^s(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–32}$$
Then, the inequality
$$\sum_{j=1}^{n} \alpha_j x_j + \sum_{j=1}^{n} \beta_j y_j \ge \delta \tag{5–33}$$
is valid for PB. Further, if (5–28) is facet-defining for B(S, ∅, S, ∅) and (α_i, β_i) are chosen in a way that two points satisfy (5–32) at equality, then (5–33) is facet-defining for PB.
The main difference between Proposition 5.12 and Proposition 5.13 is that, in the latter, the lifting coefficients of all variables (x_i, y_i) can be obtained from the same lifting function $P^s(w)$ and not from $P^i(w)$ for i ∈ S. Note that in Proposition 5.13, it is sufficient to require the subadditivity of $P^s(w)$ over $w \in \mathbb{R}_+$ since all coefficients $a_i$ in PB are assumed to be nonnegative.
Proposition 5.12 and Proposition 5.13 consider the case where all variables (x_j, y_j) for j ∈ S are fixed at (0, 0). When variables (x_j, y_j) are fixed at (1, 1), similar results can be obtained. In this case, condition (5–30) must be changed to
$$\alpha_i(1 - x_i) + \beta_i(1 - y_i) \le -P^i(a_i x_i y_i - a_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(1,1)\}. \tag{5–34}$$
Further, we can perform sequence-independent lifting for variables (x_j, y_j) fixed at (1, 1) if the lifting function $P^i(w)$ is subadditive over $w \in \mathbb{R}_-$.
5.3.2 Lifted Inequalities by Sequence-Independent Lifting
To derive a strong inequality through lifting, we first must obtain a seed inequality. In
this section, we will use (5–23) as the seed inequality. To identify this form of inequality,
we introduce the notion of a cover, which is adapted from the definition of a cover for the
0−1 knapsack polytope; see Balas [14], Wolsey [130], and Hammer et al. [63].
Definition 5.1. Let C ⊆ N . We say that C is a cover for B if∑
j∈C aj > d. Further, we
define the excess of the cover as µ =∑
j∈C aj − d > 0.
We will create lifted inequalities by first partitioning the set of variables N into (C, M, T) in such a way that:

(A1) C is a cover for B with excess µ,

(A2) $a_l > \mu$ where $l \in \mathrm{argmax}\{a_j \mid j \in C\}$,

(A3) $\sum_{j\in C\cup T} a_j > d + a_l$, i.e., $\sum_{j\in T} a_j > a_l - \mu$.

Note that (A1) and (A2) might be reminiscent of conditions that make a cover minimal for the 0−1 knapsack polytope. We note however that minimal covers require $a_j > \mu$ for all j ∈ C and not simply $a_l > \mu$. Note also that (A3) implies that T ≠ ∅. To obtain lifted inequalities from (C, M, T), we first fix the variables (x_j, y_j) for j ∈ M to (0, 0) and the variables (x_j, y_j) for j ∈ C \ {l} to (1, 1). The resulting set B(M, C \ {l}, M, C \ {l}) is then defined by the inequality
$$a_l x_l y_l + \sum_{j\in T} a_j x_j y_j \ge d - \sum_{j\in C\setminus\{l\}} a_j = a_l - \mu.$$
From Assumption (A3), we observe that $\sum_{j\in T} a_j > a_l + d - \sum_{j\in C} a_j = a_l - \mu$. Since $a_l > \mu > 0$, we conclude from Proposition 5.9 that
$$(a_l - \mu)x_l + \sum_{j\in T} a_j y_j \ge a_l - \mu \tag{5–35}$$
is facet-defining for PB(M, C \ {l}, M, C \ {l}). We will create two different families of lifted inequalities for PB by reintroducing the variables (x_j, y_j) for j ∈ M ∪ C \ {l} in different orders. To derive both facets, we first must compute the lifting function
$$\begin{aligned} P(w) := \max\ & (a_l - \mu) - \Big\{(a_l - \mu)x_l + \sum_{j\in T} a_j y_j\Big\} \\ \text{s.t.}\ & a_l x_l y_l + \sum_{j\in T} a_j x_j y_j \ge a_l - \mu - w \qquad (5\text{–}36) \\ & x_j \in \{0,1\},\ y_j \in [0,1] \quad \forall j \in \{l\} \cup T. \end{aligned}$$
Function P(w) can be expressed in closed form as follows.

Proposition 5.14.
$$P(w) = \begin{cases} -\infty & \text{if } w < -\sum_{j\in T} a_j - \mu, \\ w + \mu & \text{if } -\sum_{j\in T} a_j - \mu \le w < -\mu, \\ 0 & \text{if } -\mu \le w < 0, \\ w & \text{if } 0 \le w < a_l - \mu, \\ a_l - \mu & \text{if } a_l - \mu \le w. \end{cases}$$
Further, P(w) is subadditive over $\mathbb{R}_-$ and over $\mathbb{R}_+$.
Proof. We first compute P(w). Observe that there exists an optimal solution (x∗, y∗) to (5–36) for which $x^*_j = 1$ for all j ∈ T and $y^*_l = 1$, since the coefficients of $x_j$ for j ∈ T and of $y_l$ in the objective are equal to 0. Further, define $\bar a = \sum_{j\in T} a_j$ and $\bar y = \sum_{j\in T} a_j y_j / \bar a$. Using these definitions, we can simplify the computation of P(w) in (5–36) as:
$$\begin{aligned} P(w) = \max\ & (a_l - \mu) - \big\{(a_l - \mu)x_l + \bar a \bar y\big\} \\ \text{s.t.}\ & a_l x_l + \bar a \bar y \ge a_l - \mu - w \qquad (5\text{–}37) \\ & x_l \in \{0,1\},\ \bar y \in [0,1]. \end{aligned}$$
Now, we solve (5–37). When $w < -\bar a - \mu$, (5–37) is infeasible and so P(w) = −∞. When $w \ge a_l - \mu$, the optimal solution is $x^*_l = 0$ and $\bar y^* = 0$. When $-\bar a - \mu \le w < a_l - \mu$, it is simple to verify that optimal solutions to (5–37) are given by:
$$(x^*_l, \bar y^*) = \begin{cases} \big(1, \frac{-\mu - w}{\bar a}\big) & \text{if } -\bar a - \mu \le w < -\mu, \\ (1, 0) & \text{if } -\mu \le w < 0, \\ \big(0, \frac{a_l - \mu - w}{\bar a}\big) & \text{if } 0 \le w < a_l - \mu. \end{cases}$$
Using $(x^*_l, \bar y^*)$ in (5–37) and substituting back $\sum_{j\in T} a_j$ for $\bar a$, we obtain the desired expression for P(w).

Next, we prove that P(w) is subadditive over (−∞, 0] by showing that, for $w_1, w_2 \in \mathbb{R}_-$, $P(w_1) + P(w_2) \ge P(w_1 + w_2)$. We consider the following three cases:

1. Assume $-\bar a - \mu \le w_1 < -\mu$ and $-\bar a - \mu \le w_2 < -\mu$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = w_1 + w_2 + 2\mu > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = w_1 + w_2 + 2\mu \ge w_1 + w_2 + \mu = P(w_1 + w_2)$.

2. Assume $-\bar a - \mu \le w_1 < -\mu$ and $-\mu \le w_2 \le 0$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = w_1 + \mu + 0 > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = w_1 + \mu + 0 \ge w_1 + w_2 + \mu = P(w_1 + w_2)$.

3. Assume $-\mu \le w_1 \le 0$ and $-\mu \le w_2 \le 0$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = 0 + 0 > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = 0 + 0 \ge w_1 + w_2 + \mu = P(w_1 + w_2)$. If $-\mu \le w_1 + w_2 < 0$, then $P(w_1) + P(w_2) = 0 + 0 = 0 = P(w_1 + w_2)$.
Finally, we show that P (w) is subadditive over R+. We consider the following three cases:
1. Assume 0 ≤ w1 < al − µ and 0 ≤ w2 < al − µ. If w1 + w2 < al − µ, then
P (w1) + P (w2) = w1 + w2 = P (w1 + w2). If w1 + w2 ≥ al − µ, then P (w1) + P (w2) =
w1 + w2 ≥ al − µ = P (w1 + w2).
2. Assume 0 ≤ w1 < al − µ and w2 ≥ al − µ. Since w1 + w2 ≥ al − µ as w1 ≥ 0,
P (w1) + P (w2) = w1 + al − µ ≥ al − µ = P (w1 + w2).
3. Assume w1 ≥ al − µ and w2 ≥ al − µ. Since w1 + w2 ≥ al − µ, P (w1) + P (w2) =
al − µ+ al − µ > al − µ = P (w1 + w2).
It is interesting to note that P (w) is not subadditive over R as P (2al − µ) + P (−al) =
(al − µ) + (−al + µ) = 0 < al − µ = P (al − µ).
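Since Proposition 5.14 reduces the lifting problem to the two-variable program (5–37), its closed form can be cross-checked by solving (5–37) directly. The sketch below uses a_l = 15, µ = 5, and $\bar a$ = 17 (the values that arise from Example 5.2 later in this section); for each choice of $x_l$, the cheapest feasible $\bar y$ is taken:

```python
def P_closed(w, al, mu, abar):
    """Closed form of the lifting function P(w) from Proposition 5.14."""
    if w < -abar - mu:
        return float("-inf")
    if w < -mu:
        return w + mu
    if w < 0:
        return 0.0
    if w < al - mu:
        return w
    return al - mu

def P_brute(w, al, mu, abar):
    """Solve the simplified problem (5-37) directly: for each x_l in {0, 1},
    take the smallest ybar meeting the covering constraint, if any."""
    best = float("-inf")
    for xl in (0, 1):
        need = (al - mu - w - al * xl) / abar
        ybar = min(max(need, 0.0), 1.0)
        if al * xl + abar * ybar >= al - mu - w - 1e-9:   # feasible choice?
            best = max(best, (al - mu) - ((al - mu) * xl + abar * ybar))
    return best

# Data matching Example 5.2: a_l = 15, mu = 5, abar = sum_{j in T} a_j = 17.
al, mu, abar = 15.0, 5.0, 17.0
for k in range(-250, 200):
    w = k / 10.0
    pc, pb = P_closed(w, al, mu, abar), P_brute(w, al, mu, abar)
    assert pc == pb or abs(pc - pb) < 1e-6, w
print("closed form of P(w) agrees with direct solution of (5-37)")
```

The grid covers all five pieces of the closed form, including the infeasible region where both functions return −∞.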
5.3.2.1 Lifted bilinear cover inequalities
To obtain lifted bilinear cover inequalities, we will lift first the variables (xi, yi) for
i ∈ C \ {l} from (1, 1) and then lift the variables (xi, yi) for i ∈ M from (0, 0). Since P (w)
is subadditive over (−∞, 0], we can apply sequence-independent lifting for the variables
(xi, yi) for i ∈ C \ {l} using the result of Proposition 5.13.
Proposition 5.15. Under Assumptions (A1), (A2), and (A3),
$$\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j \ge \sum_{j\in C} (a_j - \mu)^+ \tag{5–38}$$
is facet-defining for PB(M, ∅, M, ∅).
Proof. The seed inequality (5–35) is facet-defining for PB(M, C \ {l}, M, C \ {l}). Since P(w) is subadditive over (−∞, 0], it follows from Proposition 5.13 that the lifting coefficients (α_i, β_i) for (x_i, y_i), i ∈ C \ {l}, are valid if they satisfy
$$\alpha_i(x_i - 1) + \beta_i(y_i - 1) \ge P(a_i x_i y_i - a_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(1,1)\}. \tag{5–39}$$
This condition can also be written as:
$$\beta_i \le \inf_{0 \le \varphi < 1} \frac{-P(a_i \varphi - a_i)}{1 - \varphi}, \tag{5–40}$$
$$\alpha_i + \sup_{0 \le \varphi \le 1} \beta_i(1 - \varphi) \le -P(-a_i). \tag{5–41}$$
In (5–40), it is easily verified using Assumption (A3) that $a_i \varphi - a_i \in (-\sum_{j\in T} a_j - \mu,\ 0)$ for $0 \le \varphi < 1$. Since $P(w) \le 0$ for $w \le 0$, we conclude that
$$\frac{-P(a_i \varphi - a_i)}{1 - \varphi} \ge 0 \quad \text{for all } 0 \le \varphi < 1,$$
and therefore choosing $\beta_i = 0$ for i ∈ C \ {l} satisfies (5–40). Further, as $\beta_i = 0$, it is simple to verify that choosing $\alpha_i = -P(-a_i) = (a_i - \mu)^+$ satisfies (5–41). Finally, note that (5–39) is satisfied at equality by the two points (0, 0) and $\big(1, \frac{(a_i - \mu)^+}{a_i}\big)$, which are affinely independent of (1, 1). Therefore, we conclude that (5–38) is facet-defining for PB(M, ∅, M, ∅).
Now, we lift the variables (x_j, y_j) for j ∈ M in (5–38). The corresponding lifting function $P_C(w)$ is defined as
$$\begin{aligned} P_C(w) := \max\ & \sum_{j\in C} (a_j - \mu)^+ - \Big\{\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j\Big\} \\ \text{s.t.}\ & \sum_{j\in C\cup T} a_j x_j y_j \ge \sum_{i\in C} a_i - \mu - w \qquad (5\text{–}42) \\ & x_j \in \{0,1\},\ y_j \in [0,1] \quad \forall j \in C \cup T. \end{aligned}$$
We now derive a closed-form expression for $P_C(w)$. To this end, we assume without loss of generality that C = {1, …, p} and that $a_1 \ge a_2 \ge \ldots \ge a_p$. We also let q ∈ C be such that $a_q > \mu \ge a_{q+1}$. We define $A_0 = 0$ and $A_i = \sum_{j=1}^i a_j$ for all i ∈ C. We observe that $A_p = \sum_{j=1}^p a_j = d + \mu$.

Proposition 5.16. For w ≥ 0,
$$P_C(w) = \begin{cases} w - i\mu & \text{if } A_i \le w < A_{i+1} - \mu,\ i = 0, \ldots, q-1, \\ A_i - i\mu & \text{if } A_i - \mu \le w < A_i,\ i = 1, \ldots, q-1, \\ A_q - q\mu & \text{if } A_q - \mu \le w. \end{cases}$$
Proof. First, observe that there exists an optimal solution (x∗, y∗) of (5–42) in which $x^*_j = 1$ for j ∈ T and $y^*_j = 1$ for j ∈ C, since the corresponding objective coefficients are zero. Since $a_q > \mu \ge a_{q+1}$, we have $(a_j - \mu)^+ = 0$ for j = q + 1, …, p, which similarly implies that we can assume $x^*_j = 1$ for j = q + 1, …, p. Further, using the same notation $\bar a$ and $\bar y$ as in the proof of Proposition 5.14, we can simplify the expression of $P_C(w)$ as
$$\begin{aligned} P_C(w) = \max\ & \sum_{j=1}^{q} (a_j - \mu) - \Big\{\sum_{j=1}^{q} (a_j - \mu)x_j + \bar a \bar y\Big\} \\ \text{s.t.}\ & \sum_{j=1}^{q} a_j x_j + \bar a \bar y \ge \sum_{j=1}^{q} a_j - \mu - w \qquad (5\text{–}43) \\ & x_j \in \{0,1\},\ j = 1, \ldots, q,\ \bar y \in [0,1]. \end{aligned}$$
Next, we compute $P_C(w)$ by solving (5–43). Let $\bar w = A_q - \mu - w$. We claim that there exists an optimal solution in which $x^*_1 \le x^*_2 \le \ldots \le x^*_q$. Consider the following three cases.
1. Assume that $w \ge A_q - \mu$. Since $\bar w \le 0$, $x^*_j = 0$ for j = 1, …, q and $\bar y^* = 0$ is a feasible solution that is easily verified to be optimal. Therefore, $P_C(w) = A_q - q\mu$.

2. Assume that $A_i - \mu \le w < A_{i+1} - \mu$ (i.e., $A_q - A_{i+1} < \bar w \le A_q - A_i$) for $i \in \{1, \ldots, q-1\}$. Let $\theta = (A_{i+1} - \mu) - w$. Clearly, $0 < \theta \le a_{i+1}$. Define first the solution $s^1_i = (x^*, \bar y^*)$ where $x^*_j = 0$ for j = 1, …, i + 1, $x^*_j = 1$ for j = i + 2, …, q, and $\bar y^* = \theta/\bar a$. When $\theta \le \bar a$, $s^1_i$ is a feasible solution whose objective value we denote as $\Gamma^*(s^1_i) = A_{i+1} - (i+1)\mu - \theta = w - i\mu$. Define now another solution $s^2_i = (x^*, \bar y^*)$ where $x^*_j = 0$ for j = 1, …, i, $x^*_j = 1$ for j = i + 1, …, q, and $\bar y^* = 0$. Then $s^2_i$ is a feasible solution with objective value $\Gamma^*(s^2_i) = A_i - i\mu$. Since $a_{i+1} - \mu \le a_1 - \mu \le \bar a$, it is clear that $\Gamma^*(s^1_i) \ge \Gamma^*(s^2_i)$ when $\theta \le a_{i+1} - \mu$. Further, $\Gamma^*(s^2_i) \ge \Gamma^*(s^1_i)$ when $a_{i+1} - \mu \le \theta \le a_{i+1}$. Therefore, we conclude that $P_C(w) \ge \Gamma^*(s^1_i)$ if $A_i \le w < A_{i+1} - \mu$ and $P_C(w) \ge \Gamma^*(s^2_i)$ if $A_i - \mu \le w \le A_i$.

We now prove that these solutions are optimal. Pick any other feasible solution $s_i = (x, \bar y)$ that does not have the form $x_1 \le x_2 \le \ldots \le x_q$. Define $N_1 = \{j \in \{1, \ldots, q\} \mid x_j = 1\}$ and $N_0 = \{j \in \{1, \ldots, q\} \mid x_j = 0\}$. Consider the case where $|N_1| = q - i + k$ for k ≥ 0. Since $\sum_{j=1}^q a_j x_j + \bar a \bar y \ge \sum_{j=1}^q a_j x_j \ge A_q - A_{i-k}$, the corresponding objective value is $\Gamma^*(s_i) = \sum_{j=1}^q (a_j - \mu)(1 - x_j) - \bar a \bar y \le A_{i-k} - (i-k)\mu = A_i - i\mu - \sum_{j=i-k+1}^{i} (a_j - \mu) \le \Gamma^*(s^2_i)$. Next, consider the case where $|N_1| = q - i - k$ for k ≥ 1. Since $\sum_{j=1}^q a_j x_j + \bar a \bar y \ge \bar w = A_q - A_{i+1} + \theta$ from feasibility, the corresponding objective value is $\Gamma^*(s_i) = \sum_{j=1}^q (a_j - \mu)(1 - x_j) - \bar a \bar y \le A_{i+1} - \theta - \mu(i + k) \le \Gamma^*(s^1_i)$. Note that if $s^1_i$ is infeasible, $s^2_i$ is always feasible and dominates it.

3. Assume that $0 \le w < A_1 - \mu$. Since $A_1 - \mu \le \bar a$, the feasible solution $x^*_1 = 0$, $x^*_j = 1$ for j = 2, …, q, and $\bar y^* = \frac{A_1 - \mu - w}{\bar a}$ is optimal, which implies that $P_C(w) = w$.
We will now perform sequence-independent lifting for the remaining variables in M using Proposition 5.13. In order to apply this result, we first establish that $P_C(w)$ is subadditive. To this end, we use the following proposition.
Proposition 5.17. Let ν and $D_i$ for i = 0, 1, …, r be nonnegative. Assume that ν > 0, $D_0 = 0$, and $D_i \ge D_{i-1} + \nu$ for i = 1, …, r. Then the function
$$g(w) := \begin{cases} w - i\nu & \text{if } D_i \le w < D_{i+1} - \nu,\ i = 0, \ldots, r-1, \\ D_i - i\nu & \text{if } D_i - \nu \le w < D_i,\ i = 1, \ldots, r-1, \\ D_r - r\nu & \text{if } D_r - \nu \le w \end{cases}$$
is subadditive over $\mathbb{R}_+$ if and only if $D_i + D_j \ge D_{i+j}$ for $0 \le i, j \le r$ with $i + j \le r$.
Proof. Assume that $D_i + D_j \ge D_{i+j}$ for $0 \le i \le j \le r$ with $i + j \le r$. We want to prove that $g(x) + g(y) \ge g(x + y)$ for $x, y \in \mathbb{R}_+$. Assume for a contradiction that there exist $x, y \in \mathbb{R}_+$ such that $g(x) + g(y) < g(x + y)$. We claim first that there exist $x', y' \in \mathbb{R}_+$ with $x' = D_i$ for some $i \in \{0, \ldots, r\}$ such that $g(x') + g(y') < g(x' + y')$. We consider three cases:

1. If $D_i \le x \le D_{i+1} - \nu$ for $i \in \{0, \ldots, r-1\}$, then let $x' = D_i$ and $y' = y$. Clearly, $g(x') = g(x) + D_i - x$ and $g(y') = g(y)$. Further, $g(x' + y') = g(x + y + D_i - x) \ge g(x + y) + D_i - x$ since $D_i \le x$ and the function g has slope 0 or 1. Therefore, we have that $g(x') + g(y') = g(x) + g(y) + D_i - x < g(x + y) + D_i - x \le g(x' + y')$.

2. If $D_i - \nu \le x \le D_i$ for $i \in \{1, \ldots, r-1\}$, then let $x' = D_i$ and $y' = y + x - D_i$. Clearly, $g(x') = g(x)$ and $g(y') \le g(y)$ since $x \le D_i$ and g is nondecreasing. Therefore, we have that $g(x') + g(y') \le g(x) + g(y) < g(x + y) = g(x' + y')$.

3. If $D_r - \nu \le x$, then let $x' = D_r - \nu$ and $y' = y$. Clearly, $g(x') = g(x)$ and $g(y') = g(y)$. Further, $g(x' + y') = g(D_r - \nu + y) = g(x + y)$ since $y \ge 0$ and $x + y \ge D_r - \nu$. Therefore, we have that $g(x') + g(y') = g(x) + g(y) < g(x + y) = g(x' + y')$.
We claim next that there exist $\bar x, \bar y \in \mathbb{R}_+$ with $\bar x = D_i$ and $\bar y = D_j$ for some $i, j \in \{0, \ldots, r\}$ such that $g(\bar x) + g(\bar y) < g(\bar x + \bar y)$. We consider three cases:

1. If $D_j \le y' \le D_{j+1} - \nu$ for $j \in \{0, \ldots, r-1\}$, then let $\bar x = x'$ and $\bar y = D_j$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y') + D_j - y'$. Further, $g(\bar x + \bar y) = g(x' + y' + D_j - y') \ge g(x' + y') + D_j - y'$ since $D_j \le y'$ and the function g has slope 0 or 1. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') + D_j - y' < g(x' + y') + D_j - y' \le g(\bar x + \bar y)$.

2. If $D_j - \nu \le y' \le D_j$ for $j \in \{1, \ldots, r-1\}$, then let $\bar x = x'$ and $\bar y = D_j$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y')$. Further, $g(\bar x + \bar y) \ge g(x' + y')$ since $y' \le D_j$ and g is nondecreasing. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') < g(x' + y') \le g(\bar x + \bar y)$.

3. If $D_r - \nu \le y'$, then let $\bar x = x'$ and $\bar y = D_r - \nu$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y')$. Further, $g(\bar x + \bar y) = g(x' + D_r - \nu) = g(x' + y')$ since $x' \ge 0$ and $x' + y' \ge D_r - \nu$. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') < g(x' + y') = g(\bar x + \bar y)$.
We conclude that there exist $i, j \in \{0, \ldots, r\}$ such that $g(D_i) + g(D_j) < g(D_i + D_j)$. Since $g(D_i) = D_i - i\nu$ and $g(D_j) = D_j - j\nu$, we have that $D_i + D_j - (i+j)\nu < g(D_i + D_j)$. We consider the following three cases:

1. If $D_q \le D_i + D_j < D_{q+1} - \nu$ for $q \in \{0, \ldots, r-1\}$, then $g(D_i + D_j) = D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_i + D_j - q\nu$, which implies that $q < i + j$. Since q is integer, we obtain that $q + 1 \le i + j$ and $D_{q+1} \le D_{i+j}$. Further, since $D_i + D_j < D_{q+1} - \nu < D_{q+1} \le D_{i+j}$, we conclude that $D_i + D_j < D_{i+j}$, which is a contradiction to the hypothesis $D_i + D_j \ge D_{i+j}$.

2. If $D_q - \nu \le D_i + D_j < D_q$ for $q \in \{1, \ldots, r-1\}$, then $g(D_i + D_j) = D_q - q\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_q - q\nu$. Using $D_q - \nu \le D_i + D_j$, we have that $D_q - \nu - (i+j)\nu \le D_i + D_j - (i+j)\nu < D_q - q\nu$, which implies that $i + j + 1 > q$. Since q is integer, it is clear that $q \le i + j$. Further, since $D_i + D_j < D_q \le D_{i+j}$, we conclude that $D_i + D_j < D_{i+j}$, which is a contradiction.

3. If $D_i + D_j \ge D_r - \nu$, then $g(D_i + D_j) = D_r - r\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_r - r\nu$. Using $D_r - \nu \le D_i + D_j$, we obtain that $D_r - \nu - (i+j)\nu \le D_i + D_j - (i+j)\nu < D_r - r\nu$, which implies that $i + j + 1 > r$. Since i, j, and r are integers, $i + j \ge r$. Further, since $D_l \ge D_{l-1} + \nu$ for $l = 1, \ldots, r$, we have that $D_j \ge D_{r-i} + (j + i - r)\nu$. Combining $D_j \ge D_{r-i} + (j + i - r)\nu$ and $D_i + D_j - (i+j)\nu < D_r - r\nu$, we obtain that $D_i + D_{r-i} - r\nu < D_r - r\nu$. Therefore, we conclude that $D_i + D_{r-i} < D_r$, which is a contradiction. This completes the first part of the proof.
To prove the reverse implication, assume now that g is subadditive. We want to prove that $D_i + D_j \ge D_{i+j}$ for $0 \le i \le j \le r$ with $i + j \le r$. As shown before, we can take i and j such that $g(D_i) = D_i - i\nu$ and $g(D_j) = D_j - j\nu$. Since g is subadditive, i.e., $g(D_i) + g(D_j) \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$. We consider the following three cases:

1. If $D_q \le D_i + D_j < D_{q+1} - \nu$ for $q \in \{0, \ldots, r-1\}$, then $g(D_i + D_j) = D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge D_i + D_j - q\nu$, which implies $q \ge i + j$ and $D_q \ge D_{i+j}$. Therefore, we obtain that $D_i + D_j \ge D_{i+j}$.

2. If $D_q - \nu \le D_i + D_j < D_q$ for $q \in \{1, \ldots, r-1\}$, then $g(D_i + D_j) = D_q - q\nu > D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu > D_i + D_j - q\nu$, which implies that $i + j < q$. Using $D_q \ge D_{i+j} + (q - (i+j))\nu$, we obtain that $D_i + D_j - (i+j)\nu \ge D_q - q\nu \ge D_{i+j} - (i+j)\nu$. Therefore, we conclude that $D_i + D_j \ge D_{i+j}$.

3. If $D_i + D_j \ge D_r - \nu$, then $g(D_i + D_j) = D_r - r\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge D_r - r\nu$. Since $i + j \le r$ from the assumption, $D_r \ge D_{i+j} + (r - (i+j))\nu$. Therefore, we obtain that $D_i + D_j - (i+j)\nu \ge D_{i+j} - (i+j)\nu$, which implies that $D_i + D_j \ge D_{i+j}$.
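The characterization in Proposition 5.17 lends itself to a numerical spot check. The sketch below (with invented data D and ν) builds g(w) and tests subadditivity on a grid; the first instance satisfies $D_i + D_j \ge D_{i+j}$ while the second violates it:

```python
def make_g(D, nu):
    """Piecewise-linear function g from Proposition 5.17 (requires D[0] = 0
    and D[i] >= D[i-1] + nu)."""
    r = len(D) - 1
    def g(w):
        if w >= D[r] - nu:
            return D[r] - r * nu
        for i in range(r):
            if D[i] <= w < D[i + 1] - nu:
                return w - i * nu           # slope-1 piece
            if i >= 1 and D[i] - nu <= w < D[i]:
                return D[i] - i * nu        # flat piece
        raise ValueError("w must be nonnegative")
    return g

def subadditive_on_grid(g, hi, step=0.25):
    """Check g(x) + g(y) >= g(x + y) on a grid of points in [0, hi]."""
    pts = [k * step for k in range(int(hi / step) + 1)]
    return all(g(x) + g(y) >= g(x + y) - 1e-9 for x in pts for y in pts)

nu = 2.0
D_good = [0.0, 3.0, 6.0, 9.0]   # D_i + D_j >= D_{i+j} holds: subadditive
D_bad = [0.0, 2.0, 7.0, 9.0]    # D_1 + D_1 = 4 < D_2 = 7: not subadditive

for D, expect in ((D_good, True), (D_bad, False)):
    g = make_g(D, nu)
    cond = all(D[i] + D[j] >= D[i + j]
               for i in range(len(D)) for j in range(len(D))
               if i + j < len(D))
    assert subadditive_on_grid(g, 2 * D[-1]) == expect == cond
print("Proposition 5.17 characterization confirmed on both instances")
```

A grid check is of course only necessary evidence for subadditivity, but it suffices to expose the violation in the second instance (at x = y = D_1).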
We now prove that the lifting function PC(w) is subadditive.
Corollary 5.1. The lifting function PC(w) is subadditive over R+.
Proof. In Proposition 5.17, define ν = µ, r = q, and Di = Ai. Since ai ≥ µ for i = 1, . . . , q,
it is clear that Ai ≥ Ai−1 + µ. Further, since Ai is defined as the sum of the largest i
coefficients in C, it is clear that Ai + Aj ≥ Ai+j for 0 ≤ i, j ≤ p with i + j ≤ p. Therefore,
Proposition 5.17 shows that PC(w) is subadditive over R+.
We next illustrate the results of Propositions 5.15 and 5.16 as well as Corollary 5.1 on an example.

Example 5.2. Consider the 0−1 mixed-integer bilinear covering set
$$B = \big\{(x, y) \in \{0,1\}^5 \times [0,1]^5 \;\big|\; 21x_1y_1 + 19x_2y_2 + 17x_3y_3 + 15x_4y_4 + 10x_5y_5 \ge 20\big\}.$$
Let (C, M, T) = ({4, 5}, {1, 2}, {3}). Clearly, (C, M, T) satisfies Assumptions (A1)−(A3) since C is a cover with µ = 5, $a_l = a_4 > \mu$, and $\sum_{j\in C\cup T} a_j = 17 + 15 + 10 > 20 + 15 = d + a_l$. We obtain from Proposition 5.15 that the inequality
$$17y_3 + 10x_4 + 5x_5 \ge 15 \tag{5–44}$$
is facet-defining for PB(M, ∅, M, ∅). Using the result of Proposition 5.16, we obtain that the lifting function $P_C(w)$ is given by
$$P_C(w) = \begin{cases} w & \text{if } 0 \le w < 15 - 5 = 10, \\ 10 & \text{if } 10 \le w < 15, \\ w - 5 & \text{if } 15 \le w < 15 + 10 - 5 = 20, \\ 20 - 5 = 15 & \text{if } 20 \le w. \end{cases}$$
Corollary 5.1 establishes that this function is subadditive over R+. Function PC(w) is
represented in Figure 5-1.
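The closed form of $P_C(w)$ for Example 5.2 can be confirmed by solving (5–43) directly: for each 0−1 pattern of the cover variables, the cheapest feasible $\bar y$ is determined in closed form. A minimal sketch:

```python
from itertools import product

# Example 5.2: cover C = {4, 5} with sorted coefficients 15 and 10, mu = 5,
# and abar = sum_{j in T} a_j = 17.
aC, mu, abar = [15.0, 10.0], 5.0, 17.0

def PC_closed(w):
    """Closed form of P_C(w) for Example 5.2 (Proposition 5.16)."""
    if w < 10:
        return w          # 0 <= w < A_1 - mu
    if w < 15:
        return 10.0       # A_1 - mu <= w < A_1
    if w < 20:
        return w - 5      # A_1 <= w < A_2 - mu
    return 15.0           # A_2 - mu <= w

def PC_brute(w):
    """Solve (5-43) by enumerating x and taking the cheapest feasible ybar."""
    rhs = sum(aC) - mu - w
    best = float("-inf")
    for x in product((0, 1), repeat=len(aC)):
        covered = sum(a * xj for a, xj in zip(aC, x))
        ybar = min(max((rhs - covered) / abar, 0.0), 1.0)
        if covered + abar * ybar >= rhs - 1e-9:
            val = sum((a - mu) * (1 - xj) for a, xj in zip(aC, x)) - abar * ybar
            best = max(best, val)
    return best

for k in range(0, 300):
    w = k / 10.0
    assert abs(PC_closed(w) - PC_brute(w)) < 1e-6, w
print("closed form of P_C(w) matches enumeration on [0, 30)")
```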
We now compute the lifting coefficients of variables (x_i, y_i) for i ∈ M from $P_C(w)$. It follows from Proposition 5.13 that lifting coefficients (α_i, β_i) for i ∈ M must be chosen in such a way that
$$\alpha_i x_i + \beta_i y_i \ge P_C(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–45}$$
For the problem described in Example 5.2, $P_C(a_i x_i y_i)$ is represented in Figure 5-2(a). In this figure, we observe that $P_C(a_i x_i y_i)$ is constant when $x_i = 0$ and is equal to $P_C(a_i y_i)$ when $x_i = 1$. Condition (5–45) requires that the lifting coefficients (α_i, β_i) must be
[Figure 5-1. Lifting function P_C(w) of (5–44): plot of P_C over w ≥ 0, with breakpoints marked at A_1 − µ, A_1, and A_2 − µ.]
[Figure 5-2. Deriving lifting coefficients for Example 5.3: panel (a) shows P_C(a_i x_i y_i) over {0, 1} × [0, 1]; panel (b) shows possible overestimating planes α_i x_i + β_i y_i.]
chosen in such a way that the plane $\alpha_i x_i + \beta_i y_i$ overestimates the function $P_C(a_i x_i y_i)$ over {0, 1} × [0, 1]. Possible overestimating planes are represented in Figure 5-2(b). A similar geometric interpretation was already used in Richard and Tawarmalani [100] to obtain lifted inequalities for mixed-integer bilinear knapsack sets. It follows that an overestimating plane $\alpha_i x_i + \beta_i y_i$ can be obtained by first deriving a concave envelope $p(w)$ of $P_C(w)$ over $[0, a_i]$. This observation motivates the following result.
Lemma 5.1. Assume that $a_i > 0$. Define
$$q_i := \begin{cases} 0 & \text{if } a_i < A_1 - \mu, \\ j & \text{if } A_j - \mu \le a_i \le A_{j+1} - \mu,\ j = 1, \ldots, q-1, \\ q & \text{if } A_q - \mu < a_i. \end{cases}$$
Let $Q^i_0 = 0$, $Q^i_j = A_j - \mu$ for $j = 1, \ldots, q_i$, and $Q^i_{q_i+1} = a_i$. Further, redefine $a_{q_i+1} = a_i - Q^i_{q_i}$. Define $p^i_0(w) = w$ and $p^i_j(w) = P_C(Q^i_j) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}(w - Q^i_j)$ for $j = 1, \ldots, q_i$. Then, the function
$$p(w) := \min\big\{p^i_j(w) \;\big|\; j \in \{0, \ldots, q_i\}\big\} \tag{5–46}$$
is a concave overestimator of $P_C(w)$ over $[0, a_i]$.
Proof. It is easily seen that p(w) is concave since p(w) is defined as the minimum of a finite number of linear functions. If $p(w) = p^i_0(w) = w$, then it is clear that w overestimates $P_C(w)$ since $P_C(w)$ is a continuous piecewise linear function with slopes 0 and 1. Note that the slope of $p^i_j(w)$ is no less than that of $p^i_{j'}(w)$ for $j < j'$ since $a_{j+1} \ge a_{j'+1}$ implies $\frac{a_{j+1} - \mu}{a_{j+1}} \ge \frac{a_{j'+1} - \mu}{a_{j'+1}}$. Therefore, the minimum in (5–46) is attained at $j = l$ if $w \in [Q^i_l, Q^i_{l+1}]$ for $l \in \{1, \ldots, q_i\}$. Further, since $p^i_l(Q^i_l) = P_C(Q^i_l)$, $p^i_l(Q^i_{l+1}) = P_C(Q^i_{l+1})$, and $P_C(w)$ is convex for $w \in [Q^i_l, Q^i_{l+1}]$, we conclude that $p(w) = p^i_l(w) \ge P_C(w)$.
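Lemma 5.1 can be illustrated on the data of Example 5.2. For the variable with $a_i = 21$ one gets $q_i = 2$, breakpoints $Q^i = (0, 10, 20, 21)$, and segment denominators 10 and 1, so that p(w) = min{w, 5 + w/2, 15}; the sketch below checks that p overestimates $P_C$ on [0, 21]:

```python
def PC(w):
    """P_C(w) for Example 5.2 (Proposition 5.16)."""
    if w < 10:
        return w
    if w < 15:
        return 10.0
    if w < 20:
        return w - 5
    return 15.0

# Lemma 5.1 for a_i = 21 (variable i = 1 of M in Example 5.2): q_i = 2,
# breakpoints Q^i = (0, 10, 20, 21), segment denominators a_2 = 10, a_3 = 1.
Q = [0.0, 10.0, 20.0, 21.0]
den = [10.0, 1.0]

pieces = [lambda w: float(w)]            # p_0(w) = w
for j in (1, 2):
    slope = (PC(Q[j + 1]) - PC(Q[j])) / den[j - 1]
    pieces.append(lambda w, c=PC(Q[j]), s=slope, q=Q[j]: c + s * (w - q))

def p(w):
    """Concave overestimator (5-46): pointwise minimum of the linear pieces."""
    return min(f(w) for f in pieces)

for k in range(0, 211):
    w = k / 10.0
    assert p(w) >= PC(w) - 1e-9, w
print("p(w) is an overestimator of P_C(w) on [0, 21]")
```

Note that p agrees with $P_C$ exactly on the slope-1 segments and dominates it on the flat segments, which is what makes the plane choices of Theorem 5.1 tight at two points each.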
Observe that in Lemma 5.1, qi represents the index of the interval to which ai
belongs. Next, we compute the lifting coefficients for the variables (xi, yi) for i ∈ M using
the subadditivity of PC(w) and the result of Lemma 5.1.
Theorem 5.1. Under Assumptions (A1), (A2), and (A3), the lifted bilinear cover inequality
$$\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j + \sum_{i\in M} \alpha_i x_i + \sum_{i\in M} \beta_i y_i \ge \sum_{j\in C} (a_j - \mu)^+ \tag{5–47}$$
is facet-defining for PB if
$$(\alpha_i, \beta_i) \in \big\{(0, a_i)\big\} \cup \big\{(P_C(a_i), 0)\big\} \cup \bigcup_{j=1}^{q_i} \left\{\left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j,\ \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i\right)\right\}$$
for i ∈ M in (5–47), where $Q^i_j$ and $q_i$ are as defined in Lemma 5.1.
Proof. Because $P_C(w)$ is subadditive over $\mathbb{R}_+$, we know that (5–47) is valid for PB if the lifting coefficients (α_i, β_i) of (x_i, y_i) for i ∈ M are chosen to satisfy the condition
$$\alpha_i x_i + \beta_i y_i \ge P_C(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–48}$$
Condition (5–48) can be rewritten as:
$$\beta_i \varphi \ge P_C(0) \quad \text{for } 0 < \varphi \le 1, \tag{5–49}$$
$$\alpha_i + \beta_i \varphi \ge P_C(a_i \varphi) \quad \text{for } 0 \le \varphi \le 1. \tag{5–50}$$
To prove that (5–47) is facet-defining for PB, we also need to exhibit two points (x_i, y_i) for which (5–48) is satisfied at equality. First, consider the case where $(\alpha_i, \beta_i) = (0, a_i)$. Since $\alpha_i + \beta_i \varphi = \beta_i \varphi = a_i \varphi \ge P_C(a_i \varphi) \ge P_C(0)$, (5–49) and (5–50) are satisfied. Further, we see that (5–48) is satisfied at equality at the two points (1, 0) and $\big(1, \min\{1, \frac{A_1 - \mu}{a_i}\}\big)$ since $P_C(0) = 0$ and $P_C(w) = w$ for $0 \le w \le A_1 - \mu$. Next, consider the case where $(\alpha_i, \beta_i) = (P_C(a_i), 0)$. Condition (5–49) is satisfied since $\beta_i = 0$ and $P_C(0) = 0$. Condition (5–50) also holds because $\alpha_i = P_C(a_i)$ and $P_C(w)$ is nondecreasing for $w \in \mathbb{R}_+$. Further, (5–48) is satisfied at equality at the two points $(0, \varphi)$ for some $0 < \varphi < 1$ and (1, 1).
Finally, consider
$$(\alpha_i, \beta_i) = \left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j,\ \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i\right)$$
for i ∈ M. Clearly, (α_i, β_i) satisfies (5–49) since $\beta_i \ge 0$ and $P_C(0) = 0$. From Lemma 5.1, we have that
$$P_C(a_i \varphi) \le P_C(Q^i_j) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\big(a_i \varphi - Q^i_j\big) = \left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j\right) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i \varphi = \alpha_i + \beta_i \varphi,$$
showing that (α_i, β_i) satisfies (5–50). Further, (5–48) is satisfied at equality at the two points $\big(1, \frac{Q^i_j}{a_i}\big)$ and $\big(1, \frac{Q^i_{j+1}}{a_i}\big)$. Therefore, we conclude that (5–47) is facet-defining for PB.
Note that since we typically have several choices for the values of the lifting coefficients (α_i, β_i), the family of inequalities (5–47) contains an exponential number of members. We illustrate this characteristic of lifted bilinear cover inequalities in Example 5.3.
Example 5.3. In Example 5.2, we established that (5–44) is facet-defining for PB(M, ∅, M, ∅) using the result of Proposition 5.15. The lifting function $P_C(w)$ was also obtained in closed form. Applying Theorem 5.1, we obtain the nine inequalities
$$\begin{Bmatrix} 21y_1 \\ 5x_1 + \tfrac{21}{2}y_1 \\ 15x_1 \end{Bmatrix} + \begin{Bmatrix} 19y_2 \\ \tfrac{50}{9}x_2 + \tfrac{76}{9}y_2 \\ 14x_2 \end{Bmatrix} + 17y_3 + 10x_4 + 5x_5 \ge 15,$$
which are all facet-defining for PB. The reason that there are three choices for the lifting coefficients of $(x_1, y_1)$ is illustrated in Figure 5-2(b). The fact that there are three choices for $(x_2, y_2)$ follows similarly since the coefficient $a_2$ falls in the second interval.
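The nine inequalities of Example 5.3 can be checked for validity by enumeration: for each fixed x, minimizing the left-hand side over feasible y ∈ [0, 1]^5 is a continuous covering problem that a greedy rule (fill variables in increasing order of β_j/a_j) solves exactly. A minimal sketch with the coefficients transcribed from Example 5.3:

```python
from itertools import product

a = [21.0, 19.0, 17.0, 15.0, 10.0]   # Example 5.2 data
d, rhs = 20.0, 15.0

# Theorem 5.1 coefficient choices for (alpha_i, beta_i), i in M = {1, 2}:
choices_1 = [(0.0, 21.0), (5.0, 10.5), (15.0, 0.0)]
choices_2 = [(0.0, 19.0), (50.0 / 9, 76.0 / 9), (14.0, 0.0)]
base_alpha = [0.0, 0.0, 0.0, 10.0, 5.0]   # (a_j - mu)^+ for j in C = {4, 5}
base_beta = [0.0, 0.0, 17.0, 0.0, 0.0]    # a_j for j in T = {3}

def min_lhs(alpha, beta, x):
    """Minimize alpha.x + beta.y over y feasible for fixed x (greedy LP);
    returns None when no feasible y exists."""
    val = sum(aj * xj for aj, xj in zip(alpha, x))
    need = d
    for j in sorted((j for j in range(5) if x[j]), key=lambda j: beta[j] / a[j]):
        if need <= 1e-9:
            break
        y = min(1.0, need / a[j])
        val += beta[j] * y
        need -= a[j] * y
    return val if need <= 1e-9 else None

for (a1, b1), (a2, b2) in product(choices_1, choices_2):
    alpha = [a1, a2] + base_alpha[2:]
    beta = [b1, b2] + base_beta[2:]
    for x in product((0, 1), repeat=5):
        m = min_lhs(alpha, beta, x)
        assert m is None or m >= rhs - 1e-6, (alpha, beta, x)
print("all nine lifted bilinear cover inequalities are valid over B")
```

Validity over all vertices of B follows because, for each fixed x, the inner minimization over y is a linear program whose greedy solution the code computes exactly.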
5.3.2.2 Lifted reverse bilinear cover inequalities
In Theorem 5.1, we derived lifted bilinear cover inequalities by first lifting the
variables (xj, yj) for j ∈ C \ {l} and then lifting the remaining variables (xj, yj) for j ∈ M .
Here, we derive another family of lifted inequalities that we call lifted reverse bilinear cover
inequalities by changing the lifting order: we start the lifting procedure with the same
seed inequality (5–35), but we now lift the variables (xj, yj) for j ∈ M before the variables
(xj, yj) for j ∈ C \ {l}. In this case, we can relax some assumptions on the partition
(C,M, T ). In particular, we replace (A2) with
(A2’) al > µ for some l ∈ C.
Proposition 5.18. Suppose that Assumptions (A1), (A2’), and (A3) hold. Then,
(al − µ)xl + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj ≥ al − µ    (5–51)

is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}).
Proof. It follows from Proposition 5.9 that

(al − µ)xl + ∑_{j∈T} ajyj ≥ al − µ
is facet-defining for PB(M,C \ {l},M,C \ {l}). Its lifting function P (w) is derived in
Proposition 5.14 where it is also proven to be subadditive over R+. Therefore, lifting
coefficients (αi, βi) for (xi, yi) for i ∈ M are valid if they satisfy the condition:
αixi + βiyi ≥ P (aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {0, 0}. (5–52)
Condition (5–52) can be rewritten as:
βiφ ≥ P (0) for 0 < φ ≤ 1, (5–53)
αi + βiφ ≥ P (aiφ) for 0 ≤ φ ≤ 1. (5–54)
We now show that (αi, βi) = (min{ai, al − µ}, 0) are valid lifting coefficients. Clearly, βi = 0 satisfies (5–53) since P(0) = 0. Further, since P(aiφ) = min{aiφ, al − µ}, it is also clear that αi = min{ai, al − µ} ≥ min{aiφ, al − µ} = P(aiφ). To show that (5–51) is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}), it suffices to verify that the two points (1, 0) and (1, 1) satisfy (5–52) at equality.
We emphasize that the above proof requires that Assumption (A2') holds for l ∈ argmin{aj | j ∈ C}, and not for l ∈ argmax{aj | j ∈ C} as Assumption (A2) does in Proposition 5.15. We also mention that the lifting coefficients (αi, βi) = (0, ai) for i ∈ M are valid for (5–52). These coefficients yield facet-defining inequalities for PB(∅, C \ {l}, ∅, C \ {l}) because (5–52) is satisfied at equality for (1, 0) and (1, min{1, (al − µ)/ai}). However, these variables could have been treated directly as elements of T in (5–35) since adding more elements to T does not violate Assumption (A3).
To obtain facet-defining inequalities for PB, we lift the remaining variables (xj, yj) for
j ∈ C \ {l} in (5–51). To this end, we first compute the function
PM(w) := min { (al − µ)xl + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj } − (al − µ)
        s.t. alxlyl + ∑_{j∈M∪T} ajxjyj ≥ al − µ + w    (5–55)
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ {l} ∪ M ∪ T.
It is easily verified that the lifting function PM(w) corresponding to (5–51) satisfies PM(w) = −PM(−w). Let M = M1 ∪ M2 where M1 = {i ∈ M | ai > al − µ} and M2 = M \ M1. Assume without loss of generality that {l} ∪ M1 = {1, . . . , q} and a1 ≥ a2 ≥ . . . ≥ aq where q = |M1| + 1. Further, define A0 = 0 and Ai = ∑_{j=1}^{i} aj for all i = 1, . . . , q. Observe that al + ∑_{j∈M∪T} aj = Aq + ∑_{j∈M2} aj + ∑_{j∈T} aj. We derive a closed-form expression for PM(w) in the following proposition.
closed-form expression for PM(w) in the following proposition.
Proposition 5.19. For all w ∈ R,

PM(w) =
  −al + µ               if w < −al + µ,
  w − Ai + i(al − µ)    if Ai − al + µ ≤ w < Ai, i = 0, . . . , q − 1,
  i(al − µ)             if Ai ≤ w < Ai+1 − al + µ, i = 0, . . . , q − 1,
  w − Aq + q(al − µ)    if Aq − al + µ ≤ w ≤ ∑_{j∈N} aj − d,
  ∞                     if ∑_{j∈N} aj − d < w.
Proof. First, we observe that there exists an optimal solution (x∗, y∗) to (5–55) in which
x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ M ∪ {l} since the objective coefficients corresponding
to these variables are zero. Using the same notation ā = ∑_{j∈T} aj and ȳ = (∑_{j∈T} ajyj)/ā as in the proof of Proposition 5.14, we simplify the expression of PM(w) as:
PM(w) = min { ∑_{j∈{l}∪M1} (al − µ)xj + ∑_{j∈M2} ajxj + āȳ } − (al − µ)
        s.t. ∑_{j∈{l}∪M1} ajxj + ∑_{j∈M2} ajxj + āȳ ≥ al − µ + w    (5–56)
             xj ∈ {0, 1} ∀j ∈ {l} ∪ M1 ∪ M2, ȳ ∈ [0, 1].
Further, after introducing â = ∑_{j∈M2} aj + ā and ŷ = (∑_{j∈M2} ajxj + āȳ)/â, we claim that PM(w) can be written as:

PM(w) = min { ∑_{j∈{l}∪M1} (al − µ)xj + âŷ } − (al − µ)
        s.t. ∑_{j∈{l}∪M1} ajxj + âŷ ≥ al − µ + w    (5–57)
             xj ∈ {0, 1} ∀j ∈ {l} ∪ M1, ŷ ∈ [0, 1].
We next show that (5–56) and (5–57) are equivalent. To do so, we show that (5–56) has a feasible solution (x∗_{M1}, x∗_{M2}, ȳ∗) with objective value ζ∗ if and only if (5–57) has a feasible solution (x∗_{M1}, ŷ∗) with objective value ζ∗. On the one hand, given (x∗_{M1}, x∗_{M2}, ȳ∗), we can obtain (x∗_{M1}, ŷ∗) directly from the definition of ŷ. The objective values of these two points are identical. On the other hand, observe that ā = ∑_{j∈T} aj > al − µ ≥ ai for all i ∈ M2 because of Assumption (A3) and the definition of M2. Let M2 = {ã1, . . . , ãr} and assume without loss of generality that ã1 ≥ . . . ≥ ãr. Further, define Ã0 = 0 and Ãi = ∑_{j=1}^{i} ãj for i = 1, . . . , r. Then, for a given (x∗_{M1}, ŷ∗), we obtain (x∗_{M1}, x∗_{M2}, ȳ∗) as follows. For m = max{i ∈ {0, . . . , r} | Ãi ≤ âŷ∗}, set x∗_j = 1 for j ≤ m, x∗_j = 0 otherwise, and ȳ∗ = (âŷ∗ − Ãm)/ā. This solution is feasible since 0 ≤ (âŷ∗ − Ãm)/ā ≤ 1 and

∑_{j∈{l}∪M1} ajx∗_j + ∑_{j∈M2} ãjx∗_j + āȳ∗ = ∑_{j∈{l}∪M1} ajx∗_j + Ãm + âŷ∗ − Ãm = ∑_{j∈{l}∪M1} ajx∗_j + âŷ∗.

It also has the same objective value as (x∗_{M1}, ŷ∗).
Next, we compute an optimal solution for (5–57). We claim that there exists an optimal solution (x∗, ŷ∗) to (5–57) in which x∗_1 ≥ x∗_2 ≥ . . . ≥ x∗_q. For x∗ ≠ 0, let t = max{j ∈ {1, . . . , q} | x∗_j = 1}. Assume that we are given an optimal solution (x∗, ŷ∗) for which x∗_i < x∗_t and i < t for some i, t ∈ {1, . . . , q}. In this case, we can find another solution (x̂, ŷ∗) with objective value at least as good as that of (x∗, ŷ∗) as follows. We define x̂ as x̂_k = x∗_k if k ≠ i and k ≠ t, x̂_i = x∗_t, and x̂_t = x∗_i. The solution (x̂, ŷ∗) is feasible and has the same objective value since ai ≥ at.
We now obtain a closed-form for PM(w) by solving (5–57). Consider the case where w < −al + µ. In this case, the optimal solution is x∗_j = 0 for j ∈ {l} ∪ M1 and ŷ∗ = 0, which implies that PM(w) = −al + µ. When −al + µ ≤ w < 0, the optimal solution is x∗_j = 0 for j ∈ {l} ∪ M1 and ŷ∗ = (al − µ + w)/â, which implies that PM(w) = w. When 0 ≤ w < A1 − al + µ, the optimal solution is x∗_1 = 1, x∗_j = 0 for j ∈ {l} ∪ M1, j ≠ 1, and ŷ∗ = 0, which implies that PM(w) = 0. If w > ∑_{j∈N\C} aj + µ = ∑_{j∈N} aj − d, then (5–57) is infeasible, which implies that PM(w) = ∞. If 0 ≤ w < Aq − al + µ, the following feasible solution:

x∗_j = 1 for j = 1, . . . , m,   x∗_{m+1} = 0 if w < Am and x∗_{m+1} = 1 if w ≥ Am,   x∗_j = 0 for j = m + 2, . . . , q,

and ŷ∗ = (w − Am + al − µ)/â if w < Am and ŷ∗ = 0 if w ≥ Am,

is optimal, where m = max{i ∈ {0, . . . , q − 1} | Ai − al + µ ≤ w}. Note that â + Aq − al + µ = ∑_{j∈N\C} aj + µ. If Aq − al + µ ≤ w ≤ ∑_{j∈N} aj − d, then the optimal solution is x∗_j = 1 for all j ∈ {l} ∪ M1 and ŷ∗ = (w − Aq + al − µ)/â, which implies that PM(w) = w − Aq + q(al − µ). Using these optimal solutions, we obtain the desired form for PM(w).
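The closed form of Proposition 5.19 can be sanity-checked numerically. The sketch below evaluates it on a small hypothetical instance (the coefficients of {l} ∪ M1, al, µ, and the aggregated capacity â are invented for illustration) and compares it against a brute-force solution of (5–57) obtained by enumerating x and setting ŷ greedily:

```python
# Hypothetical instance (illustration only): coefficients of {l} ∪ M1
# sorted decreasingly, the cover excess mu, and the aggregate â over M2 ∪ T.
coeffs = [15.0, 10.0]   # a_1 >= ... >= a_q, so q = 2
al, mu = 15.0, 12.0     # requires a_l > mu (Assumption (A2'))
a_hat = 40.0            # â = sum of coefficients over M2 ∪ T
lam = al - mu           # cost a_l − µ of each binary variable

q = len(coeffs)
A = [sum(coeffs[:i]) for i in range(q + 1)]   # A_0 = 0, A_i = a_1 + ... + a_i

def pm_closed(w):
    """PM(w) as given by the closed form of Proposition 5.19 (for w >= 0)."""
    if w > A[q] + a_hat - lam:                # beyond ∑_{j∈N} a_j − d: infeasible
        return float("inf")
    for i in range(q):
        if A[i] - lam <= w < A[i]:
            return w - A[i] + i * lam
        if A[i] <= w < A[i + 1] - lam:
            return i * lam
    return w - A[q] + q * lam                 # A_q − a_l + µ <= w

def pm_bruteforce(w):
    """Solve (5-57) by enumerating x in {0,1}^q; ŷ is then set greedily."""
    best = float("inf")
    for mask in range(2 ** q):
        x = [(mask >> j) & 1 for j in range(q)]
        slack = lam + w - sum(c * xi for c, xi in zip(coeffs, x))
        y = max(0.0, slack) / a_hat           # smallest feasible ŷ
        if y <= 1.0:
            best = min(best, lam * sum(x) + a_hat * y - lam)
    return best

grid = [k / 4.0 for k in range(0, 4 * int(A[q] + a_hat - lam) + 1)]
assert all(abs(pm_closed(w) - pm_bruteforce(w)) < 1e-9 for w in grid)
print("closed form matches brute force")
```

The greedy choice of ŷ is justified because, once x is fixed, (5–57) is a one-dimensional linear program in ŷ.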
To apply Proposition 5.13, we must verify that the function PM(w) is subadditive
over R−. Since PM(w) = −PM(−w), this corresponds to verifying that PM(w) is
superadditive over R+.
Proposition 5.20. The lifting function PM(w) is superadditive over [0, ∑_{j∈N} aj − d].

Proof. We show that PM(w) is superadditive using the result of Proposition 27 in Richard and Tawarmalani [100]. Let λ = al − µ, r = q, Ci = Ai for i = 1, . . . , q, and P(w) = PM(w)/(al − µ). It is clear that Ai + Aj ≥ Ai+j for 0 ≤ i ≤ j ≤ q with i + j ≤ q since Ai is the sum of the i largest coefficients in M1 ∪ {l}. Therefore, we conclude that PM(w) is superadditive over [0, ∑_{j∈N} aj − d].
We next illustrate the results of Propositions 5.18, 5.19, and 5.20 on an example.
Example 5.4. For the set B of Example 5.2, consider the partition (C, M, T) = ({3, 4}, {5}, {1, 2}). This partition satisfies Assumptions (A1), (A2') and (A3) for l = 4 ∈ C since C is a cover with µ = 12 such that a4 > µ and ∑_{j∈C∪T} aj = 21 + 19 + 17 + 15 > 20 + 15 = d + al. We obtain from Proposition 5.18 that
3x4 + 3x5 + 21y1 + 19y2 ≥ 3 (5–58)
is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}). Further, the lifting function PM(w) is given by

PM(w) =
  0        if 0 ≤ w < 12,
  w − 12   if 12 ≤ w < 15,
  3        if 15 ≤ w < 22,
  w − 19   if 22 ≤ w ≤ 62,

as described in Proposition 5.19.
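As a quick numerical check of Proposition 5.20 on this example, the sketch below encodes the closed form above and verifies superadditivity on a grid of [0, 62] (the data are those of Example 5.4 as reconstructed here):

```python
def pm(w):
    """Closed-form lifting function PM(w) of Example 5.4, for 0 <= w <= 62."""
    if w < 12:
        return 0.0
    if w < 15:
        return w - 12.0
    if w < 22:
        return 3.0
    return w - 19.0

# Superadditivity (Proposition 5.20): PM(u) + PM(v) <= PM(u + v)
# whenever u, v and u + v all lie in [0, 62].
grid = [k / 2.0 for k in range(0, 125)]  # 0, 0.5, ..., 62
ok = all(pm(u) + pm(v) <= pm(u + v) + 1e-9
         for u in grid for v in grid if u + v <= 62)
print(ok)  # -> True
```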
Similar to Theorem 5.1, we compute the lifting coefficients for the variables (xi, yi) for
i ∈ C \ {l} using Proposition 5.13.
Theorem 5.2. Suppose that Assumptions (A1), (A2’), and (A3) hold. Then, the lifted
reverse bilinear cover inequality
(al − µ)xl + ∑_{i∈C\{l}} PM(ai)xi + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj ≥ (al − µ) + ∑_{i∈C\{l}} PM(ai)    (5–59)
is facet-defining for PB.
Proof. Since PM(w) is superadditive over [0, ∑_{j∈N} aj − d], the lifting coefficients
(αi, βi) of the variables (xi, yi) for i ∈ C \ {l} are valid if they satisfy the condition:
αi(1− xi) + βi(1− yi) ≤ PM(ai − aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {1, 1}. (5–60)
The above condition can be rewritten as:
βi ≤ inf_{0≤φ<1} PM(ai − aiφ)/(1 − φ),    (5–61)

αi + sup_{0≤φ≤1} βi(1 − φ) ≤ PM(ai).    (5–62)
Because of Assumption 5.2 and C ⊆ N, we know that ai ≤ ∑_{j∈N} aj − d for all i ∈ C. Observe that

inf_{0≤φ<1} PM(ai − aiφ)/(1 − φ) = 0
since PM(ε) = 0 for sufficiently small ε > 0. Therefore, choosing βi = 0 satisfies (5–61).
Moreover, as βi = 0, it is easily verified that choosing αi = PM(ai) satisfies (5–62). Finally, note that (5–60) is tight for the two points (0, 0) and (1, (ai − A1 + al − µ)+/ai), which proves that (5–59) is facet-defining for PB.
Note that a lifted reverse bilinear cover inequality (5–59) does not yield an exponential number of facet-defining inequalities; we illustrate the reason in Example 5.5. This is a significant difference from the lifted bilinear cover inequalities (5–47). Moreover, we observe that some of the inequalities (5–59) can also be obtained as lifted bilinear cover inequalities (5–47). However, there exist inequalities (5–59) that cannot be obtained as lifted bilinear cover inequalities (5–47). We illustrate this difference in the following example.
Example 5.5. For the partition (C,M, T ) = ({3, 4}, {5}, {1, 2}), we established in
Example 5.4 that (5–58) is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}). Further, the lifting function PM(w) was obtained in closed-form and is depicted in Figure 5-3(a). Applying
Theorem 5.2, we obtain the following lifted reverse bilinear cover inequality
3x3 + 3x4 + 3x5 + 21y1 + 19y2 ≥ 6
which is facet-defining for PB. We observe in Figure 5-3(b) that this is the only choice of coefficients that yields the plane underestimating PM(ai − aixiyi) over (xi, yi) ∈ {0, 1} × [0, 1] \ {1, 1}. Further, this inequality cannot be obtained as a lifted bilinear cover inequality (5–47). This is because the coefficients of the binary variables xi are equal, while this could only happen in lifted bilinear cover inequalities (5–47) when (aj − µ)+ = 0 for j ∈ C.
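A direct numerical check of the inequality above is straightforward. The sketch below assumes the data of Example 5.2, a = (21, 19, 17, 15, 10) and d = 20, samples feasible points of B, and asserts that every sample satisfies the lifted reverse bilinear cover inequality:

```python
import random

a = [21, 19, 17, 15, 10]   # assumed data of Example 5.2
d = 20

def lifted_reverse_lhs(x, y):
    """LHS of the lifted reverse bilinear cover inequality of Example 5.5."""
    return 3 * x[2] + 3 * x[3] + 3 * x[4] + 21 * y[0] + 19 * y[1]

random.seed(1)
checked = 0
while checked < 5000:
    x = [random.randint(0, 1) for _ in range(5)]
    y = [random.random() for _ in range(5)]
    if sum(a[j] * x[j] * y[j] for j in range(5)) >= d:   # (x, y) is a point of B
        assert lifted_reverse_lhs(x, y) >= 6 - 1e-9
        checked += 1
print("3x3 + 3x4 + 3x5 + 21y1 + 19y2 >= 6 held on", checked, "sampled points of B")
```

The bound is tight, e.g. at x = (1, 0, 1, 0, 0) with y1 = 1/7, y3 = 1, where the cover constraint holds at equality.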
Figure 5-3. Deriving lifting coefficients for Example 5.5
5.3.3 Inequalities through Approximate Lifting
We now derive another family of lifted inequalities from the seed inequality developed
in Proposition 5.11. To this end, we first identify a cover C ⊆ N that satisfies the
condition of Proposition 5.11. In particular, we assume that
(C1) ∑_{j∈C} aj − ak ≥ d for all k ∈ C, i.e., al = max_{j∈C} aj ≤ µ,

(C2) ∑_{j∈C} aj − ak − am < d for any k, m ∈ C, i.e., ak + am > µ for all k, m ∈ C.

In the following discussions, we consider covers C with |C| ≥ 2 since we require Assumptions (C1) and (C2) to be satisfied. When fixing the variables (xi, yi) for i ∈ N \ C to (0, 0), it follows from Proposition 5.11 that

∑_{j∈C} xj ≥ |C| − 1    (5–63)

is facet-defining for PB(N \ C, ∅, N \ C, ∅). We now lift the remaining variables (xi, yi) for i ∈ N \ C. The lifting function
corresponding to (5–63) is defined as
Φ(w) := max (|C| − 1) − ∑_{j∈C} xj
        s.t. ∑_{j∈C} ajxjyj ≥ d − w    (5–64)
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ C.
We assume without loss of generality that C = {1, . . . , r} and that a1 ≤ a2 ≤ . . . ≤ ar. We define µ̄ = a1 + a2 − µ, where µ is the excess of the cover C, B0 = 0, and Bi = ∑_{j=1}^{i} aj+2 for i = 1, . . . , r − 2. Observe that µ̄ > 0 and a1 + a2 + Br−2 = d + µ.
Proposition 5.21. For w ≥ 0,

Φ(w) =
  0       if 0 ≤ w < µ̄,
  i + 1   if Bi + µ̄ ≤ w < Bi+1 + µ̄, i = 0, . . . , r − 3,
  r − 1   if Br−2 + µ̄ ≤ w.
Proof. There exists an optimal solution (x∗, y∗) to (5–64) in which y∗j = 1 for j ∈ C since
all the objective coefficients corresponding to these variables are zero. Hence, Φ(w) can be
rewritten as
Φ(w) = max (|C| − 1) − ∑_{j∈C} xj
        s.t. ∑_{j∈C} ajxj ≥ d − w    (5–65)
             xj ∈ {0, 1} ∀j ∈ C.
Further, we claim that there exists an optimal solution x∗ to (5–65) in which

x∗_1 ≤ x∗_2 ≤ . . . ≤ x∗_r.    (5–66)

This is because, given any optimal solution x∗ to (5–65) with x∗_i > x∗_j for i < j, the solution x̂ defined as x̂_k = x∗_k if k ≠ i and k ≠ j, x̂_i = x∗_j, and x̂_j = x∗_i, is feasible and has the same objective value. Now, we compute Φ(w) by solving (5–65). If 0 ≤ w < µ̄, it is clear that x∗_1 = 0 and x∗_j = 1 for j = 2, . . . , r is optimal. For the remaining cases, it follows from (5–66) that

x∗_j = 0 if j = 1, . . . , t + 2,  and  x∗_j = 1 if j = t + 3, . . . , r,

where t := max{i ∈ {0, . . . , r − 2} | Bi + µ̄ ≤ w}, is an optimal solution to (5–65), which shows the result.
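The closed form of Proposition 5.21 is easy to confirm by enumeration. The sketch below uses the cover of Example 5.6 as assumed data (C with coefficients {10, 15, 17} and d = 20, so that µ̄ = 3 and B1 = 17) and compares the closed form against a brute-force solution of (5–65):

```python
from itertools import product

# Assumed cover data of Example 5.6, sorted increasingly.
a = [10, 15, 17]             # a1 <= a2 <= a3, so r = 3
d = 20
mu = sum(a) - d              # excess of the cover: 22
mu_bar = a[0] + a[1] - mu    # 3
B = [0, a[2]]                # B0 = 0 and B1 = a3 = 17 (= B_{r-2})

def phi_closed(w):
    """Φ(w) from the closed form of Proposition 5.21."""
    if w < mu_bar:
        return 0
    for i in range(len(B) - 1):
        if B[i] + mu_bar <= w < B[i + 1] + mu_bar:
            return i + 1
    return len(a) - 1        # r − 1 once w >= B_{r-2} + mu_bar

def phi_bruteforce(w):
    """Φ(w) by enumerating x in {0,1}^r in (5-65)."""
    best = -1
    for x in product([0, 1], repeat=len(a)):
        if sum(ai * xi for ai, xi in zip(a, x)) >= d - w:
            best = max(best, (len(a) - 1) - sum(x))
    return best

assert all(phi_closed(w / 2.0) == phi_bruteforce(w / 2.0) for w in range(0, 61))
print("closed form of Proposition 5.21 verified on the Example 5.6 cover")
```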
We now perform sequential lifting of the pair of variables (xi, yi) for i ∈ N \ C. We
assume that all variables (xj, yj) for j ∈ N \C where j < i have already been lifted. Lifting
coefficients will be derived from the lifting functions
Φi(w) = max (|C| − 1) − ∑_{j∈C} xj − ∑_{1≤j<i, j∈N\C} (αjxj + βjyj)
        s.t. ∑_{j∈C} ajxjyj + ∑_{1≤j<i, j∈N\C} ajxjyj ≥ d − w
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ N.
In Section 5.3.2.1, we were able to show that lifting functions were subadditive, a
property that leads to sequence-independent lifting. Unfortunately, the lifting function
Φ(w) is not subadditive. To handle such situations, Gu et al. [62] proposed to use
approximate lifting. Following their approach, we say that Ψ(w) is a valid subadditive approximation of Φ(w) if Ψ(w) ≥ Φ(w) for all w ∈ R and Ψ(w) is subadditive.
We say that a valid subadditive approximation Ψ(w) is nondominated if there is no
other valid subadditive approximation Ψ′(w) with Ψ′(w) ≤ Ψ(w) for all w ∈ R
and Ψ′(w) < Ψ(w) for some w ∈ R. We also define the notion of maximal set
E = {w ∈ R+ | Φi(w) = Φ(w) ∀i ∈ N \ C, and for all lifting orders}. A valid
subadditive approximation Ψ(w) of Φ(w) is called maximal if Ψ(w) = Φ(w) for all w ∈ E.
It is clear that a maximal nondominated approximation of Φ leads to strong inequalities
that can be obtained efficiently for PB. To construct such an approximation of Φ(w), we
will use the following proposition.
Proposition 5.22. Let λ and Ci for i = 0, 1, . . . , s be nonnegative real numbers. Assume that λ > 0, C0 = 0 and Ci−1 + λ ≤ Ci for i = 1, . . . , s. Then the function

h(w) =
  i + (w − Ci)/λ   if Ci ≤ w < Ci + λ, i = 0, . . . , s,
  i                if Ci−1 + λ ≤ w < Ci, i = 1, . . . , s,
  s + 1            if Cs + λ ≤ w
is subadditive over R+ if and only if Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i+ j ≤ s.
Proof. Assume that Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i + j ≤ s. We want to prove
that h(x) + h(y) ≥ h(x + y) for x, y ∈ R+. Assume for a contradiction that there exists
x, y ∈ R+ such that h(x) + h(y) < h(x + y). We claim first that there exists x′, y′ ∈ R+
with x′ = Ci for some i ∈ {0, . . . , s} such that h(x′) + h(y′) < h(x′ + y′). Consider the
following three cases:
1. If Ci ≤ x ≤ Ci + λ for i ∈ {0, . . . , s}, then let x′ = Ci and y′ = y. Clearly, h(x′) = h(x) + (Ci − x)/λ and h(y′) = h(y). Further, h(x′ + y′) = h(x + y + Ci − x) ≥ h(x + y) + (Ci − x)/λ since Ci ≤ x and the function h has slope 0 or 1/λ. Therefore, we have that h(x′) + h(y′) = h(x) + h(y) + (Ci − x)/λ < h(x + y) + (Ci − x)/λ ≤ h(x′ + y′).

2. If Ci−1 + λ ≤ x ≤ Ci for i ∈ {1, . . . , s}, then let x′ = Ci and y′ = y + x − Ci. Clearly, h(x′) = h(x) and h(y′) ≤ h(y) since x ≤ Ci and h is nondecreasing. Therefore, we have that h(x′) + h(y′) ≤ h(x) + h(y) < h(x + y) = h(x′ + y′).

3. If Cs + λ ≤ x, then let x′ = Cs + λ and y′ = y. Clearly, h(x′) = h(x) and h(y′) = h(y). Further, h(x′ + y′) = h(Cs + λ + y) = h(x + y) since y ≥ 0 and x + y ≥ Cs + λ. Therefore, we have that h(x′) + h(y′) = h(x) + h(y) < h(x + y) = h(x′ + y′).
We claim next that there exist x, y ∈ R+ with x = Ci and y = Cj for some i, j ∈ {0, . . . , s} such that h(x) + h(y) < h(x + y). Consider the following three cases:

1. If Cj ≤ y′ ≤ Cj + λ for j ∈ {0, . . . , s}, then let x = x′ and y = Cj. Clearly, h(x) = h(x′) and h(y) = h(y′) + (Cj − y′)/λ. Further, h(x + y) = h(x′ + y′ + Cj − y′) ≥ h(x′ + y′) + (Cj − y′)/λ since Cj ≤ y′ and the function h has slope 0 or 1/λ. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) + (Cj − y′)/λ < h(x′ + y′) + (Cj − y′)/λ ≤ h(x + y).

2. If Cj−1 + λ ≤ y′ ≤ Cj for j ∈ {1, . . . , s}, then let x = x′ and y = Cj. Clearly, h(x) = h(x′) and h(y) = h(y′). Further, h(x + y) ≥ h(x′ + y′) since y′ ≤ Cj and h is nondecreasing. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) < h(x′ + y′) ≤ h(x + y).

3. If Cs + λ ≤ y′, then let x = x′ and y = Cs + λ. Clearly, h(x) = h(x′) and h(y) = h(y′). Further, h(x + y) = h(x′ + Cs + λ) = h(x′ + y′) since x′ ≥ 0 and x′ + y′ ≥ Cs + λ. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) < h(x′ + y′) = h(x + y).
We conclude that there exists i, j ∈ {0, . . . , s} such that h(Ci) + h(Cj) < h(Ci + Cj). Since
h(Ci) = i and h(Cj) = j, we have that i+ j < h(Ci+Cj). Since i+ j < h(Ci+Cj) and h is
nondecreasing, we conclude that Ci+j < Ci + Cj, which is a contradiction to the hypothesis
Ci + Cj ≤ Ci+j.
To prove the reverse implication, assume now that h is subadditive. We want to
prove that Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i + j ≤ s. As shown before, we can
take i and j such that h(Ci) = i and h(Cj) = j. Since h is subadditive, it follows that
i + j = h(Ci) + h(Cj) ≥ h(Ci + Cj). Since i + j ≤ s, we conclude that Ci+j ≥ Ci + Cj,
which proves the result.
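The characterization of Proposition 5.22 can be illustrated numerically. The sketch below implements h and tests both directions of the equivalence on two invented breakpoint sequences, one satisfying Ci + Cj ≤ Ci+j and one violating it:

```python
def make_h(C, lam):
    """Piecewise-linear function h of Proposition 5.22 (C = [C0, ..., Cs], C0 = 0)."""
    s = len(C) - 1
    def h(w):
        if w >= C[s] + lam:
            return s + 1.0
        for i in range(s + 1):
            if C[i] <= w < C[i] + lam:
                return i + (w - C[i]) / lam
        for i in range(1, s + 1):
            if C[i - 1] + lam <= w < C[i]:
                return float(i)
        raise ValueError(w)
    return h

def subadditive_on_grid(h, hi, step=0.25):
    """Check h(u) + h(v) >= h(u + v) on a finite grid of [0, hi]."""
    pts = [k * step for k in range(int(hi / step) + 1)]
    return all(h(u) + h(v) >= h(u + v) - 1e-9
               for u in pts for v in pts if u + v <= hi)

# C1 + C1 = 6 <= C2 = 7: h should be subadditive.
good = make_h([0.0, 3.0, 7.0], lam=2.0)
# C1 + C1 = 8 > C2 = 7: the condition fails, so h should NOT be subadditive.
bad = make_h([0.0, 4.0, 7.0], lam=2.0)

assert subadditive_on_grid(good, hi=12.0)
assert not subadditive_on_grid(bad, hi=12.0)   # violated at u = v = C1 = 4
print("subadditivity characterization of Proposition 5.22 confirmed numerically")
```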
We next describe a strong valid subadditive approximation of Φ(w). The proof
technique is similar to that used in Gu et al. [62].
Theorem 5.3. The function

Ψ(w) :=
  i + (w − Bi)/µ̄   if Bi ≤ w < Bi + µ̄, i = 0, . . . , r − 2,
  i                 if Bi−1 + µ̄ ≤ w < Bi, i = 1, . . . , r − 2,
  r − 1             if Br−2 + µ̄ ≤ w,

is a valid subadditive approximation of Φ(w) that is nondominated and maximal over R+.
Proof. Note that Ψ(w) = Φ(w) when w ∈ [Bi−1 + µ̄, Bi] for some i ∈ {1, . . . , r − 2} and when w ≥ Br−2 + µ̄. Further,

Ψ(w) = Φ(w) + (w − Bi)/µ̄ ≥ Φ(w)

when w ∈ (Bi, Bi + µ̄). Next, we show that Ψ(w) is subadditive over R+. In Proposition 5.22, let s = r − 2, Ci = Bi and λ = µ̄. Since Bi is the sum of the i smallest coefficients of the cover C other than a1 and a2, it is clear that Bi + Bj ≤ Bi+j for 0 ≤ i ≤ j ≤ r − 2 with i + j ≤ r − 2. Therefore, Ψ(w) is subadditive over R+. Second, we show that Ψ(w) is nondominated. Assume for a contradiction that there is another valid subadditive approximation Ψ′(w) such that Φ(w) ≤ Ψ′(w) ≤ Ψ(w) for all w ≥ 0 and for which there exists w′ ≥ 0 with Ψ′(w′) < Ψ(w′). Then, it must be that w′ ∈ (Bi, Bi + µ̄) for some i ∈ {0, . . . , r − 2}. Let w′′ = Bi + µ̄ − w′. Since 0 < w′′ < µ̄, we have that 0 < Ψ(w′′) < 1. Further,

Ψ(w′) + Ψ(w′′) = i + (w′ − Bi)/µ̄ + w′′/µ̄ = i + 1 = Φ(Bi + µ̄) = Ψ′(Bi + µ̄) = Ψ′(w′ + w′′).    (5–67)

Hence, we obtain that

Ψ(w′′) = Ψ′(w′ + w′′) − Ψ(w′) ≤ Ψ′(w′) + Ψ′(w′′) − Ψ(w′) < Ψ′(w′′),
where the first equality holds because of (5–67), the first inequality holds because Ψ′
is subadditive and the second inequality holds because Ψ′(w′) < Ψ(w′). This is a
contradiction to the assumption that Ψ′(w) ≤ Ψ(w) for w ∈ R+. Finally, we prove that Ψ(w) is maximal. Assume without loss of generality that N \ C = {r + 1, . . . , n} and that (xr+1, yr+1) is lifted first. Clearly, Φr+1(w) = Φ(w) for w ∈ R+. Assume that Ψ(w) > Φ(w) for some w ≥ 0. Then, it suffices to show that there exists a coefficient ar+1 for which Φr+2(w) > Φ(w). We first claim that if Ψ(w) > Φ(w) for some w ≥ 0, then there exists w′ ≥ 0 such that Φ(w) + Φ(w′) < Φ(w + w′). Since Ψ(w) > Φ(w) for some w ≥ 0, it is clear that w ∈ (Bi, Bi + µ̄) for some i and Φ(w) = i. Let w′ = Bi + µ̄ − w. Since 0 < w′ < µ̄, Φ(w′) = 0 and Φ(w + w′) = i + 1. Hence, Φ(w) + Φ(w′) = i < i + 1 = Φ(w + w′). Next, assume that (xr+1, yr+1) is lifted first and that its coefficient is ar+1 = w′. Then, Φr+2(w) = max{Φ(w + w′) − Φ(w′), Φ(w)} = Φ(w + w′) − Φ(w′) > Φ(w), which shows the result.
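Both properties established in the proof, domination of Φ and subadditivity of Ψ, can be checked numerically. The sketch below assumes the cover data of Example 5.6 ({10, 15, 17} with d = 20, so µ̄ = 3 and B1 = 17):

```python
from itertools import product

a, d = [10, 15, 17], 20                 # assumed cover of Example 5.6, sorted increasingly
r = len(a)
mu_bar = a[0] + a[1] - (sum(a) - d)     # = 3
B = [0, 17]                             # B0 and B1 (= B_{r-2})

def psi(w):
    """Valid subadditive approximation of Theorem 5.3 for this cover."""
    for i in range(r - 1):
        if B[i] <= w < B[i] + mu_bar:
            return i + (w - B[i]) / mu_bar
        if i >= 1 and B[i - 1] + mu_bar <= w < B[i]:
            return float(i)
    return float(r - 1)

def phi(w):
    """Exact lifting function Φ(w) via enumeration of (5-65)."""
    return max((r - 1) - sum(x) for x in product([0, 1], repeat=r)
               if sum(ai * xi for ai, xi in zip(a, x)) >= d - w)

grid = [k / 4.0 for k in range(0, 161)]                      # 0, 0.25, ..., 40
assert all(psi(w) >= phi(w) - 1e-9 for w in grid)            # Ψ over-approximates Φ
assert all(psi(u) + psi(v) >= psi(u + v) - 1e-9
           for u in grid for v in grid)                      # Ψ is subadditive
print("Ψ dominates Φ and is subadditive on the test grid")
```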
In Figure 5-4, we present the valid subadditive approximation Ψ(w) of Φ(w) obtained using Theorem 5.3 for inequality (5–74) discussed in Example 5.6. Observe that, for 0 < w ≤ µ̄, the approximation is exact only when w = µ̄, i.e., Ψ(µ̄) = Φ(µ̄). For w ≥ µ̄, the approximation is exact when µ̄ ≤ w ≤ B1 or w ≥ B1 + µ̄. Next, we obtain a concave overestimator of Ψ(w) in Lemma 5.2 that will be used to compute the lifting coefficients of the remaining variables in Theorem 5.4.
Lemma 5.2. Assume that ai > 0. Define

wi :=
  0        if ai < µ̄,
  j + 1    if Bj + µ̄ ≤ ai < Bj+1 + µ̄, j = 0, . . . , r − 3,
  r − 1    if Br−2 + µ̄ ≤ ai.

Let W^i_0 = 0, W^i_j = Bj−1 + µ̄ for j = 1, . . . , wi, and W^i_{wi+1} = ai. Further, redefine a_{wi+2} = ai − W^i_{wi}. Define ψ^i_0(w) = w/µ̄ and

ψ^i_j(w) = Ψ(W^i_j) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] (w − W^i_j)

for
Figure 5-4. A valid subadditive approximation Ψ(w) of Φ(w) for Example 5.6.
j = 1, . . . , wi. Then, the function

ψ(w) := min { ψ^i_j(w) | j ∈ {0, . . . , wi} }    (5–68)

is a concave overestimator of Ψ(w).
Proof. First, ψ(w) is concave since it is obtained as the minimum of a finite number of affine functions. If ψ(w) = ψ^i_0(w) = w/µ̄, then ψ^i_0(w) clearly overestimates Ψ(w). Note that the slope of ψ^i_j(w) is no less than that of ψ^i_{j′}(w) for j < j′ since a_{j+2} ≤ a_{j′+2} implies 1/a_{j+2} ≥ 1/a_{j′+2}. Therefore, the minimum in (5–68) is attained at j = l if w ∈ [W^i_l, W^i_{l+1}] for l ∈ {1, . . . , wi}. Further, since ψ^i_l(W^i_l) = Ψ(W^i_l), ψ^i_l(W^i_{l+1}) = Ψ(W^i_{l+1}), and Ψ(w) is convex for w ∈ [W^i_l, W^i_{l+1}], we conclude that ψ(w) = ψ^i_l(w) ≥ Ψ(w).
The concave overestimator ψ(w) of Lemma 5.2 can be used to obtain lifting
coefficients. In order to determine whether the resulting inequality is facet-defining,
we introduce the following notation.
For i ∈ N \ C, we define I(ai) to be the function that returns 0 if Φ(ai) = Ψ(ai) and returns 1 otherwise, i.e.,

I(ai) :=
  0   if B_{wi−1} + µ̄ ≤ ai < B_{wi} or ai ≥ B_{r−2} + µ̄,
  1   if B_{wi} ≤ ai < B_{wi} + µ̄ or ai < µ̄.
Theorem 5.4. Under Assumptions (C1) and (C2),

∑_{j∈C} xj + ∑_{i∈N\C} αixi + ∑_{i∈N\C} βiyi ≥ |C| − 1    (5–69)

defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) when

(αi, βi) ∈ {(0, ai/µ̄)} ∪ {(Ψ(ai), 0)} ∪ ⋃_{j=1}^{wi} { ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai ) }    (5–70)

where µ̄, W^i_j and wi are as defined in Lemma 5.2. In particular, if, for all i ∈ N \ C,
(i) I(ai) = 0, or
(ii) I(ai) = 1, ai ≥ µ̄, and

(αi, βi) ∈ {(0, ai/µ̄)} ∪ ⋃_{j=1}^{wi−1} { ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai ) },
then (5–69) is facet-defining for PB.
Proof. It follows from Theorem 5.3 that Ψ(w) is a valid subadditive approximation of
Φ(w) for w ≥ 0. Hence, lifted inequalities will be valid whenever the lifting coefficients
(αi, βi) of (xi, yi) for i ∈ N \ C satisfy the condition
αixi + βiyi ≥ Ψ(aixiyi) ≥ Φ(aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {0, 0}. (5–71)
Condition (5–71) can be restated as
βiφ ≥ Ψ(0) ≥ Φ(0) for 0 < φ ≤ 1, (5–72)
αi + βiφ ≥ Ψ(aiφ) ≥ Φ(aiφ) for 0 ≤ φ ≤ 1. (5–73)
To prove that a lifted inequality defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) under condition (5–70), we show that at least one point (xi, yi) is always tight in (5–71) and that two points (xi, yi) are tight in (5–71) if Assumption (i) or (ii) holds. First, consider the case where (αi, βi) = (0, ai/µ̄). Since µ̄ > 0 and αi + βiφ = βiφ = (ai/µ̄)φ ≥ Ψ(aiφ) ≥ Ψ(0) ≥ Φ(0), conditions (5–72) and (5–73) are clearly satisfied. Condition (5–71) is always satisfied at equality for the point (1, 0). Further, if Assumption (i) or (ii) holds, (5–69) is facet-defining since another tight point is added in the lifting procedure. More precisely, since Ψ(w) = w/µ̄ for 0 ≤ w ≤ µ̄, we see that aiφ/µ̄ = Ψ(aiφ) = Φ(aiφ) if φ = min{1, µ̄/ai}, which shows that (5–71) is satisfied at equality for the point (1, min{1, µ̄/ai}). Next, consider the case where (αi, βi) = (Ψ(ai), 0). It is easily verified that (5–71) is satisfied at equality for the point (0, 1) since βi = 0. Finally, consider
(αi, βi) = ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai )
for i ∈ N \ C. Clearly, (αi, βi) satisfies (5–72) since βi ≥ 0. From Lemma 5.2, we have that

Φ(aiφ) ≤ Ψ(aiφ) ≤ Ψ(W^i_j) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] (aiφ − W^i_j)
        = ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j ) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] aiφ
        = αi + βiφ.

We also verify that (5–71) is satisfied at equality for the two points (1, W^i_j/ai) and (1, W^i_{j+1}/ai), except for the case where j = wi if I(ai) = 0. Therefore, we conclude that (5–69) defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) and is facet-defining for PB if Assumption (i) or (ii) is satisfied for all i ∈ N \ C.
Inequalities (5–69) can be facet-defining depending on the value of the coefficients ai
and the choice of lifting coefficients (αi, βi) for i ∈ N \ C. We mention that inequality
(5–69) can be facet-defining even if the assumptions (i) and (ii) of Theorem 5.4 are not
satisfied. We present an example with this property next. A similar example is given for
the 0−1 knapsack polytope by Gu et al. [62]; see the example following Theorem 6.
Example 5.6. For the bilinear set B in Example 5.2, consider the cover C = {3, 4, 5}. We can easily verify that C satisfies Assumptions (C1) and (C2). It follows from Proposition 5.11 that

x3 + x4 + x5 ≥ 2    (5–74)
is facet-defining for PB({1, 2}, ∅, {1, 2}, ∅). Using the result of Theorem 5.3, the valid subadditive approximation Ψ(w) of the lifting function Φ(w) is given by:

Ψ(w) =
  w/3              if 0 ≤ w < 3,
  1                if 3 ≤ w < 17,
  2 − (20 − w)/3   if 17 ≤ w < 20,
  2                if 20 ≤ w.
Applying Theorem 5.4, we obtain the following nine inequalities

{ (21/3)y1 ; (14/17)x1 + (21/17)y1 ; 2x1 } + { (19/3)y2 ; (21/24)x2 + (19/24)y2 ; (5/3)x2 } + x3 + x4 + x5 ≥ 2

(one inequality for each way of selecting a term from each brace),
which define faces of PB of dimension at least 8. Since 2 ∈ N \ C and a2 ≥ µ̄ with I(a2) = 1, it follows from Theorem 5.4 that the three inequalities
{ (21/3)y1 ; (14/17)x1 + (21/17)y1 ; 2x1 } + (19/3)y2 + x3 + x4 + x5 ≥ 2
are facet-defining for PB. The two inequalities
(21/3)y1 + (21/24)x2 + (19/24)y2 + x3 + x4 + x5 ≥ 2  and  (21/3)y1 + (5/3)x2 + x3 + x4 + x5 ≥ 2
are also facet-defining for PB although they do not satisfy the assumptions in Theorem 5.4.
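As with Example 5.3, the validity of all nine inequalities can be checked by sampling. The sketch below assumes the data of Example 5.2, a = (21, 19, 17, 15, 10) and d = 20, and tests every combination of coefficient choices on random feasible points of B:

```python
import itertools
import random

a, d = [21, 19, 17, 15, 10], 20      # assumed data of Example 5.2

# Coefficient choices for (x1, y1) and (x2, y2) read off Example 5.6.
choices = {
    0: [(0.0, 21.0 / 3.0), (14.0 / 17.0, 21.0 / 17.0), (2.0, 0.0)],
    1: [(0.0, 19.0 / 3.0), (21.0 / 24.0, 19.0 / 24.0), (5.0 / 3.0, 0.0)],
}

random.seed(2)
points = []
while len(points) < 2000:
    x = [random.randint(0, 1) for _ in range(5)]
    y = [random.random() for _ in range(5)]
    if sum(a[j] * x[j] * y[j] for j in range(5)) >= d:   # (x, y) is a point of B
        points.append((x, y))

ok = all(
    c1[0] * x[0] + c1[1] * y[0] + c2[0] * x[1] + c2[1] * y[1]
    + x[2] + x[3] + x[4] >= 2 - 1e-9
    for c1, c2 in itertools.product(choices[0], choices[1])
    for (x, y) in points
)
print(ok)  # -> True
```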
5.4 New Facet-Defining Inequalities for a Single-node Flow Model
In Section 5.3, we derived strong valid inequalities for the bilinear set B using lifting.
In this section, we show that many of these lifted inequalities are also facet-defining for the
convex hull of the single-node flow model without inflows

F = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^{n} ajyj ≥ d, xj ≥ yj ∀j ∈ N }.
To eliminate uninteresting cases, we assume that ∑_{j=1}^{n} aj > d + ai for all i ∈ N. In the following lemma, we first show that F ⊆ B.
Lemma 5.3. The bilinear covering set B is a relaxation of the single-node flow set F .
Proof. We prove that F ⊆ B. Let (x, y) be an arbitrary point of F. Clearly, x ∈ {0, 1}^n and y ∈ [0, 1]^n. It therefore remains to show that ∑_{j=1}^{n} ajxjyj ≥ d. Let N0 = {j ∈ N | xj = 0} and N1 = {j ∈ N | xj = 1}. Since (x, y) ∈ F, yj = 0 for all j ∈ N0. Then,

∑_{j=1}^{n} ajxjyj = ∑_{j∈N1} ajyj = ∑_{j=1}^{n} ajyj ≥ d,

where the last inequality holds because (x, y) ∈ F.
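Lemma 5.3 is easy to exercise numerically: any sampled point of F must satisfy the bilinear cover constraint of B. The sketch below uses hypothetical data (chosen to match Example 5.2 for concreteness):

```python
import random

random.seed(3)
a, d, n = [21, 19, 17, 15, 10], 20, 5   # hypothetical data, as in Example 5.2

checked = 0
for _ in range(20000):
    x = [random.randint(0, 1) for _ in range(n)]
    # Enforce x_j >= y_j by drawing y_j = 0 whenever x_j = 0.
    y = [random.random() * x[j] for j in range(n)]
    if sum(a[j] * y[j] for j in range(n)) >= d:           # (x, y) ∈ F
        # Lemma 5.3: the point must also satisfy the constraint defining B.
        assert sum(a[j] * x[j] * y[j] for j in range(n)) >= d - 1e-9
        checked += 1
print("verified F ⊆ B on", checked, "sampled points")
```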
The general single-node flow set is important in mixed-integer programming since it
can be used as a relaxation of any inequality in a 0−1 mixed-integer program. Further,
these sets naturally arise in fixed-charge network problems; see [6, 61, 81, 83, 97]. The
single-node flow set F without inflows was first studied by Padberg et al. [97] under the assumptions that (i) ai ≤ d and (ii) ∑_{j=1}^{n} aj > d + ai for all i ∈ N. In particular, the authors show that the inequalities ∑_{j=1}^{n} ajyj ≥ d, xi ≥ yi, xi ≤ 1, and yi ≥ 0 for all i ∈ N are facets of PF := conv(F). In the remainder of this section, we refer to these
inequalities as trivial facets of PF . Padberg et al. [97] also prove the following result.
Lemma 5.4 (Proposition 8 in Padberg et al. [97]). If αx + βy ≥ δ is a nontrivial facet of
PF , then α ≥ 0, β ≥ 0, and δ > 0.
Proposition 5.23. Let αx + βy ≥ δ be a nontrivial facet of PF . Then, αx + βy ≥ δ is
also facet-defining for PB.
Proof. To prove αx + βy ≥ δ defines a facet of PB, it suffices to show that αx + βy ≥ δ is
valid for B since F ⊆ B by Lemma 5.3. If (x, y) ∈ F , then it is clear that αx + βy ≥ δ.
Therefore, consider the case where (x, y) ∈ B \ F. Let N0 = {j ∈ N | xj = 0} and N1 = {j ∈ N | xj = 1}. We have that ∑_{j=1}^{n} ajxjyj = ∑_{j∈N1} ajyj ≥ d and yj ≤ 1 for all j ∈ N1. Further, yi > 0 for some i ∈ N0 since otherwise (x, y) ∈ F. Define (x̄, ȳ) such that x̄ = x, ȳj = yj for all j ∈ N1, and ȳj = 0 for all j ∈ N0. Since (x̄, ȳ) ∈ F, we obtain that αx̄ + βȳ ≥ δ. Further, since β ≥ 0 from Lemma 5.4, we obtain that αx + βy ≥ αx̄ + βȳ ≥ δ. We conclude that αx + βy ≥ δ is valid for PB.
We note that Proposition 5.23 is slightly surprising in light of Lemma 5.3 because, on one hand, F ⊊ B and, on the other hand, the nontrivial facets of PF are facets of PB. In other words, the structure of PF can be determined from PB by including the trivial facets of PF. As a consequence of Proposition 5.23, we can exploit studies of PF to obtain facets of PB.
Example 5.7. Consider the single-node flow set

F = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19y1 + 17y2 + 15y3 + 10y4 ≥ 20, xj ≥ yj, j = 1, . . . , 4 },

corresponding to the bilinear covering set B discussed in Example 5.1. We obtained the linear description of PF using PORTA. This linear description is given in the Appendix. We observe that inequalities (5–9), (5–10), (5–16), and (5–17) are facets for both PB and PF. However, it can be verified that inequalities (5–11), (5–12), (5–14), and (5–15) are facet-defining for PB but not for PF.
We mention that the inequalities of PF described in the Appendix have been
numbered according to their counterparts in PB. For the set F , Padberg et al. [97]
derived a family of facet-defining inequalities that we describe in the following proposition.
Proposition 5.24 (Adapted from Proposition 12 in Padberg et al. [97]). Assume that (i) C is a cover with excess µ = ∑_{j∈C} aj − d such that a = max_{j∈C} aj > µ, and (ii) L ⊆ N \ C is such that 0 < a − µ < ak ≤ a for all k ∈ L and ∑_{j∈N\L} aj > d + a. Then

∑_{j∈C} (aj − µ)+xj + ∑_{j∈L} (a − µ)xj + ∑_{j∈N\(C∪L)} ajyj ≥ ∑_{j∈C} (aj − µ)+    (5–75)
is facet-defining for PF .
We next show that inequalities (5–75) can be obtained as lifted bilinear cover
inequalities (5–47).
Proposition 5.25. Inequality (5–75) can be obtained as a lifted bilinear cover inequality
(5–47) of PB.
Proof. Let C and L ⊆ N \ C be given that satisfy conditions (i) and (ii) of Proposition 5.24. Observe that ∑_{j∈N\(C∪L)} aj > a − µ since ∑_{j∈N\L} aj > d + a. Define C = C, M = L, and T = N \ (C ∪ L). Clearly, µ = µ and (C, M, T) is a partition of N that satisfies Assumptions (A1), (A2), and (A3) in Theorem 5.1. We obtain from Assumption (ii) that A1 − µ < ai ≤ A1 < A2 − µ for i ∈ M, which implies that qi = 1 for all i ∈ M in Lemma 5.1. Further, since Q^i_1 = A1 − µ and Q^i_2 = ai for i ∈ M, we can select (αi, βi) as (A1 − µ, 0). Since A1 − µ = a − µ, we obtain that (5–75) is a lifted bilinear cover inequality (5–47).
We next characterize the lifted bilinear cover inequalities of PB that also yield facets
for PF .
Theorem 5.5. A lifted bilinear cover inequality (5–47) is facet-defining for PF if and only if

(αi, βi) ∈ {(0, ai)} ∪ ⋃_{j=1}^{qi} { ( PC(Q^i_j) − [(PC(Q^i_{j+1}) − PC(Q^i_j))/a_{j+1}] Q^i_j , [(PC(Q^i_{j+1}) − PC(Q^i_j))/a_{j+1}] ai ) }    (5–76)

for all i ∈ M.
Proof. It is obvious that (5–47) is valid for F since F ⊆ B as described in Lemma 5.3.
We show that (5–47) is facet-defining for PF if (αi, βi) for i ∈ M are selected as in
condition (5–76). Recall that (5–47) is obtained in Section 5.3 by lifting the seed
inequality (5–35) which is facet-defining for PB(M,C \ {l},M,C \ {l}). We prove
first that (5–35) is facet-defining for PF(M, C \ {l}, M, C \ {l}). Consider the points p0 = (el, el), pj = (el + ej, el) for j ∈ T, q0 = (el, ((al − µ)/al) el), q1 = (∑_{j∈T} ej, d̄ ∑_{j∈T} ej), and qk = (∑_{j∈T} ej, d̄ ∑_{j∈T} ej + ε((1/a_{k−1}) e_{k−1} − (1/a_k) e_k)) for k = 2, . . . , |T|, where d̄ = (al − µ)/∑_{j∈T} aj and ε > 0. It can be easily verified that these points belong to PB(M, C \ {l}, M, C \ {l}) and are affinely independent. Further, they also belong to F since yj ≤ xj for all j ∈ T and yl ≤ xl. Therefore, (5–35) is facet-defining for PF(M, C \ {l}, M, C \ {l}). Now, it suffices to show that sufficiently many of the tight points added when lifting the variables (xi, yi) for i ∈ M ∪ C \ {l} belong to F. When we lift the variables (xi, yi) fixed at (1, 1) for i ∈ C \ {l} in the proof
of Proposition 5.15, we add the two affinely independent points (0, 0) and(1, (ai−µ)+
ai
)
that both belong to F . When lifting the variables (xi, yi) fixed at (0, 0) for i ∈ M in
Theorem 5.1, we add two points that differ depending on the choice of coefficients (αi, βi).
In the case where (αi, βi) = (0, ai), we add the two points (1, 0) and(1,min{1, A1−µ
ai}),
which belong to F . For the remaining cases in (5–76), we add the two points(1,
Qij
ai
)and
(1,
Qij+1
ai
)that both belong to F .
Next, we show that if (5–47) is facet-defining for PF , then (αi, βi) must be chosen
as in (5–76). It suffices to show that if (αi, βi) = (PC(ai), 0) for some i ∈ M such that
PC(ai) ≠ PC(Q^i_{qi}), then (5–47) is not facet-defining for PF . We will do so by showing that
in this case (5–47) can be obtained by combining another inequality of the form (5–47) for
PF and trivial facets yi ≤ xi of PF . Assume for simplicity that only one pair (αm, βm) of
lifting coefficients is chosen to be (PC(am), 0). If qm = 0 (i.e., 0 < am < A1 − µ), then
inequality (5–47) defined with (αm, βm) = (0, am) is

am ym + ∑_{i∈C} (ai − µ)+ xi + ∑_{j∈T} aj yj + ∑_{i∈M\{m}} αi xi + ∑_{i∈M\{m}} βi yi ≥ ∑_{i∈C} (ai − µ)+    (5–77)

and is facet-defining for PF . Since PC(am) = am for 0 < am < A1 − µ, the inequality
(5–47) with (αm, βm) = (PC(am), 0) can be obtained by combining the inequality (5–77)
and am(xm − ym) ≥ 0. If qm > 0, then we have that Q^m_{qm} = A_{qm} − µ, Q^m_{qm+1} = am, and
a^m_{qm+1} = am − Q^m_{qm}. When A_{qm} < am < A_{qm+1} − µ, inequality (5–47) reduces to:
( PC(Q^m_{qm}) − [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] Q^m_{qm} ) xm
+ ( [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] am ) ym + ∑_{i∈C} (ai − µ)+ xi + ∑_{j∈T} aj yj    (5–78)
+ ∑_{i∈M\{m}} αi xi + ∑_{i∈M\{m}} βi yi ≥ ∑_{i∈C} (ai − µ)+

and is facet-defining for PF . Further, we can obtain inequality (5–47) with (αm, βm) =
(PC(am), 0) by combining (5–78) and

( [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] am ) (xm − ym) ≥ 0,
since PC(am) − PC(Q^m_{qm}) > 0. If there exist several m ∈ M with (αm, βm) = (PC(am), 0),
then we can apply the same idea for each m since the coefficients of the variables other than
(xm, ym) are unchanged at each step of the argument.
We show in the following example that the family of lifted bilinear cover inequalities
is larger than (5–75).
Example 5.8. As established in Example 5.7, (5–9) and (5–10) are facet-defining
lifted bilinear cover inequalities (5–47) for both PB and PF that are obtained by choos-
ing (C,M, T ) = ({3, 4}, {1}, {2}) and (C,M, T ) = ({2, 4}, {1}, {3}) respectively in
Theorem 5.1. However, (5–9) and (5–10) cannot be obtained using the results of Propo-
sition 5.24 since in (5–75) at most one of the coefficients of xi and yi is nonzero for all
i ∈ N .
Other families of inequalities are known for PF . In particular, Gu et al. [61] studied
the general single-node flow set
G = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j∈N+} aj yj − ∑_{j∈N−} aj yj ≤ d, xj ≥ yj ∀j ∈ N },
where N = N+ ∪ N− and n = |N |. The flow set F we study can be obtained from G
by restricting the set of inflows to be empty, i.e., setting N+ = ∅ and replacing d with −d. Using
sequence-independent lifting techniques, the authors derived two families of strong valid
inequalities for G. Among them, only one applies to F . We investigate this family of lifted
simple generalized flow cover inequalities (LSGFCI) next. We first show in Proposition 5.27 that
LSGFCIs are facet-defining for PF under an additional assumption. We then derive these
inequalities from lifted reverse bilinear cover inequalities.
We now briefly review the work of Gu et al. [61]. For the general single-node flow
set G, a set C = C+ ∪ C− is called a generalized cover if C+ ⊆ N+, C− ⊆ N−, and
∑_{j∈C+} aj − ∑_{j∈C−} aj = d + λ with λ > 0; see Van Roy and Wolsey [127] and Nemhauser
and Wolsey [91]. For the special case where N+ = ∅ in G, a generalized cover of F is
defined as C ⊆ N such that ∑_{j∈C} aj = d − λ with λ > 0. Given a generalized cover C for
F , we obtain the following single-node flow model

F 0 = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j∈N\C} aj yj ≥ d − ∑_{j∈C} aj = λ, xj ≥ yj ∀j ∈ N \ C },

by fixing the variables (xj , yj) to (1, 1) for all j ∈ C. This set is denoted by X0 in Gu
et al. [61]. Note that F 0 is full-dimensional since F is assumed to be full-dimensional, i.e.,
∑_{j∈N\C} aj = ∑_{j∈N} aj − d + λ > ai + λ for all i ∈ N . The simple generalized flow cover
inequality (SGFCI)

∑_{j∈L} λxj + ∑_{j∈N\(C∪L)} aj yj ≥ λ    (5–79)
is not always facet-defining for conv(X0) where L = {j ∈ N \ C | aj > λ}. In
Proposition 5.26 below, we prove that (5–79) is facet-defining for PF under the condition
that ∑_{j∈N\L} aj > d.
Proposition 5.26. The simple generalized flow cover inequality (SGFCI)

∑_{j∈L} d xj + ∑_{j∈N\L} aj yj ≥ d    (5–80)

is valid for PF where L = {j ∈ N | aj > d}. Further, (5–80) is facet-defining for PF if
∑_{j∈N\L} aj > d.
Proof. Validity follows from Corollary 4 in Van Roy and Wolsey [127]. We prove that
(5–80) is facet-defining if ∑_{j∈N\L} aj > d. Consider first the case where L = ∅. Then,
the result is obvious since (5–80) is one of the trivial facets of PF . If L ≠ ∅, then we
show that (5–80) is facet-defining for PF using the same arguments as in the proof of
Proposition 5.9. In particular, consider the 2n points pl, p̄l for all l ∈ L and qk, q̄k for all
k ∈ N \ L used in the proof of Proposition 5.9. It can be easily verified that these points
belong to F . It follows that (5–80) is facet-defining for PF .
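As a numerical sanity check of Proposition 5.26, one can sample feasible points of F and verify that (5–80) is never violated. The following Python sketch does so on hypothetical data a = (6, 4, 3, 2), d = 5 (so that L = {1}); it merely illustrates the claim and is not part of the proof:

```python
import random

def sgfci_holds(a, d, x, y, tol=1e-9):
    """Evaluate SGFCI (5-80): sum_{j in L} d*x_j + sum_{j not in L} a_j*y_j >= d,
    where L = {j : a_j > d}."""
    L = {j for j in range(len(a)) if a[j] > d}
    lhs = sum(d * x[j] if j in L else a[j] * y[j] for j in range(len(a)))
    return lhs >= d - tol

random.seed(0)
a, d = [6.0, 4.0, 3.0, 2.0], 5.0   # hypothetical data; a_1 > d, so L = {1}
violations = 0
for _ in range(2000):
    x = [random.randint(0, 1) for _ in a]
    y = [random.uniform(0.0, xj) for xj in x]       # enforce y_j <= x_j
    if sum(aj * yj for aj, yj in zip(a, y)) >= d:   # keep only points of F
        if not sgfci_holds(a, d, x, y):
            violations += 1
assert violations == 0
```

A strictly positive count of violations would witness an invalid cut; validity of (5–80) guarantees the count stays zero.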
Proposition 5.27 (Extended from Theorem 12 in Gu et al. [61]). Assume that (i) C ⊆ N
is a generalized cover for F such that ∑_{j∈C} aj = d − λ with λ > 0 and (ii) L ≠ ∅ and
∑_{j∈N\L} aj > d where L = {j ∈ N \ C | aj > λ}. Assume also that L = {j1, j2, . . . , jr} with
a_{ji} ≥ a_{ji+1} for i = 1, . . . , r − 1. Let r = |L|, A0 = 0, and Ai = ∑_{k=1}^{i} a_{jk} for i = 1, . . . , r.
Further, let d′ = ∑_{j∈N\C} aj − λ. Define

f(z) =
  iλ            if Ai ≤ z ≤ Ai+1 − λ, i = 0, . . . , r − 1,
  z − Ai + iλ   if Ai − λ ≤ z ≤ Ai, i = 1, . . . , r − 1,
  z − Ar + rλ   if Ar − λ ≤ z ≤ d′.
                                                            (5–81)

Then, the lifted simple generalized flow cover inequality (LSGFCI)

∑_{j∈L} λxj + ∑_{j∈C} f(aj)xj + ∑_{j∈N\(C∪L)} aj yj ≥ λ + ∑_{j∈C} f(aj)    (5–82)
is facet-defining for PF .
Proof. Given a generalized cover C for F , after fixing (xj , yj) = (1, 1) for all j ∈ C,
we obtain from Proposition 5.26 that (5–79) is facet-defining for conv(F 0). To lift the
variables (xi, yi) for i ∈ C, we compute the lifting function f(z) associated with (5–79) as

f(z) = min  −λ + ∑_{j∈L} λxj + ∑_{j∈N\(C∪L)} aj yj
       s.t. ∑_{j∈N\C} aj yj ≥ λ + z
            yj ≤ xj , ∀j ∈ N \ C,
            xj ∈ {0, 1}, yj ∈ [0, 1], ∀j ∈ N \ C.
Since L 6= ∅, it follows from Theorem 10 in Gu et al. [61] that the lifting function f(z)
is given by (5–81). Further, f(z) is superadditive over R+ since the condition (1) of
Corollary 2 in Gu et al. [61] holds. Therefore, we conclude from Proposition 5.13 that
(5–82) is facet-defining for PF .
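For illustration, the piecewise formula (5–81) transcribes directly into code. The sketch below is a hypothetical Python rendering (the data a_L = (5, 4), λ = 2, d′ = 12 are invented for the example) and follows the index ranges stated above:

```python
def lsgfci_lift(z, a_L, lam, d_prime):
    """Lifting function f(z) of (5-81); a_L lists a_{j_1} >= ... >= a_{j_r}."""
    r = len(a_L)
    A = [0.0]                                  # A_0 = 0, A_i = a_{j_1}+...+a_{j_i}
    for aj in a_L:
        A.append(A[-1] + aj)
    for i in range(r):                         # flat pieces: f = i*lambda
        if A[i] <= z <= A[i + 1] - lam:
            return i * lam
    for i in range(1, r):                      # sloped pieces just below each A_i
        if A[i] - lam <= z <= A[i]:
            return z - A[i] + i * lam
    if A[r] - lam <= z <= d_prime:             # final sloped piece up to d'
        return z - A[r] + r * lam
    raise ValueError("z outside [0, d']")

# With a_L = (5, 4) and lam = 2: A = (0, 5, 9)
assert lsgfci_lift(0.0, [5.0, 4.0], 2.0, 12.0) == 0.0
assert lsgfci_lift(4.0, [5.0, 4.0], 2.0, 12.0) == 1.0
assert lsgfci_lift(6.0, [5.0, 4.0], 2.0, 12.0) == 2.0
assert lsgfci_lift(8.0, [5.0, 4.0], 2.0, 12.0) == 3.0
```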
We show next that inequalities of the form (5–82) for PF are lifted reverse bilinear
cover inequalities for PB. The proof uses the observation that a cover C̄ of the bilinear set
B can be obtained from a generalized cover C of the flow set F by adding one element l of
L, i.e., C̄ = C ∪ {l} where l ∈ L.

Proposition 5.28. Inequality (5–82) for PF can be obtained as a lifted reverse bilinear
cover inequality (5–59) of PB.

Proof. For a given generalized cover C of F , we define a cover C̄ of B as C̄ = C ∪ {l}
for l ∈ L, which is possible since L ≠ ∅, and ∑_{j∈C̄} aj = d + µ̄ > d where µ̄ = al − λ > 0. The cover
C̄ satisfies Assumption (i) in Theorem 5.2 because al − µ̄ = λ > 0 for l ∈ C̄. Further,
we can set M̄ = L \ {l} in (5–59). Assumption (ii) also holds since
∑_{j∈N\(L\{l})} aj − d = ∑_{j∈N\L} aj + al − d > 0. Next, we observe that C̄ ∪ M̄ = C ∪ L and
that min{ai, al − µ̄} = al − µ̄ = λ for all i ∈ M̄ . Substituting al − µ̄ = λ in Proposition 5.19,
we obtain that PM (w) = g(w) since M̄ ∪ {l} = L. Therefore, we conclude that (5–82) can
be obtained as a lifted reverse bilinear cover inequality (5–59).
5.5 Concluding Remarks
In this chapter, we studied the polyhedral structure of the 0−1 mixed-integer bilinear
covering set. We described the convex hull of this set when n = 2. We also presented
three families of lifted inequalities obtained using sequence-independent lifting. Among
them, two families have an exponential number of members. We also studied the relations
between 0−1 mixed-integer bilinear covering sets and single-node flow sets without inflows.
In particular, we showed that inequalities for bilinear sets are valid for flow sets and we
proved that all nontrivial facets of PF can be obtained through the study of PB. We then
showed that the inequalities we derived generalize classical families of lifted flow cover
inequalities for PF .
CHAPTER 6
A COMPUTATIONAL STUDY OF LIFTED INEQUALITIES FOR 0-1
MIXED-INTEGER BILINEAR COVERING SETS 1
6.1 Introduction
In this chapter, we seek to evaluate the quality of the valid inequalities we derived in
Chapter 5 for problems that contain bilinear covering constraints of the form
∑_{j=1}^{n} aj xj yj ≥ d    (6–1)
where aj > 0, xj ∈ {0, 1}, and yj ∈ [0, 1] for j = 1, . . . , n. To this end, we will
consider several randomly generated instances that we will solve with branch-and-bound
with and without the addition of our cuts. In Section 6.2, we describe a set S that is a
generalization of the bilinear covering set B that includes additional linear terms. This
set appears naturally during the branch-and-bound process as we describe in Section 6.2.
We then show that two families of inequalities we derived in Chapter 5 have natural
counterparts for S. This extends the applicability of our results from the root node of the
branch-and-bound tree to any node inside of the tree. In Section 6.3, we describe a family
of randomly generated problems that we use to test the strength of our cuts. We then
present the result of a computational study on these instances. In particular, we compare
our results to those obtained when linearizing the bilinear terms. In Section 6.4, we give
concluding remarks.
6.2 Generalization to Bilinear Constraints with Linear Terms
In practice, there are two common ways of using cuts inside of a branch-and-bound
framework. In the first, cutting planes are only added at the root node of the tree. This
variant is often referred to as cut-and-branch. In the second, cutting planes are added
throughout the tree. The generic term branch-and-cut is usually reserved for this variant.
1 The material of this chapter is based on [33].
Note that, if we aim to design a cut-and-branch algorithm for problems containing
(6–1), the results of Chapter 5 are sufficient. However, if our goal is to apply cuts inside of
a branch-and-cut framework, then we must also investigate what becomes of the constraint
(6–1) inside of the tree.
Assume for example that we decide to branch on variable x1 at the root node. The
two branches created will now have the restrictions: (i) x1 = 0 and (ii) x1 = 1. In the
former case, (6–1) reduces to

∑_{j=2}^{n} aj xj yj ≥ d,
which has the same form as (6–1). However, in the latter case, after branching, constraint
(6–1) will be of the form

∑_{j=2}^{n} aj xj yj + a1 y1 ≥ d, (6–2)

which does not follow the template set in (6–1).
Similarly, when branching on the continuous variable y1 at branching point ω ∈ (0, 1),
we obtain two branches where (i) y1 ≤ ω and (ii) y1 ≥ ω. In the former case, after
re-scaling the variable y1, we can write

ω a1 x1 ȳ1 + ∑_{j=2}^{n} aj xj yj ≥ d,

where ȳ1 = y1/ω and 0 ≤ ȳ1 ≤ 1. This constraint is of the form (6–1). When branching on
y1 ≥ ω, we introduce the new variable ȳ1 = (y1 − ω)/(1 − ω). Substituting in (6–1), we obtain

a1 x1 ((1 − ω) ȳ1 + ω) + ∑_{j=2}^{n} aj xj yj ≥ d,

i.e.,

(1 − ω) a1 x1 ȳ1 + ∑_{j=2}^{n} aj xj yj + ω a1 x1 ≥ d, (6–3)

where 0 ≤ ȳ1 ≤ 1. This expression again does not conform with (6–1). This suggests that,
when using branch-and-cut, it is useful to consider the generalization of the constraint
(6–1) that is given by

∑_{j∈J} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ d (6–4)
where N = I ∪ J . We assume that J 6= ∅ but I might be empty. Note that (6–4) contains
linear terms in both x and y. The linear terms in yj are not associated with any bilinear
term xjyj as can be observed in (6–2). Linear terms in xj however always appear with a
corresponding bilinear term xjyj as can be observed in (6–3).
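The change of variables behind (6–3) can be checked numerically. The short Python sketch below (with hypothetical values a1 = 3 and ω = 0.4) confirms that the term a1 x1 y1 of (6–1) equals the two terms (1 − ω) a1 x1 ȳ1 + ω a1 x1 of (6–3) under ȳ1 = (y1 − ω)/(1 − ω):

```python
a1, w = 3.0, 0.4                 # hypothetical coefficient and branching point
for x1 in (0, 1):
    for ybar1 in (0.0, 0.25, 0.5, 1.0):      # rescaled variable in [0, 1]
        y1 = (1 - w) * ybar1 + w             # recover the original y1 >= w
        term_61 = a1 * x1 * y1                              # term in (6-1)
        term_63 = (1 - w) * a1 * x1 * ybar1 + w * a1 * x1   # terms in (6-3)
        assert abs(term_61 - term_63) < 1e-12
```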
As a result, we consider in this section the set

S := { (x, y) ∈ {0, 1}^n × [0, 1]^{n+m} | ∑_{j∈J} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ d },
where J = {1, . . . , n}, I = {n + 1, . . . , n + m}, aj > 0, bj ≥ 0, and d > 0. Similar to
Chapter 5, we require
Assumption 6.1. ∑_{j∈J} (aj + bj) + ∑_{j∈I} aj ≥ d + ai + bi for all i ∈ J .
We denote the convex hull of the set S as PS := conv(S). Assumption 6.1 guarantees
that PS is full-dimensional, i.e., dim(PS) = 2n + m. We now derive facet-defining
inequalities for PS using sequence-independent lifting. Proposition 6.1 describes the basic
constraint we use to generate the seed inequality of our lifting procedures.
Proposition 6.1. Assume that (i) a1 + b1 > d, (ii) ∑_{j∈J\{1}} (aj + bj) + ∑_{j∈I} aj > d, and
(iii) ∑_{j∈J\{1}} bj < d. Then,

( d − ∑_{j∈J\{1}} bj ) x1 + ∑_{j∈J\{1}} aj yj + ∑_{j∈I} aj yj ≥ d − ∑_{j∈J\{1}} bj (6–5)
is facet-defining for PS.
Proof. We first show that (6–5) is valid for S. Assume for a contradiction that there exists
(x′, y′) ∈ S such that

( d − ∑_{j∈J\{1}} bj ) x′1 + ∑_{j∈J\{1}} aj y′j + ∑_{j∈I} aj y′j < d − ∑_{j∈J\{1}} bj .
Clearly, x′1 = 0. It follows that
d > ∑_{j∈J\{1}} (aj y′j + bj) + ∑_{j∈I} aj y′j ≥ ∑_{j∈J\{1}} (aj x′j y′j + bj x′j) + ∑_{j∈I} aj y′j = ∑_{j∈J} (aj x′j y′j + bj x′j) + ∑_{j∈I} aj y′j .
This is a contradiction to the fact that (x′, y′) ∈ S.
Next, we prove that (6–5) is facet-defining for PS by providing 2n + m points in S
satisfying (6–5) at equality such that the solutions (α, β, δ) to the system αxi + βyi = δ
for i = 1, . . . , 2n + m yield inequalities αx + βy ≥ δ that are scalar multiples of (6–5).
Consider the two points p1 = (e1, e1) and q1 = (e1, (1 − ε)e1) where ε > 0 is sufficiently
small. Clearly, p1 and q1 belong to S because of (i) and satisfy (6–5) at equality. From p1
and q1, we obtain that α1 + β1 = δ and α1 + (1 − ε)β1 = δ, which implies that α1 = δ and
β1 = 0. Next, we define the n − 1 points pk = (e1 + ek, e1) for k = 2, . . . , n. Finally, let
d̄ = (d − ∑_{j∈J\{1}} bj) / ∑_{j∈I∪J\{1}} aj . From assumptions (ii) and (iii), it is easily verified that 0 < d̄ < 1. For
l = 2, . . . , n + m, we construct the n + m − 1 points
q_l = ( ∑_{j∈J\{1}} ej , d̄ ∑_{j∈I∪J\{1}} ej )   when l = 2

and

q_l = ( ∑_{j∈J\{1}} ej , d̄ ∑_{j∈I∪J\{1}} ej + ε ( (1/a_{l−1}) e_{l−1} − (1/a_l) e_l ) )   when l ≥ 3
where ε > 0 is sufficiently small. It can be verified that the points pk and ql belong to S
and satisfy (6–5) at equality. From pk for k = 2, . . . , n, we obtain that α1 + αk + β1 = δ,
which implies that αk = 0 for k = 2, . . . , n since α1 = δ and β1 = 0. Further, using the
points ql for l = 2, . . . , n + m, we obtain the system of equations:

∑_{j∈J\{1}} αj + d̄ ∑_{j∈I∪J\{1}} βj = δ, (6–6)

∑_{j∈J\{1}} αj + d̄ ∑_{j∈I∪J\{1}} βj + ε ( β_{l−1}/a_{l−1} − β_l/a_l ) = δ, l = 3, . . . , n + m. (6–7)

By subtracting (6–6) from (6–7), we conclude that there exists θ such that β_{l−1}/a_{l−1} = β_l/a_l = θ
for l = 3, . . . , n + m. Substituting αk = 0 for k = 2, . . . , n and βl = θ al for l = 2, . . . , n + m
in (6–6), we obtain that d̄ θ ∑_{j∈I∪J\{1}} aj = δ, i.e., θ = δ / (d − ∑_{j∈J\{1}} bj). It follows that
βl = (δ / (d − ∑_{j∈J\{1}} bj)) al for all l = 2, . . . , n + m. Therefore, we conclude that α1 = δ,
β1 = 0, αk = 0 for k = 2, . . . , n, and βl = (δ / (d − ∑_{j=2}^{n} bj)) al for l = 2, . . . , n + m, which proves
that (6–5) is facet-defining for PS.
Observe that, when I = ∅ and bj = 0 for j ∈ J , (6–5) is an inequality of the form
(5–23). We will use (6–5) to construct the seed inequality of our lifting procedure in a way
that is analogous to that we used in Chapter 5.
In the ensuing discussions, we use the following notation. For J0, J1 ⊆ J with
J0 ∩ J1 = ∅, J̄0, J̄1 ⊆ J with J̄0 ∩ J̄1 = ∅, and I1 ⊆ I, we define

S(J0, J1; J̄0, J̄1, I1) := { (x, y) ∈ S | xj = 0 for j ∈ J0, xj = 1 for j ∈ J1, yj = 0 for j ∈ J̄0, yj = 1 for j ∈ J̄1, yj = 1 for j ∈ I1 }.
To build a seed inequality of the form (6–5), we again use the concept of a cover and
adapt it for the set S as follows.
Definition 6.1. We say that C ⊆ J is a cover for S if ∑_{j∈C} (aj + bj) > d. Further, we
define the excess of C as µ = ∑_{j∈C} (aj + bj) − d > 0.
To generate lifted inequalities for PS, we partition the set J into (C,M, T ) and the
set I into (I0, I1) so that the following assumptions are satisfied:

(A1) C is a cover for S with excess µ,
(A2) Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj where l ∈ argmax{aj + bj | j ∈ C} and Al = al + bl,
(A3) ∑_{j∈C∪T} (aj + bj) + ∑_{j∈I} aj > d + Al.
To derive lifted inequalities from (C,M, T ) and (I0, I1), we fix the variables (xj, yj)
for j ∈ M to (0, 0), the variables (xj, yj) for j ∈ C \ {l} to (1, 1), and the variables yj for
j ∈ I1 to 1. The resulting set S(M,C \ {l};M,C \ {l}, I1) is defined by the inequality

al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I0} aj yj ≥ d − ∑_{j∈C\{l}} (aj + bj) − ∑_{j∈I1} aj = Al − µ − ∑_{j∈I1} aj .
Note that the right-hand-side of this expression is nonnegative because of (A2). It follows
from Assumption (A3) that

∑_{j∈T} (aj + bj) + ∑_{j∈I0} aj > d + Al − ∑_{j∈C} (aj + bj) − ∑_{j∈I1} aj = Al − µ − ∑_{j∈I1} aj .
Using the result of Proposition 6.1 with Assumption (A2), we obtain that

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–8)

where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj , is facet-defining for PS(M,C \ {l};M,C \ {l}, I1).

We now lift (6–8) to construct the seed inequality. We first reintroduce the continuous
variables yj for j ∈ I1 in (6–8). The lifting function corresponding to (6–8) is defined as:

L(w) := max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
        s.t. al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I0} aj yj ≥ Al − µ − ∑_{j∈I1} aj − w    (6–9)
             xj ∈ {0, 1} j ∈ {l} ∪ T, yj ∈ [0, 1] j ∈ {l} ∪ T ∪ I0.
Next, we derive a closed-form expression for the function L(w).
Proposition 6.2. The lifting function L(w) is given by

L(w) =
  −∞         if w < −ā − µ̄,
  w + µ̄      if −ā − µ̄ ≤ w < −µ̄,
  0           if −µ̄ ≤ w < 0,
  w           if 0 ≤ w < Al − µ̄,
  Al − µ̄     if Al − µ̄ ≤ w,

where ā = ∑_{j∈T} aj + ∑_{j∈I0} aj and µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj .
Proof. Observe that there exists an optimal solution (x∗, y∗) that satisfies x∗j = 1 for j ∈ T
and y∗l = 1 since the coefficients of xj for j ∈ T and yl in the objective are zero. Hence,
L(w) can be rewritten as

L(w) = max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
       s.t. Al xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ − w    (6–10)
            xl ∈ {0, 1}, yj ∈ [0, 1] j ∈ T ∪ I0.

Now, define ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj) / ā . Since 0 ≤ ȳ ≤ 1, we can further simplify L(w) as:

L(w) = max (Al − µ̄) − { (Al − µ̄) xl + ā ȳ }
       s.t. Al xl + ā ȳ ≥ Al − µ̄ − w
            xl ∈ {0, 1}, ȳ ∈ [0, 1].
This problem has the same structure as (5–37) in Proposition 5.14. Its optimal value can
be obtained similarly and yields the given expression for L(w).
Lemma 6.1. L(w) is subadditive over R− and R+ respectively.
Proof. By setting al = Al, µ = µ̄, and ∑_{j∈T} aj = ā in Proposition 5.14, we conclude that
L(w) is subadditive over R− and R+.
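For concreteness, the closed form of Proposition 6.2 translates directly into code. The following Python sketch is illustrative only; the data ā = 21, µ̄ = 5, Al = 15 are those of Example 6.1 below:

```python
import math

def lifting_L(w, a_bar, mu_bar, A_l):
    """Piecewise closed form of L(w) from Proposition 6.2."""
    if w < -a_bar - mu_bar:
        return -math.inf          # lifting problem infeasible
    if w < -mu_bar:
        return w + mu_bar
    if w < 0:
        return 0.0
    if w < A_l - mu_bar:
        return w
    return A_l - mu_bar           # saturated for large w

# Example 6.1 below: a_bar = 15 + 6 = 21, mu_bar = 5, A_l = 15
assert lifting_L(-10, 21, 5, 15) == -5
assert lifting_L(-2, 21, 5, 15) == 0
assert lifting_L(4, 21, 5, 15) == 4
assert lifting_L(12, 21, 5, 15) == 10
```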
We next lift the variables yj for j ∈ I1 from 1. Lifting is simple to perform since L(w)
is subadditive over R−.
Proposition 6.3. Under Assumptions (A1), (A2), and (A3), the inequality

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–11)

is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅).
Proof. We know from Proposition 6.1 that (6–8) is facet-defining for PS(M,C \ {l};M,C \ {l}, I1). Since L(w) is subadditive over R−, lifting coefficients βi for yi where i ∈ I1 are
valid if

βi(yi − 1) ≥ L(ai yi − ai) for 0 ≤ yi < 1. (6–12)

Since L(w) ≤ 0 for w ≤ 0, L(ai yi − ai) ≤ 0 for all yi ∈ [0, 1). Therefore, coefficients βi = 0
satisfy condition (6–12). Further, (6–12) is satisfied at equality at the point y∗i = (ai − µ̄)/ai .
Therefore, we conclude that (6–11) is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅).
We next compute the lifting function associated with (6–11):

LI(w) := max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ Al − µ − w
              xj ∈ {0, 1} j ∈ {l} ∪ T, yj ∈ [0, 1] j ∈ {l} ∪ T ∪ I.
Observe that this lifting function is very similar in structure to L(w) presented in
(6–9). Therefore, we can adapt the result of Proposition 6.2 as follows.
Lemma 6.2.

LI(w) =
  −∞         if w < −ā − µ̄,
  w + µ̄      if −ā − µ̄ ≤ w < −µ̄,
  0           if −µ̄ ≤ w < 0,
  w           if 0 ≤ w < Al − µ̄,
  Al − µ̄     if Al − µ̄ ≤ w.
Proof. Observe that there exists an optimal solution (x∗, y∗) such that x∗j = 1 for j ∈ T ,
y∗l = 1, and y∗j = 1 for j ∈ I1, since the corresponding coefficients of xj for j ∈ T , yl, and yj
for j ∈ I1 in the objective are zero. Hence, LI(w) is rewritten as

LI(w) = max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
        s.t. Al xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ − w    (6–13)
             xl ∈ {0, 1}, yj ∈ [0, 1] j ∈ T ∪ I0.

Note that this problem has the form of (6–10) and therefore, optimal solutions can be
obtained in the same way. The result follows.
Because the result of Lemma 6.2 is obtained similarly to that of Proposition 6.2, we
conclude from Lemma 6.1 that LI(w) is subadditive over R− and R+.
Lemma 6.3. LI(w) is subadditive over R− and R+ respectively.
We now obtain two different families of inequalities by reintroducing the variables
(xj, yj) for j ∈ M ∪ C \ {l} in different orders.
6.2.1 Generalized Lifted Bilinear Cover Inequalities
To obtain a generalized lifted bilinear cover inequality from the seed inequality (6–11),
we will lift the variables in C \ {l} before lifting the variables in M . Lifting the variables
(xi, yi) for i ∈ C \ {l} is simple since LI(w) is subadditive over R−.
Proposition 6.4. Under Assumptions (A1), (A2), and (A3),

∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ ∑_{j∈C} (aj + bj − µ̄)+ (6–14)

is facet-defining for PS(M, ∅;M, ∅, ∅).
Proof. We know from Proposition 6.3 that (6–11) is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅). Since LI(w) is subadditive over R−, lifting coefficients (αi, βi) of variables (xi, yi)
for i ∈ C \ {l} are valid if they satisfy the condition

αi(xi − 1) + βi(yi − 1) ≥ LI((ai xi yi + bi xi) − (ai + bi)) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(1, 1)}.    (6–15)

Condition (6–15) can be rewritten as:

βi ≤ inf_{0≤φ<1} −LI(ai φ − ai) / (1 − φ),    (6–16)

αi + sup_{0≤φ≤1} βi(1 − φ) ≤ −LI(−(ai + bi)).    (6–17)

Since LI(w) ≤ 0 for w ≤ 0, condition (6–16) is satisfied when βi = 0. In addition, if we
choose αi = −LI(−(ai + bi)), condition (6–17) is satisfied as

αi + sup_{0≤φ≤1} βi(1 − φ) = αi = −LI(−(ai + bi)).

We obtain that αi = (ai + bi − µ̄)+ using the expression of LI(w) given in Lemma 6.2. Since
(6–15) is satisfied at equality at the points (0, 0) and (1, (ai − µ̄)/ai), (6–14) is facet-defining for
PS(M, ∅;M, ∅, ∅).
To obtain facet-defining inequalities for PS, we next lift the remaining variables
(xi, yi) for i ∈ M . The lifting function LC(w) corresponding to (6–14) is defined as

LC(w) := max ∑_{j∈C} (aj + bj − µ̄)+ − { ∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. ∑_{j∈C∪T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ ∑_{i∈C} (ai + bi) − µ − w    (6–18)
              xj ∈ {0, 1} j ∈ C ∪ T, yj ∈ [0, 1] j ∈ (C ∪ T ) ∪ I.
We now derive a closed-form expression for LC(w). We assume without loss of generality
that C = {1, . . . , p} and a1 + b1 ≥ a2 + b2 ≥ . . . ≥ ap + bp. Let q ∈ C be such that
aq + bq > µ̄ ≥ aq+1 + bq+1. Further, define E0 = 0 and Ei = ∑_{j=1}^{i} (aj + bj) for all i ∈ C.
Note that Ep = ∑_{j=1}^{p} (aj + bj) = d + µ.
Proposition 6.5. For w ≥ 0,

LC(w) =
  w − iµ̄      if Ei ≤ w < Ei+1 − µ̄, i = 0, . . . , q − 1,
  Ei − iµ̄     if Ei − µ̄ ≤ w < Ei, i = 1, . . . , q − 1,
  Eq − qµ̄     if Eq − µ̄ ≤ w,

where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj .
Proof. First, observe that there exists an optimal solution (x∗, y∗) of (6–18) in which
x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ C ∪ I1 since the corresponding objective coefficients
are zero. Since aq + bq > µ̄ ≥ aq+1 + bq+1 for q ∈ C, we have that (aj + bj − µ̄)+ = 0 for
j = q + 1, . . . , p, which also implies that there exists an optimal solution (x∗, y∗) in which
x∗j = 1 for j = q + 1, . . . , p. Therefore, using the same notation ā = ∑_{j∈T} aj + ∑_{j∈I0} aj and
ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj)/ā as in Proposition 6.2, LC(w) can be restated as:

LC(w) = max ∑_{j=1}^{q} (aj + bj − µ̄) − { ∑_{j=1}^{q} (aj + bj − µ̄) xj + ā ȳ }
        s.t. ∑_{j=1}^{q} (aj + bj) xj + ā ȳ ≥ ∑_{j=1}^{q} (aj + bj) − µ̄ − w
             xj ∈ {0, 1} j = 1, . . . , q, ȳ ∈ [0, 1].
This lifting function has the form of (5–43). Therefore, the proof of Proposition 5.16 can
be followed to obtain the result.
Using the result of Corollary 5.1, we can easily verify that LC(w) is subadditive over
R+. The result of Proposition 6.5 is illustrated in the following example.
Example 6.1. Consider the set S defined as
(9y1 + 12)x1 + (14y2 + 5)x2 + (15y3 + 2)x3 + (10y4 + 5)x4 + (7y5 + 3)x5 + 6y6 + y7 ≥ 23.

Assume that partitions of J and I are given as (C,M, T ) = ({4, 5}, {1, 2}, {3}) and
(I0, I1) = ({6}, {7}) respectively. It can be easily verified that these partitions satisfy
Assumptions (A1)-(A3) since C is a cover with µ = 2, Al − µ = 15 − 2 > 2 + 1 = b3 + a7,
and ∑_{j∈C∪T} (aj + bj) + ∑_{j∈I} aj = (10 + 5 + 7 + 3 + 15 + 2) + (6 + 1) = 51 > 23 + 15 = d + Al.
We obtain from Proposition 6.4 that the inequality
15y3 + 10x4 + 5x5 + 6y6 ≥ 15 (6–19)

is facet-defining for PS(M, ∅;M, ∅, ∅). Using the result of Proposition 6.5, the lifting
function LC(w) is computed as

LC(w) =
  w       if 0 ≤ w < 10,
  10      if 10 ≤ w < 15,
  w − 5   if 15 ≤ w < 20,
  15      if 20 ≤ w.
Function LC(w) is represented in Figure 6-1.
[Figure 6-1. Lifting function LC(w) of (6–19), with breakpoints E1 − µ̄ = 10, E1 = 15, and E2 − µ̄ = 20.]
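For concreteness, the closed form of Proposition 6.5 can also be evaluated mechanically. The Python sketch below is illustrative only; it takes the values aj + bj for j ∈ C in decreasing order and reproduces the LC(w) values computed above for Example 6.1:

```python
def lifting_LC(w, ab_C, mu_bar):
    """Closed form of LC(w) from Proposition 6.5, for w >= 0.
    ab_C: values a_j + b_j for j in C, sorted in decreasing order."""
    q = sum(1 for v in ab_C if v > mu_bar)      # largest index with a_q+b_q > mu_bar
    E = [0.0]
    for v in ab_C:
        E.append(E[-1] + v)                     # partial sums E_i
    for i in range(q):                          # pieces of slope 1
        if E[i] <= w < E[i + 1] - mu_bar:
            return w - i * mu_bar
    for i in range(1, q):                       # flat pieces just below each E_i
        if E[i] - mu_bar <= w < E[i]:
            return E[i] - i * mu_bar
    return E[q] - q * mu_bar                    # saturated: w >= E_q - mu_bar

# Example 6.1: C = {4, 5}, a_j + b_j = (15, 10), mu_bar = 5
assert lifting_LC(5, [15.0, 10.0], 5.0) == 5
assert lifting_LC(12, [15.0, 10.0], 5.0) == 10
assert lifting_LC(17, [15.0, 10.0], 5.0) == 12
assert lifting_LC(25, [15.0, 10.0], 5.0) == 15
```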
Next, we compute the lifting coefficients of variables (xi, yi) for i ∈ M using LC(w).
Similar to the derivation of lifted bilinear cover inequalities (5–47), lifting coefficients
(αi, βi) for i ∈ M must be chosen to satisfy

αi xi + βi yi ≥ LC(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–20)
For the variables (x1, y1) of Example 6.1, LC(aixiyi + bixi) is represented in Figure 6-2
(a). Observe from Condition (6–20) that lifting coefficients (α1, β1) must be chosen in
such a way that the plane αixi + βiyi overestimates LC(aixiyi + bixi). Using a geometric
interpretation similar to that used in Lemma 5.1, we obtain several possible overestimating
planes as shown in Figure 6-2 (b).
To obtain an overestimating plane αixi+βiyi, we next describe a concave overestimator
of LC(w) over [bi, ai + bi].
Lemma 6.4. Assume that ai > 0 and bi ≥ 0. Define qi and ri as

qi :=
  0   if 0 ≤ bi < E1 − µ̄,
  i   if Ei − µ̄ ≤ bi ≤ Ei+1 − µ̄, i = 1, . . . , q − 1,
  q   if Eq − µ̄ < bi,
[Figure 6-2. Deriving lifting coefficients for Example 6.1: (a) the function LC(9y + 12) over (x, y) ∈ {0, 1} × [0, 1]; (b) overestimating planes αx + βy.]
and
ri :=
  0   if 0 ≤ ai + bi < E1 − µ̄,
  i   if Ei − µ̄ ≤ ai + bi ≤ Ei+1 − µ̄, i = 1, . . . , q − 1,
  q   if Eq − µ̄ < ai + bi.

Let Q^i_0 = bi, Q^i_j = E_{qi+j} − µ̄ for j = 1, . . . , ri − qi, and Q^i_{ri−qi+1} = ai + bi. Further, let
δ^i_j = Q^i_j − Q^i_{j−1} for j = 1, . . . , ri − qi + 1. Define

p^i_j(w) = LC(Q^i_j) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (w − Q^i_j)

for j = 0, . . . , ri − qi. Then, the function

p(w) := min{ p^i_j(w) | j ∈ {0, . . . , ri − qi} }    (6–21)

is a concave overestimator of LC(w) over [bi, ai + bi].
Theorem 6.1. Under Assumptions (A1), (A2), and (A3), a generalized lifted bilinear
cover inequality

∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj + ∑_{i∈M} αi xi + ∑_{i∈M} βi yi ≥ ∑_{j∈C} (aj + bj − µ̄)+    (6–22)

is facet-defining for PS if, for each i ∈ M ,

(αi, βi) ∈ { (LC(ai + bi), 0) } ∪ { (LC(bi), LC(ai + bi) − LC(bi)) }   if qi = ri,
(αi, βi) ∈ { (LC(ai + bi), 0) } ∪ ⋃_{j=0}^{ri−qi} ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) , [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai )   if qi < ri,

where qi, ri, Q^i_j , and δ^i_{j+1} are as defined in Lemma 6.4.
Proof. By subadditivity of LC(w) for w ≥ 0, the lifting coefficients (αi, βi) of (xi, yi) for
i ∈ M will be valid if they satisfy the condition
αi xi + βi yi ≥ LC(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–23)

Condition (6–23) can be rewritten as:

βi φ ≥ LC(0) for 0 < φ ≤ 1, (6–24)

αi + βi φ ≥ LC(ai φ + bi) for 0 ≤ φ ≤ 1. (6–25)
To prove that the lifted inequality (6–22) is facet-defining, we show that the proposed
coefficients (αi, βi) satisfy (6–24) and (6–25) and describe two points (xi, yi) for which
(6–23) is satisfied at equality. First, consider the case where (αi, βi) = (LC(ai + bi), 0).
Condition (6–24) is satisfied since βi = 0 and LC(0) = 0. Condition (6–25) also holds
because αi = LC(ai + bi) and LC(w) is non-decreasing. Further, (6–23) is satisfied at
equality at the two points, (0, φ) for some 0 < φ < 1 and (1, 1). Second, consider the case
where (αi, βi) = (LC(bi), LC(ai + bi)− LC(bi)) with qi = ri. Since LC(ai + bi) ≥ LC(bi) and
LC(w) is nondecreasing, we have that βi ≥ 0. Further, since LC(0) = 0, (6–24) is satisfied.
Using

LC(ai φ + bi) ≤ LC(bi) + [(LC(ai + bi) − LC(bi)) / ai] (ai φ + bi − bi) = αi + βi φ,
it is clear that (6–25) holds. It can be easily verified that (6–23) is satisfied at equality at
the two points (1, 0) and (1, 1). Finally, consider the case
(αi, βi) = ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) , [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai )

for qi < ri. Clearly, (αi, βi) satisfies (6–24) since βi ≥ 0. From Lemma 6.4, we have that

LC(ai φ + bi) ≤ LC(Q^i_j) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (ai φ + bi − Q^i_j)
             = ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) ) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai φ
             = αi + βi φ.
We can also verify that (6–25) is satisfied at equality at the two points (1, (Q^i_j − bi)/ai) and
(1, (Q^i_{j+1} − bi)/ai). Therefore, we conclude that (6–22) is facet-defining for PS.
Note that the family of generalized lifted bilinear cover inequalities (6–22) has an
exponential number of members similar to lifted bilinear cover inequalities (5–47). This is
illustrated in Example 6.2.
Example 6.2. In Example 6.1, we obtained that (6–19) is facet-defining for PS(M, ∅;M, ∅, ∅).
Applying Theorem 6.1, we obtain the six inequalities

{ 15x1, or 10x1 + (45/8)y1 } + { 14x2, or 5x2 + 14y2, or (70/9)x2 + (56/9)y2 } + 15y3 + 10x4 + 5x5 + 6y6 ≥ 15,

which are all facet-defining for PS. We obtain from Figure 6-2 that there are two choices
for the lifting coefficients of (x1, y1). Similarly, we can determine that there are three
choices for (x2, y2).
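Condition (6–25) for the six inequalities above can be verified numerically. The following Python sketch is illustrative only; LC_ex hard-codes the lifting function of Example 6.1, and the check confirms that every listed coefficient pair (αi, βi) overestimates LC(ai φ + bi) on a grid of φ ∈ [0, 1]:

```python
def LC_ex(w):
    """Lifting function LC(w) of Example 6.1 (mu_bar = 5, E1 = 15, E2 = 25)."""
    if 0 <= w < 10:
        return float(w)
    if w < 15:
        return 10.0
    if w < 20:
        return w - 5.0
    return 15.0

# (a_i, b_i) -> coefficient choices (alpha_i, beta_i) listed in Example 6.2
choices = {
    (9.0, 12.0): [(15.0, 0.0), (10.0, 45.0 / 8)],
    (14.0, 5.0): [(14.0, 0.0), (5.0, 14.0), (70.0 / 9, 56.0 / 9)],
}
violated = 0
for (a, b), pairs in choices.items():
    for alpha, beta in pairs:
        for k in range(101):                     # grid of phi values in [0, 1]
            phi = k / 100.0
            if alpha + beta * phi < LC_ex(a * phi + b) - 1e-9:
                violated += 1
assert violated == 0                             # all choices satisfy (6-25)
```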
6.2.2 Generalized Lifted Reverse Bilinear Cover Inequalities
In Section 6.2.1, we obtained a generalized lifted bilinear cover inequality by first
lifting the variables in C \ {l} and then lifting the remaining variables in M inside of
(6–11). In this section, we derive another family of lifted inequalities by changing the
lifting order. We lift (6–11) with respect to the variables (xj, yj) for j ∈ M first before
the variables (xj, yj) for j ∈ C \ {l}. Among the assumptions concerning the partition
(C,M, T ), (A2) can be changed into

(A2’) Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj for some l ∈ C where Al = al + bl.

This is less stringent than (A2) since l can be chosen to be any element in C
satisfying Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj and not only the largest one.
Proposition 6.6. Under Assumptions (A1), (A2’), and (A3),

(Al − µ̄) xl + ∑_{j∈M} min{aj + bj , Al − µ̄} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–26)

is facet-defining for PS(∅, C \ {l}; ∅, C \ {l}, ∅).
Proof. From Proposition 6.3, we know that

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄

is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅). Since LI(w) is subadditive over R+ as
shown in Lemma 6.3, lifting coefficients (αi, βi) of variables (xi, yi) for i ∈ M are valid if
they satisfy the condition:

αi xi + βi yi ≥ LI(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–27)

Condition (6–27) can be rewritten as:

βi φ ≥ LI(0) for 0 < φ ≤ 1, (6–28)

αi + βi φ ≥ LI(ai φ + bi) for 0 ≤ φ ≤ 1. (6–29)
We now show that (αi, βi) = (min{ai + bi, Al − µ̄}, 0) are valid lifting coefficients for the
variables (xi, yi). Since LI(0) = 0, (6–28) is trivially satisfied with βi = 0. Further, since
LI(ai φ + bi) = min{ai φ + bi, Al − µ̄} ≤ min{ai + bi, Al − µ̄} = αi, (6–29) also holds.
To prove that (6–26) is facet-defining, consider the two points (1, 1) and (0, φ∗) for any
0 < φ∗ < 1. These points satisfy (6–27) at equality, and therefore, we conclude that (6–26)
is facet-defining for PS(∅, C \ {l}; ∅, C \ {l}, ∅).
To obtain facet-defining inequalities for PS, we next reintroduce the remaining
variables (xj, yj) for j ∈ C \ {l} in (6–26). To this end, we derive a closed form expression
for the function

LM(w) := min { (Al − µ̄)(xl − 1) + ∑_{j∈M} min{aj + bj , Al − µ̄} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. al xl yl + bl xl + ∑_{j∈M∪T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ Al − µ + w
              xj ∈ {0, 1} j ∈ {l} ∪ J \ C, yj ∈ [0, 1] j ∈ ({l} ∪ J \ C) ∪ I.
Let M = M1 ∪ M2 where M1 = {i ∈ M | ai + bi > Al − µ} and M2 = M \ M1. Assume without loss of generality that {l} ∪ M1 = {1, . . . , q} and a1 + b1 ≥ a2 + b2 ≥ . . . ≥ aq + bq, where q = |M1| + 1. Further, define E0 = 0 and Ei = ∑_{j=1}^{i} (aj + bj) for all i = 1, . . . , q. Note that Al + ∑_{j∈M∪T} (aj + bj) = Eq + ∑_{j∈M2} (aj + bj) + ∑_{j∈T} (aj + bj).
Proposition 6.7. LM(w) is given by:

LM(w) =
  −(Al − µ̄)               if w < −(Al − µ̄),
  w − Ei + i(Al − µ̄)       if Ei − (Al − µ̄) ≤ w < Ei,
  i(Al − µ̄)                if Ei ≤ w < Ei+1 − (Al − µ̄),
  w − Eq + q(Al − µ̄)       if Eq − (Al − µ̄) ≤ w ≤ ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d,
  ∞                        if ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d < w,

for i = 0, . . . , q − 1, where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj.
Proof. First, we observe that there exists an optimal solution (x∗, y∗) in which x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ {l} ∪ M ∪ I1, since the corresponding objective coefficients are zero. Using the same notation a = ∑_{j∈T} aj + ∑_{j∈I0} aj and ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj)/a as in the proof of Proposition 5.19, as well as the definition of M1, we can simplify LM(w) as:

LM(w) = min { ∑_{j∈{l}∪M1} (Al − µ)xj + ∑_{j∈M2} (aj + bj)xj + a ȳ } − (Al − µ)
    s.t. ∑_{j∈{l}∪M1} (aj + bj)xj + ∑_{j∈M2} (aj + bj)xj + a ȳ ≥ Al − µ + w,
         xj ∈ {0, 1} for j ∈ {l} ∪ M1 ∪ M2,   ȳ ∈ [0, 1].
Note that a = ∑_{j∈T} aj + ∑_{j∈I0} aj > Al − µ ≥ ai + bi for all i ∈ M2 because of Assumption (A3) and the definition of M2. Further, by introducing the additional notation ā = ∑_{j∈M2} (aj + bj) + a and ỹ = (∑_{j∈M2} (aj + bj)xj + a ȳ)/ā, LM(w) can be written as:

LM(w) = min { ∑_{j∈{l}∪M1} (Al − µ)xj + ā ỹ } − (Al − µ)
    s.t. ∑_{j∈{l}∪M1} (aj + bj)xj + ā ỹ ≥ Al − µ + w,
         xj ∈ {0, 1} for j ∈ {l} ∪ M1,   ỹ ∈ [0, 1].

Since this problem has the same structure as (5–57), its optimal value can be computed using the same technique. The result then follows from the proof of Proposition 5.19.
It can be verified from Proposition 5.20 that LM(w) is superadditive.

Lemma 6.5. The function LM(w) is superadditive over [0, ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d].
Therefore, we can use the sequence-independent lifting technique for the remaining variables (xi, yi) for i ∈ C \ {l}.

Theorem 6.2. Under Assumptions (A1), (A2'), and (A3), the generalized lifted reverse bilinear cover inequality

(Al − µ)xl + ∑_{i∈C\{l}} LM(ai + bi) xi + ∑_{j∈M} min{aj + bj, Al − µ} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ + ∑_{i∈C\{l}} LM(ai + bi)   (6–30)

is facet-defining for PS.
Proof. Since LM(w) is superadditive over w ∈ [0, ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d], valid lifting coefficients (αi, βi) for the variables (xi, yi) where i ∈ C \ {l} can be obtained if they satisfy the condition

αi(1 − xi) + βi(1 − yi) ≤ LM((ai + bi) − (ai xi yi + bi xi))   for (xi, yi) ∈ {0, 1} × [0, 1] \ {(1, 1)}.   (6–31)

Condition (6–31) can be rewritten as:

βi ≤ inf_{0≤φ<1} LM(ai − ai φ)/(1 − φ),   (6–32)
αi + sup_{0≤φ≤1} βi(1 − φ) ≤ LM(ai + bi).   (6–33)

Since PS is assumed to be full-dimensional, we obtain from Assumption 6.1 that ai + bi ≤ ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d for all i ∈ C. We next verify that (LM(ai + bi), 0) are valid lifting coefficients for (xi, yi) where i ∈ C \ {l}. Clearly, βi = 0 satisfies (6–32) since LM(w) ≥ 0 for w ≥ 0. Further, if αi = LM(ai + bi), then (6–33) is satisfied since αi + sup_{0≤φ≤1} βi(1 − φ) = αi = LM(ai + bi). Finally, since (6–31) is tight at the two points (0, 0) and (1, (ai − A1 + Al − µ)+/ai), we conclude that (6–30) is facet-defining for PS.
6.3 Preliminary Computational Study

We now perform a preliminary computational study to evaluate the strength of the lifted inequalities developed in Chapter 5 inside a branch-and-cut algorithm. In Section 6.3.1, we describe the testing environment, including software and hardware configurations. In Section 6.3.2, we describe the instances on which we carry out the empirical study. We then present implementation details about separation procedures and performance measures in Section 6.3.3. Finally, in Section 6.3.4, we report numerical results showing that our lifted inequalities can help solve families of bilinear problems faster.
6.3.1 Computational Environments

We implement a branch-and-cut algorithm using CPLEX [40] 11.1. CPLEX is one of the most widely used commercial MIP solvers. It provides callable libraries that allow users to customize cut generation inside the branch-and-bound tree. All computational tests are carried out on the server iseunix.ise.ufl.edu, running Red Hat Linux 5 on a Dell PowerEdge 2600 with two 3.2 GHz Pentium 4 processors (1 MB cache) and 6 GB of memory.
6.3.2 Testing Instances

As a preliminary computational study, we evaluate how our lifted inequalities perform on problems containing mostly bilinear covering constraints. In particular, we randomly generate MINLP problems that minimize a linear objective subject to independent bilinear covering constraints. These constraints are coupled with cardinality constraints. As a result of their construction, these instances are sparse. More precisely, the testing instances are formulated as

      min  ∑_{i∈M} ∑_{j∈N} fij xij + ∑_{i∈M} ∑_{j∈N} gij yij
      s.t. ∑_{j∈N} aij xij yij ≥ di,   i ∈ M,
(B)        ∑_{i∈M} xij ≤ cj,   j ∈ N,
           xij ∈ {0, 1},   i ∈ M, j ∈ N,
           yij ∈ [0, 1],   i ∈ M, j ∈ N,

where M := {1, . . . , m} and N := {1, . . . , n}. The parameters aij for all i ∈ M and j ∈ N are nonnegative integers randomly generated from uniform distributions. Parameters di for i ∈ M are created by multiplying ∑_{j∈N} aij by a random number from the interval [l, u] where 0 < l < u < 1, as presented in Table 6-1. Parameters cj for j ∈ N are similarly chosen as multiples of |M| with random numbers in the interval [l, u]. Infeasible instances are discarded. The coefficients fij and gij of the objective are chosen to be proportional to aij.
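The generation scheme described above can be sketched as follows (a Python sketch; the rounding choices and the treatment of range endpoints are our assumptions based on Table 6-1, not the thesis's actual generator, and feasibility of the resulting instance is not checked here):

```python
import random

def generate_instance(m, n, a_range, d_frac, c_frac, f_ratio, g_ratio, seed=0):
    """Generate one random instance of problem (B) following the scheme of
    Section 6.3.2; all range arguments are (low, high) pairs as in Table 6-1.
    Infeasible instances would be discarded by the caller, as in the thesis."""
    rng = random.Random(seed)
    M, N = range(m), range(n)
    a = [[rng.randint(*a_range) for _ in N] for _ in M]            # nonnegative integers
    d = [sum(a[i]) * rng.uniform(*d_frac) for i in M]              # d_i = frac * sum_j a_ij
    c = [max(1, round(m * rng.uniform(*c_frac))) for _ in N]       # c_j = frac * |M|
    f = [[a[i][j] * rng.uniform(*f_ratio) for j in N] for i in M]  # f_ij proportional to a_ij
    g = [[a[i][j] * rng.uniform(*g_ratio) for j in N] for i in M]  # g_ij proportional to a_ij
    return a, d, c, f, g

# Parameters of test set I-20-10 (Table 6-1).
a, d, c, f, g = generate_instance(20, 10, (5, 100), (0.3, 0.99),
                                  (0.3, 0.7), (0.5, 3.0), (0.7, 1.3))
```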
We create three sets of instances by changing the size of the sets M and N as well as the ranges of the parameters. For each parameter setting, we generate 10 instances. We specify the parameters used in the generation of random instances in Table 6-1.

Table 6-1. Parameters of the random instances for three test sets
TestSetID   |M|   |N|   aij        di/∑_{j∈N} aij   cj/|M|        fij/aij      gij/aij
I-20-10      20    10   [5, 100]   [0.3, 0.99]      [0.3, 0.7]    [0.5, 3.0]   [0.7, 1.3]
I-50-15      50    15   [10, 70]   [0.3, 0.99]      [0.3, 0.75]   [0.5, 3.0]   [0.7, 1.3]
I-100-20    100    20   [10, 70]   [0.4, 0.95]      [0.3, 0.8]    [0.5, 3.0]   [0.7, 1.3]
For these instances, we compare the results obtained by first linearizing the bilinear constraints and then solving the problem with CPLEX using default MIP cuts with the results obtained by adding our cuts to the linearization and then calling CPLEX. In particular, the linearization of problem (B) is derived by introducing auxiliary variables wij to represent the products xij yij. The resulting linearization is:

       min  ∑_{i∈M} ∑_{j∈N} fij xij + ∑_{i∈M} ∑_{j∈N} gij yij
       s.t. ∑_{j∈N} aij wij ≥ di,   i ∈ M,
            ∑_{i∈M} xij ≤ cj,   j ∈ N,
            xij − wij ≥ 0,   i ∈ M, j ∈ N,
(LB)        yij − wij ≥ 0,   i ∈ M, j ∈ N,
            xij + yij − wij ≤ 1,   i ∈ M, j ∈ N,
            xij ∈ {0, 1},   i ∈ M, j ∈ N,
            yij ∈ [0, 1],   i ∈ M, j ∈ N,
            wij ≥ 0,   i ∈ M, j ∈ N.
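The exactness of this linearization for integral xij can be checked directly: the three constraints linking wij to xij and yij are the standard McCormick inequalities for a product over [0, 1]², and they pin wij to xij yij whenever xij is binary. A small sketch (plain Python, enumerating a grid of y values):

```python
# For x in {0,1} and y in [0,1], the (LB) constraints
#   w <= x,  w <= y,  w >= x + y - 1,  w >= 0
# leave a single feasible value w = x * y, so the linearization is exact
# once x is integral. We verify this on a grid of y values.

def feasible_w_interval(x, y):
    lo = max(0.0, x + y - 1.0)   # w >= 0 and w >= x + y - 1
    hi = min(float(x), y)        # w <= x and w <= y
    return lo, hi

for x in (0, 1):
    for k in range(101):
        y = k / 100
        lo, hi = feasible_w_interval(x, y)
        assert abs(lo - x * y) < 1e-12 and abs(hi - x * y) < 1e-12
print("McCormick linearization is exact for binary x")
```

For fractional x the interval is not a single point, which is why (LB) is only a relaxation of (B) at the LP level.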
Table 6-2. Characteristics of the three test sets
TestSetID   Bin    Cont   Bil   Card   NZ      MILP        LP          Gap(%)
I-20-10     200    400    20    10     1800    12242.44    11777.84    3.77
I-50-15     800    1600   50    15     6750    46823.60    46188.26    1.35
I-100-20    2000   4000   100   20     18000   140230.71   139019.64   0.86

In Table 6-2, we summarize the characteristics of (LB). Columns Bin and Cont contain the numbers of binary and continuous variables in (LB), while columns Bil and Card give the numbers of bilinear and cardinality constraints, respectively. Column NZ shows the number of nonzero elements in the formulation. Column MILP presents the average of the IP optimal values over the 10 instances corresponding to each parameter setting. Column LP gives the average optimal value of the LP relaxations of (LB). The gap between these two values is presented in Column Gap, where

Gap(%) = (MILP − LP)/MILP × 100.

This measure describes how close the LP relaxation is to the optimal solution. The optimal values for each individual instance are presented in Table 6-3.
6.3.3 Separation Procedures

As a preliminary test, we implement a cut-and-branch algorithm that adds cuts only at the root node. Among the three families of lifted inequalities developed in Chapter 5, only the lifted bilinear cover inequalities (5–47) are used as cuts. We now describe separation procedures for the lifted bilinear cover inequalities (5–47). We first observe that, in the bilinear cover inequalities

∑_{j∈C} (aj − µ)+ xj + ∑_{j∈T} aj yj + ∑_{i∈M} (αi xi + βi yi) ≥ ∑_{j∈C} (aj − µ)+,   (6–34)
Table 6-3. Objective values for the test instances
Instance ID MILP LP Gap(%)
I-20-10-0 12110.9195 11640.4189 3.8849
I-20-10-1 10727.5900 10434.0135 2.7366
I-20-10-2 11443.7508 11040.3807 3.5248
I-20-10-3 11714.3205 11471.0029 2.0771
I-20-10-4 12987.9883 12370.2148 4.7565
I-20-10-5 15076.8320 14610.3839 3.0938
I-20-10-6 10039.8013 9613.3069 4.2480
I-20-10-7 11516.0063 11114.4414 3.4870
I-20-10-8 15278.2189 14510.5942 5.0243
I-20-10-9 11529.0013 10973.6524 4.8170
I-50-15-0 49114.5050 48345.9494 1.5648
I-50-15-1 43482.4190 42981.5691 1.1518
I-50-15-2 43201.3765 42661.1454 1.2505
I-50-15-3 44608.6242 44041.2446 1.2719
I-50-15-4 47805.6008 47088.4489 1.5001
I-50-15-5 46348.5055 45715.3331 1.3661
I-50-15-6 46776.2481 45870.5232 1.9363
I-50-15-7 50821.3106 50206.1345 1.2105
I-50-15-8 49815.5893 49087.5678 1.4614
I-50-15-9 46261.8475 45884.6801 0.8153
I-100-20-0 137653.9675 136534.7575 0.8131
I-100-20-1 134121.9302 133153.2319 0.7223
I-100-20-2 139677.9045 138565.7847 0.7962
I-100-20-3 130580.6226 129397.9127 0.9057
I-100-20-4 136384.8079 135383.4821 0.7342
I-100-20-5 144012.8562 142557.8922 1.0103
I-100-20-6 138801.7630 137594.3403 0.8699
I-100-20-7 148099.1275 146786.2270 0.8865
I-100-20-8 148704.6629 147200.3903 1.0116
I-100-20-9 144269.4161 143022.4053 0.8644
the coefficients (αi, βi) depend strongly on the partition (C, M, T). Therefore, it is difficult to express the problem of finding a most violated lifted bilinear cover inequality as a simple optimization problem.

We observe however that separating the inequality

∑_{j∈C} (aj − µ)+ xj + ∑_{j∈T} aj yj ≥ ∑_{j∈C} (aj − µ)+,   (6–35)

which can be lifted to (6–34), is easier. In particular, we can write the separation problem for (6–35) as

min  ∑_{j∈N} (aj − µ)+ (x∗j − 1) ξj + ∑_{j∈N} aj y∗j ζj
s.t. ∑_{j∈N} aj ξj = d + µ,                        (6–36)
     ai ≥ µ ξi + ε,                  ∀i ∈ N,       (6–37)
     ∑_{j∈N} aj ζj ≥ ai ξi − µ + ε,  ∀i ∈ N,
     ξj + ζj ≤ 1,                    ∀j ∈ N,
     ξj, ζj ∈ {0, 1},                ∀j ∈ N,

where ε > 0 is a sufficiently small constant and the variables ξj and ζj are defined as

ξj = 1 if j ∈ C and 0 otherwise,   ζj = 1 if j ∈ T and 0 otherwise.

This optimization problem is similar to a 0−1 knapsack problem. Although the 0−1 knapsack problem is NP-hard, it can often be solved efficiently with dynamic programming (DP) techniques. However, for separation purposes, using a DP is usually out of the question since the data in MILPs is often fractional and leads to large DP tables. Therefore, we solve this separation problem using a heuristic approach, as is common in MIP.
Here, we adapt the coefficient-independent cover generation scheme proposed by Gu et al. [60], since it is known to be computationally efficient in the practical separation of 0−1 cover inequalities. The basic idea is to first select the variables with the lowest LP values for inclusion in the cover C. Assume that a fractional solution x∗ to the LP relaxation satisfies x∗_{l1} ≤ . . . ≤ x∗_{ln}. Then, a violated cover C is obtained as C := {l1, . . . , lc} where c = argmin{k | ∑_{j=1}^{k} a_{lj} > d}.
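The cover selection step can be sketched as follows (Python; variable names and tie-breaking details are our assumptions, not the thesis's exact implementation):

```python
def generate_cover(a, x_star, d):
    """Coefficient-independent cover heuristic in the spirit of Gu et al. [60]:
    scan variables in order of increasing LP value x*_j and add them to the
    cover until their total weight exceeds d."""
    order = sorted(range(len(a)), key=lambda j: x_star[j])
    cover, weight = [], 0.0
    for j in order:
        cover.append(j)
        weight += a[j]
        if weight > d:          # c = argmin{k | sum of first k weights > d}
            return set(cover)
    return None                 # no cover exists (sum of all a_j <= d)

# Hypothetical data: weights a_j, fractional LP solution x*, demand d.
a = [19, 17, 15, 10]
x_star = [0.9, 0.2, 0.4, 0.1]
print(sorted(generate_cover(a, x_star, d=20)))  # [1, 3]
```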
After an initial cover C has been obtained using the heuristic described above, we check whether C satisfies Assumption (A2) in Section 5.3.2. If it does not, we swap an element of the cover with one chosen from N \ C, in order of lowest LP value, until (A2) is satisfied. We next determine the sets T and M. To this end, we compute all the possible coefficients (αi, βi) of a variable pair (xi, yi) using the results of Theorem 5.1. When there are several choices for (αi, βi), we select the values (α∗i, β∗i) that lead to the largest violation for the current LP solution, i.e.,

α∗i x∗i + β∗i y∗i = min { αi x∗i + βi y∗i :
    (αi, βi) ∈ {(PC(ai), 0)} ∪ ⋃_{j=1}^{qi} { ( PC(Qi_j) − ((PC(Qi_{j+1}) − PC(Qi_j))/a_{j+1}) Qi_j ,  ((PC(Qi_{j+1}) − PC(Qi_j))/a_{j+1}) ai ) } }.

Then, we partition N \ C into (M, T) using a simple comparison rule similar to that used in Gu et al. [61], i.e.,

T = { i ∈ N \ C | ai y∗i ≤ α∗i x∗i + β∗i y∗i }.

After obtaining an initial partition (C, M, T) with this procedure, we check whether (C, M, T) satisfies Assumption (A3) in Section 5.3.2. If not, we select an element from the set M and move it to the set T until (A3) is satisfied.
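Given the selected coefficients (α∗i, β∗i), the initial split of N \ C can be sketched as follows (Python; the data is hypothetical, and the (A3) repair loop described above is omitted):

```python
def partition_rest(rest, a, x_star, y_star, alpha, beta):
    """Split N \\ C into (M, T) with the comparison rule
    T = { i : a_i * y*_i <= alpha*_i * x*_i + beta*_i * y*_i }."""
    T = {i for i in rest
         if a[i] * y_star[i] <= alpha[i] * x_star[i] + beta[i] * y_star[i]}
    return rest - T, T          # (M, T)

# Hypothetical data for three variables outside the cover.
rest = {0, 1, 2}
a = {0: 19, 1: 15, 2: 10}
x_star = {0: 1.0, 1: 0.5, 2: 0.2}
y_star = {0: 0.1, 1: 1.0, 2: 0.5}
alpha = {0: 9.0, 1: 9.0, 2: 9.0}
beta = {0: 2.0, 1: 2.0, 2: 2.0}
M, T = partition_rest(rest, a, x_star, y_star, alpha, beta)
```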
6.3.4 Numerical Results

In this section, we present computational results obtained using the lifted bilinear cover inequalities (5–47) inside a cut-and-branch algorithm. We compare the performance of the lifted cuts on the three families of instances described in Table 6-1. In our implementation, a single round of cuts is added at the root node.

In Tables 6-4, 6-5, and 6-6, we present preliminary results obtained on our randomly generated instances. Columns MILP and Nodes show the optimal values and the number of nodes in the tree when solving the MILP problem using CPLEX with its default settings. Column LP shows the optimal value of the LP relaxation at the root node. Column Cuts reports the number of cuts added at the root node. Columns LPCuts and CNodes report the optimal LP value and the number of nodes in the tree after adding our cuts to the formulation. Column Gap Imp., computed as

Gap Imp.(%) = (LPCuts − LP)/(MILP − LP) × 100,

measures how much the added cuts improve the root bound, and Column Node Red., computed as

Node Red.(%) = (Nodes − CNodes)/Nodes × 100,

measures the reduction in the number of nodes in the tree after adding cuts. We observe that the lifted cuts typically help improve the bound of the LP relaxation at the root node, although the gap improvement is modest on some instances. Further, we observe that the average number of nodes in the tree is reduced when adding the lifted cuts.
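The two measures can be computed directly from the table columns. The sketch below (Python) reproduces the Gap Imp. and Node Red. entries for instance I-20-10-0 of Table 6-4 from its rounded column values:

```python
def gap_improvement(milp, lp, lp_cuts):
    """Gap Imp.(%) = (LPCuts - LP) / (MILP - LP) * 100."""
    return (lp_cuts - lp) / (milp - lp) * 100.0

def node_reduction(nodes, cnodes):
    """Node Red.(%) = (Nodes - CNodes) / Nodes * 100."""
    return (nodes - cnodes) / nodes * 100.0

# Row I-20-10-0 of Table 6-4 (values rounded as printed).
print(round(gap_improvement(12110.92, 11640.42, 11651.72), 2))  # close to the
print(round(node_reduction(430, 353), 4))                       # table entries
```

The small residual difference in the gap entry comes from using the rounded table values rather than the unrounded optimal values.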
6.4 Concluding Remarks

In this chapter, we first generalized the lifted inequalities obtained in Chapter 5 to general 0−1 bilinear covering sets that have additional linear terms. We then evaluated the practical impact of the lifted inequalities for 0−1 mixed-integer bilinear covering sets inside a cut-and-branch framework. To this end, we described heuristic separation
Table 6-4. Performance of lifted cuts on small size instances
Instance ID   MILP       Nodes   LP         Cuts   LPCuts     CNodes   Gap Imp.(%)   Node Red.(%)
I-20-10-0     12110.92   430     11640.42   12     11651.72   353      2.4012        17.9070
I-20-10-1     10727.59   0       10434.01   14     10437.80   0        1.2904        N/A
I-20-10-2     11443.75   233     11040.38   14     11055.03   91       3.6329        60.9442
I-20-10-3     11714.32   23      11471.00   11     11471.00   37       0.0000        -60.8696
I-20-10-4     12987.99   573     12370.21   12     12370.21   531      0.0000        7.3298
I-20-10-5     15076.83   576     14610.38   15     14633.53   505      4.9616        12.3264
I-20-10-6     10039.80   530     9613.31    16     9613.56    506      0.0598        4.5283
I-20-10-7     11516.01   62      11114.44   15     11132.76   41       4.5626        33.8710
I-20-10-8     15278.22   278     14510.59   13     14510.59   291      0.0000        -4.6763
I-20-10-9     11529.00   530     10973.65   12     10973.65   499      0.0000        5.8491
Average       12242.44   323.5   11777.84   13.4   11784.99   285.4    1.6908        8.5789
Table 6-5. Performance of lifted cuts on medium size instances
Instance ID   MILP       Nodes   LP         Cuts   LPCuts     CNodes   Gap Imp.(%)   Node Red.(%)
I-50-15-0     49114.51   535     48345.95   32     48345.95   527      0.0000        1.4953
I-50-15-1     43482.42   710     42981.57   31     42981.57   526      0.0000        25.9155
I-50-15-2     43201.38   535     42661.15   31     42661.15   526      0.0000        1.6822
I-50-15-3     44608.62   535     44041.24   37     44041.24   532      0.0000        0.5607
I-50-15-4     47805.60   545     47088.45   27     47088.45   531      0.0000        2.5688
I-50-15-5     46348.51   542     45715.33   30     45715.33   528      0.0000        2.5830
I-50-15-6     46776.25   535     45870.52   32     45870.52   527      0.0000        1.4953
I-50-15-7     50821.31   535     50206.13   29     50206.13   524      0.0000        2.0561
I-50-15-8     49815.59   535     49087.57   34     49087.57   529      0.0000        1.1215
I-50-15-9     46261.85   535     45884.68   29     45884.68   524      0.0000        2.0561
Average       46823.60   554.2   46188.26   31.2   46188.26   527.4    0.0000        4.1535
Table 6-6. Performance of lifted cuts on large size instances
Instance ID   MILP        Nodes   LP          Cuts   LPCuts      CNodes   Gap Imp.(%)   Node Red.(%)
I-100-20-0    137653.97   772     136534.76   58     136534.76   561      0.0000        27.3316
I-100-20-1    134121.93   568     133153.23   67     133153.23   510      0.0000        10.2113
I-100-20-2    139677.90   532     138565.78   56     138565.78   505      0.0000        5.0752
I-100-20-3    130580.62   836     129397.91   62     129397.91   499      0.0000        40.3110
I-100-20-4    136384.81   867     135383.48   66     135383.48   572      0.0000        34.0254
I-100-20-5    144012.86   455     142557.89   61     142557.89   476      0.0000        -4.6154
I-100-20-6    138801.76   855     137594.34   64     137594.34   712      0.0000        16.7251
I-100-20-7    148099.13   511     146786.23   70     146787.90   578      0.1272        -13.1115
I-100-20-8    148704.66   525     147200.39   61     147200.39   546      0.0000        -4.0000
I-100-20-9    144269.42   455     143022.41   60     143022.41   494      0.0000        -8.5714
Average       140230.71   637.6   139019.64   62.5   139019.81   545.3    0.0127        10.3381
procedures for the inequalities that we developed and presented computational results
using these procedures.
CHAPTER 7
CONCLUSIONS AND FUTURE RESEARCH
In this thesis, we study two tools to improve current convexification methods in MINLP. We develop a convexification tool that characterizes the convex hull of nonlinear sets whose convex hulls are completely determined by their orthogonal disjunctions. In particular, we apply this tool to obtain the convex hulls of various bilinear covering sets that appear as relaxations of MINLP problems. To handle the bounds on variables, we study how lifting techniques can be used to derive strong valid inequalities for 0−1 mixed-integer bilinear covering sets. Finally, we perform a computational study to show that our lifted inequalities have the potential to improve the performance of solution methods for MINLP problems. In Section 7.1, we summarize our research contributions and their practical impact in solving MINLPs. In Section 7.2, we describe possible avenues of future research.
7.1 Summary of Contributions
We summarize the major contributions of this thesis into three parts.
First, we derive a closed-form description for the convex hulls of nonlinear sets whose convex hulls are completely determined by their restrictions over orthogonal subspaces. While our convexification tool was developed using disjunctive programming and convex extensions, it differs from prior approaches in that it does not introduce auxiliary variables. We provide a toolbox of results to verify the technical assumptions under which this convexification tool can be used. We also apply this tool to derive the split cut for mixed-integer programs. We then develop a fundamental result that extends the applicability of the convexification tool to relaxing nonconvex constraints by providing sufficient conditions for establishing the convex extension property. We illustrate how this result can be used to derive the convex hull of a continuous bilinear covering set over the non-negative orthant.
Second, we study 0−1 mixed-integer bilinear covering sets where the variables have
upper bounds. We show that these sets are polyhedral and provide characterizations of
their trivial facets. We then obtain a complete linear description of the convex hull of
these sets when they are defined by only two pairs of variables. Next, we derive three
families of facet-defining inequalities via sequence-independent lifting techniques. These
families are developed using the concept of a cover which is common in the integer
programming literature and selecting different lifting orders. We then show that 0−1
mixed-integer bilinear covering sets have polyhedral structures that are similar to those of
certain single-node flow sets. As a result, we obtain new facet-defining inequalities for flow
sets that generalize classical lifted flow cover inequalities.
Third, we consider the use of the lifted inequalities we derive for 0−1 mixed-integer
bilinear covering sets in a cut-and-branch algorithm. First, we generalize our lifted
inequalities to be valid for bilinear covering sets with additional linear terms. We describe
separation procedures for lifted bilinear cover inequalities and provide computational
results on randomly generated instances.
7.2 Future Research
We conclude this thesis by presenting some potential directions for future research.
1. Extension of orthogonal disjunctions theory to other classes of problems: We applied
our convexification tool to bilinear covering sets without upper bounds on variables.
However, the applicability of our tool can be extended to more general problems
such as polynomial covering sets [117]. We could also investigate how to handle
the bounds of variables in convexification procedures using orthogonal disjunctions.
In addition, since orthogonal disjunctions are closely related to complementarity
constraints, we plan to apply our convexification tools to obtain strong convex
relaxations of mathematical programs with complementarity constraints (MPCC).
2. Application of lifting tool to general nonlinear problems: In Chapter 5, we studied
0−1 mixed-integer bilinear covering sets and derived facet-defining inequalities using
lifting. Since lifting can be applied to problems with integer variables, a possible
direction is to generalize our lifted inequalities for bilinear covering sets with 0−1
variables to bilinear covering sets with general integer variables. Further, while we
consider a single constraint in our lifting procedures, we could also consider multiple
constraints simultaneously to obtain strong valid inequalities [139]. For example, for
the mixed-integer set defined by a bilinear equality constraint, i.e.,
BM =
{(x, y) ∈ {0, 1}n × [0, 1]n
∣∣∣n∑
j=1
ajxjyj = d
},
we could combine the lifting results of Chapter 5 with those of [100].
3. Computational study of lifted cuts in branch-and-cut framework: We performed a
preliminary computational study on randomly generated instances in Chapter 6.
A natural extension of our work is to evaluate the performance of other families of
lifted inequalities and to perform an extensive empirical study of the use of lifted
inequalities on real instances. Further, since our lifted inequalities are generalizations
of classical lifted flow cover inequalities for single-node flow sets, we could also
evaluate whether our cuts can help improve the performance of the branch-and-cut
algorithm on MIPLIB instances [2].
APPENDIX A
LINEAR DESCRIPTION OF THE CONVEX HULL OF A BILINEAR SET

The linear description of conv(B) is obtained by PORTA as the following:

B = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19x1y1 + 17x2y2 + 15x3y3 + 10x4y4 ≥ 20 }
(1) 50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135
(2) 70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117
(3) 25x1 +65x3 +45x4 +76y1 +153y2 ≥ 110
(4) +50x2 +70x3 +35x4 +133y1 +34y2 ≥ 105
(5) +25x2 +45x3 +35x4 +133y1 +34y2 ≥ 80
(6) 21x1 +41x2 +27x4 +38y1 +135y3 ≥ 68
(7) 30x1 +35x2 +21x3 +19y1 +70y4 ≥ 56
(8) 18x1 +23x2 +21x3 +19y1 +70y4 ≥ 44
(9) 19x1 +17x2 +15y3 +10y4 ≥ 20
(10) 19x1 +15x3 +17y2 +10y4 ≥ 20
(11) 19x1 +10x4 +17y2 +15y3 ≥ 20
(12) 19x1 +17y2 +15y3 +10y4 ≥ 20
(13) +17x2 +15x3 +19y1 +10y4 ≥ 20
(14) +17x2 +10x4 +19y1 +15y3 ≥ 20
(15) +17x2 +19y1 +15y3 +10y4 ≥ 20
(16) +15x3 +10x4 +19y1 +17y2 ≥ 20
(17) +15x3 +19y1 +17y2 +10y4 ≥ 20
(18) +10x4 +19y1 +17y2 +15y3 ≥ 20
(19) +19y1 +17y2 +15y3 +10y4 ≥ 20
(20) 14x1 +10x3 +5x4 +17y2 ≥ 15
(21) +12x2 +10x3 +5x4 +19y1 ≥ 15
(22) +10x3 +5x4 +19y1 +17y2 ≥ 15
(23) 12x1 +10x2 +3x4 +15y3 ≥ 13
(24) +10x2 +10x3 +3x4 +19y1 ≥ 13
(25) +10x2 +3x4 +19y1 +15y3 ≥ 13
(26) 10x1 +10x2 +x4 +15y3 ≥ 11
(27) 10x1 +10x3 +x4 +17y2 ≥ 11
(28) 10x1 +x4 +17y2 +15y3 ≥ 11
(29) 7x1 +5x2 +3x3 +10y4 ≥ 8
(30) +5x2 +3x3 +19y1 +10y4 ≥ 8
(31) +5x2 +3x3 +5x4 +19y1 ≥ 8
(32) +3x2 +3x3 +3x4 +19y1 ≥ 6
(33) 5x1 +x3 +17y2 +10y4 ≥ 6
(34) 5x1 +5x2 +x3 +10y4 ≥ 6
(35) 5x1 +x3 +5x4 +17y2 ≥ 6
(36) 3x1 +x2 +15y3 +10y4 ≥ 4
(37) 3x1 +x2 +3x3 +10y4 ≥ 4
(38) 3x1 +x2 +3x4 +15y3 ≥ 4
(39) x1 +x2 +x3 +10y4 ≥ 2
(40) x1 +x2 +x4 +15y3 ≥ 2
(41) x1 +x3 +x4 +17y2 ≥ 2
(42) x1 +x2 +x3 +x4 ≥ 2
(43) x1 ≥ 0
(44) x2 ≥ 0
(45) x3 ≥ 0
(46) x4 ≥ 0
(47) y1 ≥ 0
(48) y2 ≥ 0
(49) y3 ≥ 0
(50) y4 ≥ 0
(51) y4 ≤ 1
(52) y3 ≤ 1
(53) y2 ≤ 1
(54) y1 ≤ 1
(55) x4 ≤ 1
(56) x3 ≤ 1
(57) x2 ≤ 1
(58) x1 ≤ 1
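Each inequality in this list can be spot-checked for validity by sampling feasible points of B. The sketch below (plain Python) does this for inequality (42), x1 + x2 + x3 + x4 ≥ 2, which is valid because no single term aj xj yj can reach the right-hand side 20 on its own.

```python
import itertools
import random

# Spot-check validity of inequality (42): x1 + x2 + x3 + x4 >= 2 on B.
a = [19, 17, 15, 10]
rng = random.Random(0)

def feasible(x, y):
    return sum(ai * xi * yi for ai, xi, yi in zip(a, x, y)) >= 20

count = 0
for x in itertools.product((0, 1), repeat=4):
    for _ in range(200):
        y = [rng.random() for _ in range(4)]
        if feasible(x, y):
            assert sum(x) >= 2        # inequality (42) holds at every sample
            count += 1
print(f"checked {count} feasible points")
```

Sampling only certifies that no counterexample was found; PORTA's output is, by construction, exactly valid.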
APPENDIX B
LINEAR DESCRIPTION OF THE CONVEX HULL OF A FLOW SET

The linear description of conv(F) is obtained by PORTA as the following:

F = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19y1 + 17y2 + 15y3 + 10y4 ≥ 20,  xj ≥ yj ∀j = 1, . . . , 4 }
(1) 50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135
(2) 70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117
(3) 25x1 +65x3 +45x4 +76y1 +153y2 ≥ 110
(4) +50x2 +70x3 +35x4 +133y1 +34y2 ≥ 105
(5) +25x2 +45x3 +35x4 +133y1 +34y2 ≥ 80
(6) 21x1 +41x2 +27x4 +38y1 +135y3 ≥ 68
(7) 30x1 +35x2 +21x3 +19y1 +70y4 ≥ 56
(8) 18x1 +23x2 +21x3 +19y1 +70y4 ≥ 44
(19) +19y1 +17y2 +15y3 +10y4 ≥ 20
(22) +10x3 +5x4 +19y1 +17y2 ≥ 15
(24) +10x2 +10x3 +3x4 +19y1 ≥ 13
(25) +10x2 +3x4 +19y1 +15y3 ≥ 13
(26) 10x1 +10x2 +x4 +15y3 ≥ 11
(27) 10x1 +10x3 +x4 +17y2 ≥ 11
(28) 10x1 +x4 +17y2 +15y3 ≥ 11
(30) +5x2 +3x3 +19y1 +10y4 ≥ 8
(31) +5x2 +3x3 +5x4 +19y1 ≥ 8
(32) +3x2 +3x3 +3x4 +19y1 ≥ 6
(33) 5x1 +x3 +17y2 +10y4 ≥ 6
(34) 5x1 +5x2 +x3 +10y4 ≥ 6
(35) 5x1 +x3 +5x4 +17y2 ≥ 6
(36) 3x1 +x2 +15y3 +10y4 ≥ 4
(37) 3x1 +x2 +3x3 +10y4 ≥ 4
(38) 3x1 +x2 +3x4 +15y3 ≥ 4
(39) x1 +x2 +x3 +10y4 ≥ 2
(40) x1 +x2 +x4 +15y3 ≥ 2
(41) x1 +x3 +x4 +17y2 ≥ 2
(42) x1 +x2 +x3 +x4 ≥ 2
(47) y1 ≥ 0
(48) y2 ≥ 0
(49) y3 ≥ 0
(50) y4 ≥ 0
(55) x4 ≤ 1
(56) x3 ≤ 1
(57) x2 ≤ 1
(58) x1 ≤ 1
(f1) x1 −y1 ≥ 0
(f2) x2 −y2 ≥ 0
(f3) x3 −y3 ≥ 0
(f4) x4 −y4 ≥ 0
REFERENCES

[1] Achterberg, T., T. Koch, A. Martin. 2005. Branching rules revisited. Operations Research Letters 33 42–54.

[2] Achterberg, Tobias, Thorsten Koch, Alexander Martin. 2006. MIPLIB 2003. Operations Research Letters 34 361–372.

[3] Adjiman, C. S., I. P. Androulakis, C. A. Floudas. 1998. A global optimization method, αBB, for general twice-differentiable constrained NLPs–II. Implementation and computational results. Computers and Chemical Engineering 22 1159–1179.

[4] Al-Khayyal, F. A., J. E. Falk. 1983. Jointly constrained biconvex programming. Mathematics of Operations Research 8 273–286.

[5] Androulakis, I. P., C. D. Maranas, C. A. Floudas. 1995. αBB: A global optimization method for general constrained nonconvex problems. Journal of Global Optimization 7 337–363.

[6] Atamturk, A. 2001. Flow pack facets of the single node fixed-charge flow polytope. Operations Research Letters 29 107–114.

[7] Atamturk, A. 2003. On the facets of the mixed-integer knapsack polyhedron. Mathematical Programming 98 145–175.

[8] Atamturk, A. 2004. Sequence independent lifting for mixed-integer programming. Operations Research 52 487–490.

[9] Atamturk, A. 2006. Strong formulations of robust mixed 0−1 programming. Mathematical Programming 108 235–250.

[10] Atamturk, A., V. Narayanan. 2007. Lifting for conic mixed-integer programming. Research Report BCOL.07.04, IEOR, University of California-Berkeley. Forthcoming in Mathematical Programming.

[11] Atamturk, A., D. Rajan. 2002. On splittable and unsplittable capacitated network design arc-set polyhedra. Mathematical Programming 92 315–333.

[12] Balas, E. 1971. Intersection cuts - a new type of cutting planes for integer programming. Operations Research 19 19–39.

[13] Balas, E. 1975. Disjunctive programming: Cutting planes from logical conditions. O. L. Mangasarian, R. R. Meyer, S. M. Robinson, eds., Nonlinear Programming. Academic Press, NY, 279–312.

[14] Balas, E. 1975. Facets of the knapsack polytope. Mathematical Programming 8 146–164.

[15] Balas, E. 1979. Disjunctive programming. Annals of Discrete Mathematics 5 3–51.

[16] Balas, E. 1985. Disjunctive programming and a hierarchy of relaxations for discrete optimization problems. SIAM Journal on Algebraic and Discrete Methods 6 466–486.

[17] Balas, E. 1998. Disjunctive programming: Properties of the convex hull of feasible points. Discrete Applied Mathematics 89(1-3) 3–44. Original manuscript was published as a technical report in 1974.

[18] Balas, E. 2005. Projection, lifting and extended formulation in integer and combinatorial optimization. Annals of Operations Research 140 125–161.

[19] Balas, E., A. Bockmayr, N. Pisaruk, L. Wolsey. 2004. On unions and dominants of polytopes. Mathematical Programming 99 223–239.

[20] Balas, E., S. Ceria, G. Cornuejols. 1993. A lift-and-project cutting plane algorithm for mixed 0−1 programs. Mathematical Programming 58 295–324.

[21] Balas, E., S. Ceria, G. Cornuejols. 1996. Mixed 0−1 programming by lift-and-project in a branch-and-cut framework. Management Science 42 1229–1246.

[22] Balas, E., M. Perregaard. 2003. A precise correspondence between lift-and-project cuts, simple disjunctive cuts, and mixed-integer gomory cuts for 0−1 programming. Mathematical Programming 94 221–245.

[23] Balas, E., E. Zemel. 1978. Facets of the knapsack polytope from minimal covers. SIAM Journal on Applied Mathematics 34 119–148.

[24] Bazaraa, M. S., H. D. Sherali, C. M. Shetty. 2006. Nonlinear Programming: Theory and Algorithms. 3rd ed. John Wiley & Sons, New York, NY.

[25] Belotti, P. 2009. Design of telecommunication networks with shared protection. Available at http://www.minlp.org/library/problem/index.php?i=51.

[26] Belotti, P., J. Lee, L. Liberti, F. Margot, A. Wachter. 2009. Branching and bounds tightening techniques for non-convex MINLP. Optimization Methods and Software 24 597–634. Available at http://www.optimization-online.org/DB_HTML/2008/08/2059.html.

[27] Biegler, L. T., I. E. Grossmann, A. W. Westerberg. 1997. Systematic Methods of Chemical Process Design. Prentice Hall, Upper Saddle River (NJ).

[28] Bland, R. G. 1977. New finite pivoting rule for the simplex method. Mathematics of Operations Research 2 103–107.

[29] Boyd, S., L. Vandenberghe. 2004. Convex Optimization. Cambridge University Press, Cambridge.

[30] Ceria, S., G. Cordier, H. Marchand, L. A. Wolsey. 1999. Cutting planes for integer programs with general integer variables. Mathematical Programming 81 201–214.

[31] Ceria, S., J. Soares. 1999. Convex programming for disjunctive convex optimization. Mathematical Programming 86A 595–614.

[32] Christof, T., A. Lobel. 1997. PORTA: POlyhedron Representation Transformation Algorithm. Available at http://www.zib.de/Optimization/Software/Porta/.

[33] Chung, K., J.-P. P. Richard, M. Tawarmalani. 2010. A computational study for lifted inequalities for 0−1 mixed-integer bilinear covering sets. Working paper.

[34] Chung, K., J.-P. P. Richard, M. Tawarmalani. 2010. Lifted inequalities for 0−1 mixed-integer bilinear covering sets. Working paper.

[35] Cook, S. A. 1971. The complexity of theorem-proving procedures. Proceedings of the Third Annual ACM Symposium on the Theory of Computing. ACM, New York, 151–158.

[36] Cook, W., R. Kannan, A. Schrijver. 1990. Chvatal closures for mixed integer programming problems. Mathematical Programming 47 155–174.

[37] Cornuejols, G. 2008. Valid inequalities for mixed integer linear programs. Mathematical Programming 112 3–44.

[38] Cornuejols, G., C. Lemarechal. 2006. A convex-analysis perspective on disjunctive cuts. Mathematical Programming 106 567–586.

[39] Cornuejols, G., R. Tutuncu. 2006. Optimization Methods in Finance. Cambridge University Press.

[40] CPLEX. 2007. CPLEX 11.1 User's Manual. ILOG Inc., Mountain View, CA.

[41] Crama, Y. 1993. Concave extensions for nonlinear 0−1 maximization problems. Mathematical Programming 61 53–60.

[42] Crowder, H. P., E. L. Johnson, M. W. Padberg. 1983. Solving large-scale zero-one linear programming problems. Operations Research 31 803–834.

[43] Dakin, R. J. 1965. A tree search algorithm for mixed-integer programming problems. Computer Journal 8 250–255.

[44] Dantzig, G. B. 1951. Maximization of a linear function of variables subject to linear inequalities. T.C. Koopmans, ed., Activity Analysis of Production and Allocation. Wiley N.Y., 339–347.

[45] Dantzig, G. B., R. Fulkerson, S. Johnson. 1954. Solution of a large-scale traveling salesman problem. Operations Research 2 393–410.

[46] de Farias, I. R., E. L. Johnson, G. L. Nemhauser. 2002. Facets of the complementarity knapsack polytope. Mathematics of Operations Research 27 210–226.

[47] Driebeek, N. J. 1965. An algorithm for the solution of mixed-integer programming problems. Management Science 12 576–587.

[48] Duran, M. A., I. E. Grossmann. 1986. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Mathematical Programming 36 307–339.

[49] Falk, J. E. 1969. Lagrange multipliers and nonconvex programs. SIAM Journal on Control 7 534–545.

[50] Falk, J. E., K. L. Hoffman. 1976. A successive underestimation method for concave minimization problems. Mathematics of Operations Research 1 251–259.

[51] Falk, J. E., R. M. Soland. 1969. An algorithm for separable nonconvex programming problems. Management Science 15 550–569.

[52] Fiacco, A. V., G. P. McCormick. 1990. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Society for Industrial and Applied Mathematics. First published in 1968 by Research Analysis Corporation.

[53] Floudas, C. A. 2001. Global optimization in design and control of chemical process systems. Journal of Process Control 10 125–134.

[54] Fourier, J. B. J. 1826. Solution d'une question particuliere du calcul des inegalites. Nouveau Bulletin des Sciences par la Societe Philomatique de Paris 317–319.

[55] Fukuda, K., T. M. Liebling, C. Lutolf. 2001. Extended convex hull. Computational Geometry 20 13–23.

[56] Garey, M. R., D. S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman.

[57] Gomory, R. E. 1958. Outline of an algorithm for integer solutions to linear programs. Bulletin of the American Mathematical Society 64 275–278.

[58] Gomory, R. E. 1969. Some polyhedra related to combinatorial problems. Linear Algebra and Its Applications 2 451–558.

[59] Grotschel, M., L. Lovasz, A. Schrijver. 1988. Geometric Algorithms and Combinatorial Optimization. Springer-Verlag, Berlin, Germany.

[60] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 1998. Lifted cover inequalities for 0−1 integer programs: Computation. INFORMS Journal on Computing 10 427–437.

[61] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 1999. Lifted flow cover inequalities for mixed 0−1 integer programs. Mathematical Programming 85 439–467.
[62] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 2000. Sequence independent liftingin mixed integer programming. Journal of Combinatorial Optimization 4 109–129.
205
[63] Hammer, P. L., E. L. Johnson, U. N. Peled. 1975. Facets of regular 0−1 polytopes.Mathematical Programming 8 179–206.
[64] Harjunkoski, I., T. Westerlund, R. Porn, H. Skrifvars. 1998. Differenttransformations for solving non-convex trim-loss problems by MINLP. EuropeanJournal of Operational Research 105 594–603.
[65] Hoffman, K. 1981. A method for globally minimizing concave functions over convexsets. Mathematical Programming 20 22–32.
[66] Horst, R. 1976. An algorithm for nonconvex programming problems. MathematicalProgramming 10 312–321.
[67] Horst, R., P. M. Pardalos. 1995. Handbook of Global Optimization. Kluwer AcademicPublishers.
[68] Horst, R., N. V. Thoai, H. Tuy. 1989. On an outer-approximation in globaloptimization. Optimization 20 255–264.
[69] Horst, R., H. Tuy. 1996. Global Optimization: Deterministic Approaches . Third ed.Springer Verlag, Berlin.
[70] Kallrath, J. 2005. Solving planning and design problems in the process industryusing mixed integer and global optimization. Annals of Operations Research 140339–373.
[71] Kan, A. H. G. Rinnooy, G. T. Timmer. 1987. Stochastic global optimizationmethods I: Clustering methods. Mathematical Programming 39 27–56.
[72] Kantorovich, L. V. 1960. Mathematical methods in the organization and planning ofproduction. Management Science 6 366–422.
[73] Karmarkar, N. 1984. A new polynomial time algorithm for linear programming.Combinatorica 4 375–395.
[74] Khachiyan, L. G. 1979. A polynomial time algorithm for linear programming. SovietMathematics Doklady 20 191–194.
[75] Land, A. H., A. G. Doig. 1960. An automatic method for solving discreteprogramming problems. Econometrica 28 497–520.
[76] Lenstra, H. W. 1983. Integer programming with a fixed number of variables.Mathematics of Operations Research 8 538–548.
[77] Liberti, L., C. Lavor, N. Maculan. 2008. A branch-and-bound algorithm for themolecular distance geometry problem. International Transactions in OperationalResearch 15 1–17.
206
[78] Liberti, L., C. Lavor, N. Maculan, M-A. C. Nascimento. 2009. Reformulation inmathematical programming: an application to quantum chemisty. Discrete ApppliedMathematics 6 1309–1318.
[79] Linderoth, J. T., M. W. P. Savelsbergh. 1999. A computational study of strategiesfor mixed integer programming. INFORMS Journal on Computing 11 173–187.
[80] LINDO Systems Inc. 2008. LINGO 11.0 optimization modeling software for linear,nonlinear, and integer programming. Available at http://www.lindo.com.
[81] Louveaux, Q., L. A. Wolsey. 2007. Lifting, superadditivity, mixed integer roundingand single node flow sets revisited. Annals of Operations Research 153 47–77.
[82] Lovasz, L., A. Schrijver. 1991. Cones of matrices and set functions and 0−1optimization. SIAM Journal on Optimization 1 166–190.
[83] Marchand, H., L. A. Wolsey. 1999. The 0−1 knapsack problem with a singlecontinuous variable. Mathematical Programming 85 15–33.
[84] Martin, A. 2001. General mixed integer programming: Computational issuesfor branch-and-cut algorithms. D. Naddef, M. Juenger, eds., ComputationalCombinatorial Optimization. Springer.
[85] McCormick, G. P. 1976. Computability of global solutions to factorable nonconvexprograms: Part I - convex underestimating problems. Mathematical Programming 10147–175.
[86] McCormick, G. P. 1983. Nonlinear Programming: Theory, Algorithms, and Applica-tions . John Wiley and Sons.
[87] Meyer, R. R. 1974. On the existence of optimal solutions to integer andmixed-integer programming problems. Mathematical Programming 7 223–235.
[88] Minkowski, H. 1896. Geometric der zahlen. Working paper.
[89] Mitsos, A., B. Chachuat, P. I. Barton. 2009. McCormick-based relaxations ofalgorithms. SIAM Journal on Optimization 20 573–601.
[90] Mittelmann, H. 2010. Benchmarks for optimization software. Available at http://plato.asu.edu/bench.html.
[91] Nemhauser, G. L., L. A. Wolsey. 1988. Integer and Combinatorial Optimization.Wiley Interscience, New York.
[92] Nesterov, Y., A. Nemirovskii. 1994. Interior-Point Polynomial Algorithms in ConvexProgramming . SIAM.
[93] Neumaier, A. 1997. Molecular modeling of proteins and mathematical prediction ofprotein structure. SIAM Review 39 407–460.
207
[94] Neumaier, A. 2004. Complete search in continuous global optimization andconstraint satisfaction. Acta Numerica 13 271–369.
[95] Padberg, M. W. 1973. On the facial structure of set packing polyhedra. Mathemati-cal Programming 5 199–215.
[96] Padberg, M. W. 1975. A note on zero-one programming. Operations Research 23833–837.
[97] Padberg, M. W., T. J. Van Roy, L. A. Wolsey. 1985. Valid linear inequalities forfixed charge problems. Operations Research 33 842–861.
[98] Richard, J.-P. P., I. R. de Faris, G. L. Nemhauser. 2003. Lifted inequalities for0−1 mixed integer programmming: Basic theory and algorithms. MathematicalProgramming 98 89–113.
[99] Richard, J.-P. P., I. R. de Faris, G. L. Nemhauser. 2003. Lifted inequalities for 0−1mixed integer programmming: Superlinear lifting. Mathematical Programming 98115–143.
[100] Richard, J.-P. P., M. Tawarmalani. 2010. Lifting inequalities: A framework forgenerating strong cuts for nonlinear programs. Mathematical Programming 12161–104.
[101] Rikun, A. D. 1997. A convex envelope formula for multilinear functions. Journal ofGlobal Optimization 10 425–437.
[102] Rockafellar, R. T. 1970. Convex Analysis . Princeton University Press.
[103] Ryoo, H. S., N. V. Sahinidis. 1996. A branch-and-reduce approach to globaloptimization. Journal of Global Optimization 8 107–139.
[104] Ryoo, H. S., N. V. Sahinidis. 2001. Analysis of bounds for multilinear functions.Journal of Global Optimization 19 403–424.
[105] Sahinidis, N. V., M. Tawarmalani. 2005. BARON . The Optimization Firm, LLC,Urbana-Champaign, IL. Available at http://www.gams.com/dd/docs/solvers/baron.pdf.
[106] Savelsbergh, M. W. P. 1994. Preprocessing and probing for mixed integerprogramming problems. ORSA Journal on Computing 6 445–454.
[107] Sawaya, N. W., I. E. Grossmann. 2005. A cutting plane method for solving lineargeneralized disjunctive programming problems. Computers and Chemical Engineer-ing 29 1891–1913.
[108] Schrijver, A. 1986. Theory of Linear and Integer Programming . John Wiley & Sons,Chichester.
208
[109] Shectman, J. P., N. V. Sahinidis. 1998. A finite algorithm for global minimization ofseparable concave programs. Journal of Global Optimization 12 1–36.
[110] Sherali, H. D., W. P. Adams. 1990. A hierarchy of relaxations between thecontinuous and convex hull representations for zero-one programming problems.SIAM Journal on Discrete Mathematics 3 411–430.
[111] Sherali, H. D., S. Sen. 1985. Cuts from combinatorial disjunctions. OperationsResearch 33 928–933.
[112] Shor, N. Z. 1977. Cut-off method with space extension in convex programmingproblems. Cybernetics 13 94–96.
[113] Stubbs, R., S. Mehrotra. 1999. A branch-and-cut method for 0−1 mixed convexprogramming. Mathematical Programming 86 515–532.
[114] Tawarmalani, M., , S. Ahmed, N. V. Sahinidis. 2002. Global optimization of 0−1hyperbolic programs. Journal of Global Optimization 24 385–417.
[115] Tawarmalani, M., , S. Ahmed, N. V. Sahinidis. 2002. Product disaggregation andrelaxations of mixed-integer rational programs. Journal of Global Optimization 24385–417.
[116] Tawarmalani, M. 2001. Mixed integer nonlinear programs: Theory, algorithms, andapplications. Ph.D. thesis, University of Illinois, Urbana-Champaign, IL.
[117] Tawarmalani, M., J.-P. P. Richard, K. Chung. 2008. Strong Valid Inequalities forOrthogonal Disjunctions and Polynomial Covering Sets. Technical Report, KrannertSchool of Management, Purdue University.
[118] Tawarmalani, M., J.-P. P. Richard, K. Chung. 2010. Strong valid inequalities fororthogonal disjunctions and bilinear covering sets. Mathematical ProgrammingForthcoming.
[119] Tawarmalani, M., N. V. Sahinidis. 2001. Semidefinite relaxations of fractionalprograms via novel techniques for constructing convex envelopes of nonlinearfunctions. Journal of Global Optimization 2001 137–158.
[120] Tawarmalani, M., N. V. Sahinidis. 2002. Convex extensions and envelopes of lowersemi-continuous functions. Mathematical Programming 93 247–263.
[121] Tawarmalani, M., N. V. Sahinidis. 2002. Convexification and Global Optimizationin Continuous and Mixed-Integer Nonlinear Programming: Theory, Algorithms,Software, and Applications . Kluwer, Dordrecht, The Netherlands.
[122] Tawarmalani, M., N. V. Sahinidis. 2004. Global optimization of mixed-integernonlinear programs: A theoretical and computational study. Mathematical Program-ming 99 563–591.
209
[123] Tawarmalani, M., N. V. Sahinidis. 2005. A polyhedral branch-and-cut approach toglobal optimization. Mathematical Programming 103 225–249.
[124] Tuy, H. 1985. A concave programming under linear constraints. Doklady AkademicNauk 159 32–35.
[125] Tuy, H. 1987. Global optimization of a difference of two convex functions. Mathe-matical Programming Study 30 150–182.
[126] Tuy, H., T. V. Thieu, N. Q. Thai. 1985. A conical algorithm for globally minimizinga concave function over a closed convex set. Mathematics of Operations Research 10498–514.
[127] Van Roy, T. J., L. A. Wolsey. 1986. Valid inequalities for mixed 0−1 programs.Discrete Applied Mathematics 14 199–213.
[128] Vandenbussche, D., G. L. Nemhauser. 2005. A polyhedral study of nonconvexquadratic programs with box constraints. Mathematical Programming 102 531–557.
[129] Visweswaran, V., C. A. Floudas. 1993. New properties and computationalimprovement of the GOP algorithm for problems with quadratic objective functionsand constraints. Journal of Global Optimization 3 439–462.
[130] Wolsey, L. A. 1975. Faces for a linear inequality in 0−1 variables. MathematicalProgramming 8 165–178.
[131] Wolsey, L. A. 1976. Facets and strong valid inequalities for integer programs.Operations Research 24 362–372.
[132] Wolsey, L. A. 1977. Valid inequalities and superadditivity for 0−1 integer programs.Mathematics of Operations Research 2 66–77.
[133] Wright, M. H. 2004. The interior-point revolution in optimiation: History, recentdevelopments, and lasting consequences. Bulletin of the American MathematicalSociety 42 39–56.
[134] Wu, S.-J., P.-T. Chow. 1995. Genetic algorithms for nonlinear mixed discrete-integeroptimization problems via meta-genetic parameter optimization. EngineeringOptimization 24 137–159.
[135] You, F., I. E. Grossmann. 2009. Mixed-integer nonlinear programming models andalgorithms for supply chain design with stochastic inventory management. Availableat http://www.minlp.org/library/problem/index.php?i=30.
[136] Yudin, D. B., A. S. Nemirovski. 1976. Informational complexity and effectivemethods of solution of convex extremal problems. Economics and MathematicalMethods 12 357–369.
210
[137] Zamora, J. M., I. E. Grossmann. 1999. A branch and contract algorithm forproblems with concave univariate, bilinear and linear fractional terms. Journal ofGlobal Optimization 14 217–249.
[138] Zemel, E. 1978. Lifting the facets of zero-one polytopes. Mathematical Programming15 268–277.
[139] Zeng, B. 2007. Efficient lifting methods for unstructured mixed integer programswith multiple constraints. Ph.D. thesis, Purdue University, West Lafayette, IN.
[140] Zhang, C., H.-P. Wang. 1993. Mixed-discrete nonlinear optimization with simulatedannealing. Engineering Optimization 21 277–291.
[141] Ziegler, G. M. 1998. Lectures on Polytopes . Springer, NY.
[142] Zondervan, E. 2009. A deterministic security constrained unit commitment model.Available at http://www.minlp.org/library/problem/index.php?i=41.
211
BIOGRAPHICAL SKETCH
Kwanghun Chung was born in Iksan, Korea and spent most of his life in Daejeon,
Korea. He received both Bachelor of Science and Master of Science degrees in industrial
engineering from Seoul National University in 1997 and 1999, respectively. After working
as a software engineer in the Research and Development Center at Samsung SDS from
1999 to 2004, he joined the School of Industrial Engineering at Purdue University as a
Ph.D. student in 2004. His doctoral research focuses on developing and improving solution
methodologies for Mixed-Integer Nonlinear Programs. In 2008, he transferred to the
Department of Industrial and Systems Engineering at the University of Florida (UF)
following his advisor, Dr. Richard. After receiving a Doctor of Philosophy degree in the
area of Operations Research from UF in August 2010, he joined the Center for Operations
Research and Econometrics (CORE) at the Universite Catholique de Louvain in Belgium
as a postdoctoral fellow.