STRONG VALID INEQUALITIES FOR MIXED-INTEGER NONLINEAR PROGRAMS VIA DISJUNCTIVE PROGRAMMING AND LIFTING
By
KWANGHUN CHUNG
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2010
© 2010 Kwanghun Chung
To my father, Youngkwan Chung, and my mother, Haeja Hwangbo
ACKNOWLEDGEMENTS
It is my great pleasure to thank all the people who helped me successfully complete
this thesis. First, I would like to deeply thank my advisor, Dr. Jean-Philippe P. Richard,
for advising me with enthusiasm and patience during my Ph.D. study. He always inspired
and encouraged me to pursue my research whenever I was frustrated with difficulties.
While I worked with him, I learned a lot about Operations Research from his knowledge
and about academia from his experience. I regard him as a role model whom I wish to
follow if I work in academia.
I would like to thank my co-advisor, Dr. Mohit Tawarmalani, for his guidance and for the
discussions that made it possible for me to write this thesis. His critical and rigorous way
of thinking motivated me to overcome various obstacles. I am also thankful to Dr. Panos
Pardalos, Dr. J. Cole Smith, and Dr. William Hager, for serving on my committee and
giving me helpful comments to improve the quality of this thesis.
My life as a doctoral student for the last few years has been happy and pleasant
because of many of my friends at Purdue University and the University of Florida. In
particular, I appreciate Seokcheon, Byungcheol, Keumseok, Daiki, Kyungdoh, Sangbok,
and all members of Purdue Korean Industrial Engineers for their help and support, which
eased the pains of research. I also appreciate Chanmin and Youngwoong as
well as my fellow office mates for many kind favors during my stay in Florida.
Finally, I would like to show my gratitude to all of my family, Youngkwan, Haeja,
Hyunjoo, and Jaehun, whose sincere love and constant support are the source of my life.
TABLE OF CONTENTS
page
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
CHAPTER
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1 Mixed-Integer Nonlinear Program (MINLP) . . . . . . . . . . . . . . . . 12
1.1.1 Models and Applications . . . . . . . . . . . . . . . . . . . . . 12
1.1.2 Solution Methodologies to Global Optimization . . . . . . . . . . 14
1.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2.1 Well-Solved Optimization Problems . . . . . . . . . . . . . . . . 15
1.2.2 Relaxations and Convexifications . . . . . . . . . . . . . . . . . 20
1.3 Branch-and-Cut in MINLP . . . . . . . . . . . . . . . . . . . . . . . . 23
1.3.1 Bounding Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.2 Branching Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.3 Cutting Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.3.4 Domain Reduction . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4 Outline of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . 28
2 CONVEX RELAXATIONS IN MILP and MINLP . . . . . . . . . . . . . . . . . 30
2.1 Convexification Methods in MINLP . . . . . . . . . . . . . . . . . . . 30
2.1.1 Convex Envelopes and Convex Extensions . . . . . . . . . . . . . 31
2.1.2 Reformulation and Relaxation . . . . . . . . . . . . . . . . . . . 33
2.2 Cutting Plane Techniques for Mixed-Integer Linear Program (MILP) . . . 37
2.2.1 Disjunctive Programming . . . . . . . . . . . . . . . . . . . . . 40
2.2.2 Lifting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.2.2.1 Sequential lifting . . . . . . . . . . . . . . . . . . . . 47
2.2.2.2 Sequence-independent lifting . . . . . . . . . . . . . . . 50
3 MOTIVATION AND RESEARCH STATEMENTS . . . . . . . . . . . . . . . . 51
3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Problem Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.2.1 Strong Valid Inequalities for Orthogonal Disjunctions and Bilinear Covering Sets . . . 54
3.2.2 Lifted Inequalities for 0-1 Mixed-Integer Bilinear Covering Sets with Bounded Variables . . . 55
4 STRONG VALID INEQUALITIES FOR ORTHOGONAL DISJUNCTIONS AND BILINEAR COVERING SETS . . . 56
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Convexification of Orthogonal Disjunctive Sets . . . . . . . . . . . . 58
4.3 Convex Extension Property . . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5 LIFTED INEQUALITIES FOR 0-1 MIXED-INTEGER BILINEAR COVERING SETS . . . 106
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.2 Basic Polyhedral Results . . . . . . . . . . . . . . . . . . . . . . . . 109
5.3 Lifted Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.3.1 Sequence-Independent Lifting for Bilinear Covering Sets . . . . . 121
5.3.2 Lifted Inequalities by Sequence-Independent Lifting . . . . . . . 123
5.3.2.1 Lifted bilinear cover inequalities . . . . . . . . . . . . 127
5.3.2.2 Lifted reverse bilinear cover inequalities . . . . . . . . 137
5.3.3 Inequalities through Approximate Lifting . . . . . . . . . . . . . 144
5.4 New Facet-Defining Inequalities for a Single-node Flow Model . . . . . 154
5.5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6 A COMPUTATIONAL STUDY OF LIFTED INEQUALITIES FOR 0-1 BILINEAR COVERING SETS . . . 164
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.2 Generalization to Bilinear Constraints with Linear Terms . . . . . . . 164
6.2.1 Generalized Lifted Bilinear Cover Inequalities . . . . . . . . . . 172
6.2.2 Generalized Lifted Reverse Bilinear Cover Inequalities . . . . . . 179
6.3 Preliminary Computational Study . . . . . . . . . . . . . . . . . . . . 182
6.3.1 Computational Environments . . . . . . . . . . . . . . . . . . . . 183
6.3.2 Testing Instances . . . . . . . . . . . . . . . . . . . . . . . . 183
6.3.3 Separation Procedures . . . . . . . . . . . . . . . . . . . . . . 185
6.3.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . 189
6.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7 CONCLUSIONS AND FUTURE RESEARCH . . . . . . . . . . . . . . . . . . . 194
7.1 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . 194
7.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
APPENDIX
A LINEAR DESCRIPTION OF THE CONVEX HULL OF A BILINEAR SET . . 197
B LINEAR DESCRIPTION OF THE CONVEX HULL OF A FLOW SET . . . . 200
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
LIST OF TABLES
Table page
6-1 Parameters of the random instances for three test sets . . . . . . . . . . . . . . 184
6-2 Characteristics of the three test sets . . . . . . . . . . . . . . . . . . . . . . . . . 185
6-3 Objective values to the test instances . . . . . . . . . . . . . . . . . . . . . . . . 186
6-4 Performance of lifted cuts on small size instances . . . . . . . . . . . . . . . . . 190
6-5 Performance of lifted cuts on medium size instances . . . . . . . . . . . . . . . . 191
6-6 Performance of lifted cuts on large size instances . . . . . . . . . . . . . . . . . . 192
LIST OF FIGURES
Figure page
1-1 Branch-and-Cut framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2-1 Cutting plane algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3-1 Geometric illustration of S, conv(S), S1 and S2 . . . . . . . . . . . . . . . . . . 52
4-1 Illustration of Theorem 4.1 with (a) J1 ≠ ∅, J2 ≠ ∅ (b) J2 = ∅ (c) J1 = J2 = ∅ . 70
4-2 Facet-defining inequalities for conv(B_i^I) . . . . . . . . . . . . . . . . . . 93
5-1 Lifting function PC(w) of (5–44) . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5-2 Deriving lifting coefficients for Example 5.3 . . . . . . . . . . . . . . . . . . . . 134
5-3 Deriving lifting coefficients for Example 5.5 . . . . . . . . . . . . . . . . . . . . 144
5-4 A valid subadditive approximation Ψ(w) of Φ(w) for Example 5.6. . . . . . . . . 151
6-1 Lifting function LC(w) of (6–19) . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6-2 Deriving lifting coefficients for Example 6.1 . . . . . . . . . . . . . . . . . . . . 176
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
STRONG VALID INEQUALITIES FOR MIXED-INTEGER NONLINEAR PROGRAMS VIA DISJUNCTIVE PROGRAMMING AND LIFTING
By
Kwanghun Chung
August 2010
Chair: Jean-Philippe P. Richard
Major: Industrial and Systems Engineering
Mixed-Integer Nonlinear Programs (MINLP) are optimization problems that have
found applications in virtually all sectors of the economy. Although these models can be
used to design and improve a large array of practical systems, they are typically difficult
to solve to global optimality. In this thesis, we introduce new tools for the solution of such
problems. In particular, we develop new procedures to construct convex relaxations of
certain MINLP problems. These relaxations are stronger than those currently known for
these problems and therefore provide improvements in the solution of MINLPs through
branch-and-bound techniques. There are three main components to our contributions.
First, we derive a closed-form characterization of the convex hull of a generic
nonlinear set, when the convex hull of this set is completely determined by orthogonal
restrictions of the original set. Although the tools used in our derivation include
disjunctive programming and convex extensions, our characterization does not introduce
additional variables. We develop and apply a toolbox of results to check the technical
assumptions under which this convexification tool can be employed. We demonstrate
its applicability in integer programming by providing an alternate derivation of the
split cut for mixed-integer polyhedral sets and by finding the convex hull of various
mixed/pure-integer bilinear sets. We then develop a key result that extends the utility
of the convexification tool to relaxing nonconvex inequalities, which are not naturally
disjunctive, by providing sufficient conditions for establishing the convex extension
property over the non-negative orthant. We illustrate the utility of this result by deriving
the convex hull of a continuous bilinear covering set over the non-negative orthant.
Second, we study the 0−1 mixed-integer bilinear covering set. We show that the
convex hull of this set is polyhedral and we provide characterizations for its trivial
facets. We also obtain a complete convex hull description when it contains only two
pairs of variables. We then derive three families of facet-defining inequalities via
sequence-independent lifting techniques. Two of these families have an exponential
number of members. Next, we relate the polyhedral structure of the 0−1 mixed-integer
bilinear covering set to that of certain single-node flow sets. As a result, we obtain
new facet-defining inequalities for flow sets that generalize well-known lifted flow cover
inequalities from the integer programming literature.
Third, we evaluate the strength of the lifted inequalities we derive for 0−1 mixed-integer
bilinear covering sets inside of a branch-and-cut framework. To this end, we first generalize
our theoretical results to bilinear covering sets that have additional linear terms. We
then present separation techniques for lifted inequalities and report computational results
obtained when using these procedures on several families of randomly generated problems.
CHAPTER 1
INTRODUCTION
In this chapter, we give a brief overview of Mixed-Integer Nonlinear Programming
models and their applications. We then describe general methodologies to solve them.
After discussing basic concepts in mathematical programming, we describe in more detail
the branch-and-bound approach to MINLP. We conclude this chapter by describing the
overall structure of this thesis.
1.1 Mixed-Integer Nonlinear Program (MINLP)
1.1.1 Models and Applications
A Mixed-Integer Nonlinear Program (MINLP) is an optimization problem of the form:
min f(x)
(P ) s.t. gi(x) ≤ 0 ∀i ∈ M,
xj ∈ Z+ ∀j ∈ I ⊆ N := {1, . . . , n},
xj ∈ R+ ∀j ∈ N \ I,
where
1. f : Rn → R,
2. gi : Rn → R, ∀i ∈ M.
Throughout the thesis, we restrict our attention to problems (P ), where the functions f
and gi are continuous and factorable.
Definition 1.1 (Factorable Function [89]). A function is factorable if it is defined by
a finite recursive composition of binary sums, binary products, and a given collection of
univariate intrinsic functions.
For example, the function f(x) = x1e^{x2} + cos(x1 + x2)x3 is factorable since it can be
expressed as

f(x) = S( P(x1, h1(x2)), P(h2(S(x1, x2)), x3) ),
where
• S(x, y) = x+ y represents a binary sum,
• P(x, y) = x ∗ y represents a binary product,
• h1(x) = ex and h2(x) = cos(x) are intrinsic univariate functions.
The class of factorable functions contains most functions encountered in practical
applications; see McCormick [86].
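The recursive structure of Definition 1.1 is easy to mirror in code. The following sketch is ours, not from the text: it evaluates the example function above through its factorable decomposition, with `S`, `P`, `h1`, and `h2` named after the notation of the example.

```python
import math

def S(x, y):          # binary sum
    return x + y

def P(x, y):          # binary product
    return x * y

def f(x1, x2, x3):
    # f(x) = x1*e^{x2} + cos(x1 + x2)*x3, written as the recursive composition
    # f(x) = S(P(x1, h1(x2)), P(h2(S(x1, x2)), x3))
    h1 = math.exp     # intrinsic univariate function e^x
    h2 = math.cos     # intrinsic univariate function cos(x)
    return S(P(x1, h1(x2)), P(h2(S(x1, x2)), x3))

# the composition agrees with the closed form
assert abs(f(1.0, 2.0, 3.0) - (1.0 * math.exp(2.0) + math.cos(3.0) * 3.0)) < 1e-12
```

Global solvers exploit exactly this decomposition: each intermediate node (sum, product, or univariate function) can be relaxed separately.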
We refer to x ∈ Rn as the decision variables of (P ). We refer to f(x) as the objective
function of (P ) and to gi(x) ≤ 0 for i ∈ M as the constraints of (P ). If there are no
constraints (i.e., M = ∅), we say that problem (P ) is unconstrained.
We define
S := { x ∈ Z+^|I| × R+^(n−|I|) | gi(x) ≤ 0 ∀i ∈ M }
to be the feasible region of (P ). A vector x ∈ S is said to be a feasible solution of (P ).
Further, problem (P ) is said to be feasible if S ≠ ∅, and infeasible if S = ∅.
The goal of problem (P ) is to find a vector x∗ ∈ S, called a (globally) optimal solution
of (P ), whose objective value is minimal over the set S, i.e.,

f(x∗) ≤ f(x) ∀x ∈ S.
We refer to f(x∗) as the optimal value of (P ). A vector x̄ ∈ S is said to be locally optimal
if there exists an ε > 0 such that

f(x̄) ≤ f(x) ∀x ∈ S ∩ { x ∈ Rn | ‖x − x̄‖ ≤ ε }.
In this thesis, we will use the terms optimal and globally optimal interchangeably.
When f is linear (i.e., f(x) = cTx) and all of the functions gi are affine (i.e.,
gi(x) = (ai)Tx + bi), (P ) is said to be a Mixed-Integer Linear Program (MILP). When
I = ∅, (P ) is referred to as a Linear Program (LP). When I = N , (P ) is said to be a Pure
Integer Program (IP). Finally, when all variables xj for j ∈ I are restricted to be binary,
(P ) is commonly known as a 0−1 Mixed-Integer Linear Program or Binary Mixed-Integer
Linear Program (BMILP). While LP problems can be solved in polynomial-time, solving
general MILPs is NP-hard; see Cook [35]. Note however that when the number of variables
is fixed, Lenstra [76] describes a polynomial-time algorithm for IP.
In MINLP models, continuous variables are typically used to represent physical
quantities while binary variables are used to describe managerial decisions. Functions f(x)
and gi(x) are used to capture the (possibly nonlinear) physical relations between these
variables. As a result, MINLP problems arise in a wide variety of practical applications
and are used to model decision problems in business and engineering. Successful
applications of MINLP can be found in a number of fields such as telecommunication
networks [25], supply chain design and management [135], portfolio optimization [39],
chemical processes [27, 53, 70], protein folding [93], molecular biology [77], quantum
chemistry [78], and unit commitment problems [142].
1.1.2 Solution Methodologies to Global Optimization
Global optimization of MINLPs is typically difficult when (1) there are integrality
restrictions on a subset of variables (i.e., I 6= ∅) and (2) there are nonconvex functions (see
definition in Section 1.2.1) either in the objective or in the constraints. General solution
methodologies to obtain globally optimal solutions for MINLPs can be classified as either
deterministic or stochastic; see Neumaier [94] for a survey of existing solution methods.
Deterministic algorithms include branch-and-bound [51, 85, 103], outer-approximation
[48, 65, 68], cutting planes [124, 126], and decomposition [125, 129]. Stochastic approaches
include random search [140], genetic algorithms [134], and clustering algorithms [71].
For detailed presentations of these approaches, we refer the interested reader to the
books of Horst and Pardalos [67] and Horst and Tuy [69]. In this thesis, we will focus on
branch-and-bound approaches for MINLP.
1.2 Preliminaries
In this section, we briefly review fundamental results in mathematical programming
that are used throughout this thesis.
1.2.1 Well-Solved Optimization Problems
Since MINLP is known to be NP-hard, it is unlikely that we will ever be able to
design an algorithm that solves all instances of (P ) to global optimality in polynomial
time. However, there are families of problems (P ) that can be solved efficiently. We
introduce two such families next. To this end, we first recall the notions of convex set
and convex function.
Definition 1.2 (Convex Combination). Let x1, . . . , xp be vectors in Rn. We refer to any
point x obtained as ∑_{j=1}^{p} λjxj, where λj ∈ R+ for j = 1, . . . , p and ∑_{j=1}^{p} λj = 1, as a
convex combination of x1, . . . , xp.
Definition 1.3 (Convex Set). A set S ⊆ Rn is said to be convex if, ∀x1, x2 ∈ S, all convex
combinations of x1 and x2 belong to S, i.e.,
λx1 + (1− λ)x2 ∈ S, ∀λ ∈ [0, 1].
Definition 1.4 (Convex Function). Let S be a nonempty convex subset of Rn. A function
f : S → R is said to be convex if, ∀x1, x2 ∈ S,

f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2), ∀λ ∈ [0, 1].
Convex sets and convex functions can be related in different ways; see Section 3.1 of
Bazaraa et al. [24] for a textbook discussion. We present one such relation next.
Definition 1.5 (Level Set). Given a function f : Rn → R and a scalar α ∈ R, we refer to
the set

Sα = { x ∈ Rn | f(x) ≤ α }

as the α level set of f .
Proposition 1.1. Let f : Rn → R be a convex function. Then, the α level set of f is a
convex set for each value of α ∈ R. Indeed, if x1, x2 ∈ Sα, then convexity of f implies
f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2) ≤ α for all λ ∈ [0, 1], so λx1 + (1 − λ)x2 ∈ Sα.
We now focus on a subfamily of problems (P ) of the form
min f(x)
(CP ) s.t. gi(x) ≤ 0 ∀i ∈ M,
xj ∈ R+ ∀j ∈ N,
where the functions f(x) and gi(x) for i ∈ M are convex. Proposition 1.1 implies that the
feasible region of (CP ) is a convex set since the intersection of convex sets is convex. We
refer to such problems as convex programs. While (CP ) typically cannot be solved with an
analytical formula, it has many good properties that make finding a globally optimal
solution easier than for general problems (P ). In particular, it can be shown that
every locally optimal solution of (CP ) is also globally optimal. There are various methods
to solve (CP ) to global optimality; see Boyd and Vandenberghe [29] and Nesterov and
Nemirovskii [92]. We briefly comment on two of these methods.
The ellipsoid method was formally developed by Yudin and Nemirovski [136] although
similar ideas had been introduced earlier by Shor [112]. The ellipsoid method generates
a sequence of ellipsoids of decreasing volume, each containing an optimal solution of
the problem. At each iteration, the algorithm splits the current ellipsoid in half and uses
problem information to determine which half of the ellipsoid contains an optimal solution.
A new ellipsoid (of smaller volume) is then built around the selected half-ellipsoid and the
process is iterated.
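As a deliberately tiny illustration of these updates, the sketch below (our own, not from the text) runs the standard central-cut ellipsoid iteration in R², using exact gradients as the cut oracle on a hypothetical quadratic test function; `ellipsoid_min` and its arguments are our names.

```python
import math

def ellipsoid_min(f, grad, c, A, iters=200):
    """Central-cut ellipsoid method in R^2 (illustrative sketch).
    E = {x : (x - c)^T A^{-1} (x - c) <= 1} always contains a minimizer;
    the gradient cut g^T (x - c) <= 0 selects the half-ellipsoid to keep."""
    n = 2
    best = c[:]
    for _ in range(iters):
        if f(c) < f(best):
            best = c[:]
        g = grad(c)
        Ag = [A[0][0]*g[0] + A[0][1]*g[1], A[1][0]*g[0] + A[1][1]*g[1]]
        gAg = g[0]*Ag[0] + g[1]*Ag[1]
        if gAg <= 1e-16:                       # gradient numerically zero: done
            break
        b = [Ag[0]/math.sqrt(gAg), Ag[1]/math.sqrt(gAg)]
        # shift the center into the kept half-ellipsoid ...
        c = [c[0] - b[0]/(n + 1), c[1] - b[1]/(n + 1)]
        # ... and shrink the shape matrix around the new center
        coef = n*n/(n*n - 1.0)
        A = [[coef*(A[i][j] - 2.0/(n + 1)*b[i]*b[j]) for j in range(n)]
             for i in range(n)]
    return best

# minimize f(x) = (x1 - 1)^2 + (x2 + 2)^2, starting from a ball of radius 10
x = ellipsoid_min(lambda v: (v[0] - 1)**2 + (v[1] + 2)**2,
                  lambda v: [2*(v[0] - 1), 2*(v[1] + 2)],
                  [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]])
```

The volume of the ellipsoid shrinks by a constant factor per iteration, which is the source of the method's polynomial-time guarantees.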
Interior point algorithms form another family of solution approaches for convex
programs. The idea originates from the work of Fiacco and McCormick [52] in the
1960s. In this approach, barrier functions are added to the objective to account for the
feasible region of the problem (CP ). Although progress on these techniques
remained limited through the 1980’s, the discovery of a polynomial-time algorithm for
linear programs by Karmarkar [73] led to a revival of interest in barrier methods. In
particular, Nesterov and Nemirovskii [92] later showed that polynomial time convergence
can be achieved for any convex program that can be equipped with an easily computable
self-concordant barrier function. Simple self-concordant barriers are known for many
convex programs; see Nesterov and Nemirovskii [92]. As a result, convex programs are
typically thought to be simple optimization problems to solve.
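The barrier idea can be seen on a one-dimensional toy problem of our own construction: minimize x subject to 1 ≤ x ≤ 3 by minimizing t·x − ln(x − 1) − ln(3 − x) for increasing barrier weights t. Since the derivative of this barrier objective is increasing in x, each subproblem can be solved by bisection; `barrier_minimizer` is a hypothetical name.

```python
def barrier_minimizer(t, lo=1.0, hi=3.0, tol=1e-10):
    """Minimize t*x - ln(x - lo) - ln(hi - x) over (lo, hi) by bisecting on
    its derivative t - 1/(x - lo) + 1/(hi - x), which is increasing in x."""
    def dphi(x):
        return t - 1.0/(x - lo) + 1.0/(hi - x)
    a, b = lo + 1e-12, hi - 1e-12   # dphi(a) < 0 < dphi(b)
    while b - a > tol:
        m = 0.5*(a + b)
        if dphi(m) < 0:
            a = m
        else:
            b = m
    return 0.5*(a + b)

# the "central path": as t grows, the barrier minimizer approaches the true
# solution x* = 1 of  min x  s.t.  1 <= x <= 3
path = [barrier_minimizer(t) for t in (1.0, 10.0, 100.0, 1000.0)]
```

Practical interior-point methods follow this path far more cleverly (Newton steps, careful updates of t), but the mechanism — a barrier that fades as the weight on the true objective grows — is the same.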
A particularly simple type of convex program is the linear programming problem, a
variant of (P ) of the form:
min cTx
(LP ) s.t. Ax ≤ b
x ∈ Rn+.
Before we discuss algorithms to solve LPs, we introduce some basic concepts of polyhedral
theory that we will use later in this thesis.
Definition 1.6 (Polyhedron and Polytope). A polyhedron Q ⊆ Rn is a set of points in Rn
that can be described as the intersection of a finite number of half-spaces, i.e.,
Q = { x ∈ Rn | Ax ≤ b }, (1–1)

where A ∈ Rm×n and b ∈ Rm. A polyhedron is said to be bounded if there exists M ∈ R+
such that sup{ ‖x‖ | x ∈ Q } < M . We typically refer to a bounded polyhedron as a
polytope.
It is clear that the feasible region of an LP is a polyhedron. When studying MILPs,
we will typically consider rational polyhedra, i.e., polyhedra that can be defined with
A ∈ Qm×n and b ∈ Qm. When studying a polyhedron, some feasible solutions are of
particular interest.
Definition 1.7 (Extreme Point). A point x in a polyhedron Q is said to be an extreme
point of Q if whenever x = (1/2)x1 + (1/2)x2 for some x1, x2 ∈ Q, then x = x1 = x2.
Definition 1.8 (Extreme Ray). Given the nonempty polyhedron Q defined in (1–1), we
define the recession cone of Q as

Q0 = { r ∈ Rn | Ar ≤ 0 }.

A non-zero vector r in Q0 is said to be a ray of Q. Further, a ray r is said to be an
extreme ray of Q if whenever r = (1/2)r1 + (1/2)r2 for some r1, r2 ∈ Q0, then r = r1 = r2.
Polyhedra can be represented using extreme points and extreme rays as presented in
Theorem 1.1.
Theorem 1.1 (Minkowski’s Theorem [88]). If Q is a nonempty polyhedron as defined in
(1–1) and rank(A) = n, then
Q = { x ∈ Rn | x = ∑_{k∈K} λk x^k + ∑_{j∈J} μj r^j, ∑_{k∈K} λk = 1, λk ≥ 0 ∀k ∈ K, μj ≥ 0 ∀j ∈ J },
where {xk}k∈K is the set of extreme points of Q and {rj}j∈J is the set of extreme rays of
Q.
Using Theorem 1.1, we can easily verify the following result.
Theorem 1.2. If (LP ) has an optimal solution, then at least one of the extreme points of
Q must be an optimal solution.
Using the fact that an optimal solution to (LP ) can be found among the extreme
points of its feasible region, Dantzig [44] developed the first algorithm to solve
general LPs in 1947: the simplex algorithm. We mention that Kantorovich had proposed earlier in
1939 a method to solve a restricted form of LPs; see [72] for a translation.
The simplex algorithm relies on the observation that every extreme point of LPs of
the form
(LP ′)  min{ cTx | Ax = b, x ∈ Rn+ }, (1–2)
can be computed as
xB = A_B^{-1} b, xN̄ = 0, (1–3)

where B ⊆ N , N̄ = N \ B, AB is an invertible submatrix of A formed by the columns of
A corresponding to B, and A_B^{-1} b ≥ 0. In (1–3), the variables xj for j ∈ B are called basic
while the variables xj for j ∈ N̄ are called nonbasic.
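For a concrete instance of (1–3), the sketch below (ours; the 2 × 4 standard-form system is hypothetical) computes the basic solution associated with a chosen basis, using an explicit 2 × 2 inverse so that no linear-algebra library is needed.

```python
def basic_solution(A, b, basis):
    """Given a 2-row standard-form system Ax = b and a basis (two column
    indices), return the basic solution x_B = A_B^{-1} b, x_N = 0."""
    # extract the 2x2 basis matrix A_B column by column
    AB = [[A[i][j] for j in basis] for i in range(2)]
    det = AB[0][0]*AB[1][1] - AB[0][1]*AB[1][0]
    assert det != 0, "basis matrix must be invertible"
    # apply the explicit 2x2 inverse to b
    xB = [( AB[1][1]*b[0] - AB[0][1]*b[1]) / det,
          (-AB[1][0]*b[0] + AB[0][0]*b[1]) / det]
    x = [0.0] * len(A[0])
    for idx, j in enumerate(basis):
        x[j] = xB[idx]
    return x

# Ax = b with A = [[1,1,1,0],[1,3,0,1]], b = [4,6]: the basis {x1, x2}
# yields the extreme point (3, 1, 0, 0), feasible since x_B >= 0.
A = [[1.0, 1.0, 1.0, 0.0], [1.0, 3.0, 0.0, 1.0]]
x = basic_solution(A, [4.0, 6.0], [0, 1])
```

Choosing the slack basis `[2, 3]` instead gives the basic feasible solution (0, 0, 4, 6); the simplex method pivots between such bases.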
The simplex algorithm searches for an optimal solution of (LP ′) by creating a
sequence of bases B1,B2, . . . ,Bk such that (i) |Bi ∩ Bi+1| = |Bi| − 1 = |Bi+1| − 1
and (ii) the basic solutions corresponding to the Bi are feasible with nonincreasing
objective values. The operation of moving from one basis to the next is called pivoting.
Using an appropriate pivoting strategy such as Bland’s rule [28], the simplex
algorithm obtains an optimal solution to (LP ′) in a finite number of iterations. Over
the years, many different pivoting rules have been developed but none of them has been
shown to provide a polynomial-time algorithm to solve LPs. Nonetheless, the simplex
algorithm is typically very efficient at solving practical LP problems. In 1979, Khachiyan
[74] proposed the first polynomial time algorithm for LPs. This algorithm is a specialized
variant of the ellipsoid algorithm. Although the practical performance of this algorithm is
poor, it is remarkable that it does not depend directly on the number of constraints the
LP has. This feature has important consequences in the study of integer programs that
we will comment on in Section 2.2. Karmarkar [73] introduced the first algorithm for the
solution of LPs that has good performance in both theory and practice. Improvements
and variants of this algorithm were subsequently discovered; see Wright [133]. Nowadays,
commercial software such as CPLEX [40] uses a combination of the simplex algorithm and
interior point methods to solve LPs and can solve large instances of practical problems
very quickly; see Mittelmann [90]. As a result, LP solvers can be used as the workhorse
for the solution of other more difficult problems.
1.2.2 Relaxations and Convexifications
One of the ways to prove that z is the optimal value of (P ) is to show that z is both a
lower and an upper bound on the optimal value z∗. Upper bounds (also called primal
bounds) can be obtained from any feasible solution xF ∈ S since z∗ ≤ f(xF ). To
obtain tight upper bounds, we need to find good feasible solutions, which can be difficult
depending on the original problem (P ). Heuristic approaches are typically used for this
purpose. Finding lower bounds (also called dual bounds) requires other techniques. A
common approach is to use relaxations. We give a formal definition of relaxation next.
Definition 1.9 (Relaxation). Given an optimization problem
(P ) z∗ = min{f(x) | x ∈ S},
the related optimization problem
(RP ) zR = min{f̄(x) | x ∈ R}

is said to be a relaxation of (P ) if

1. S ⊆ R,

2. f(x) ≥ f̄(x) for all x ∈ S.
Definition 1.9 states that relaxations can be obtained in two ways: (i) by enlarging
the feasible region S and/or (ii) by underestimating the objective function f(x) over S.
Lower bounds can be obtained by solving relaxations, as the following result shows.
Proposition 1.2. If (RP ) is a relaxation of (P ), then zR ≤ z∗.
Although optimal solutions of relaxations are not always optimal for the original
problem, they sometimes are. The following result handles this issue.
Proposition 1.3. If x∗ is an optimal solution of (RP ), x∗ ∈ S, and f̄(x∗) = f(x∗), then x∗
is an optimal solution of (P ).
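A toy numerical illustration of Proposition 1.2 (our example, not the text's): dropping an integrality restriction enlarges the feasible region, so the relaxation value can only be lower than the true optimal value.

```python
# minimize f(x) = (x - 1.5)^2 over S = {0, 1, ..., 5}  (original problem)
# versus its relaxation over R = [0, 5]  (integrality dropped, S ⊆ R)
def f(x):
    return (x - 1.5)**2

z_star = min(f(x) for x in range(6))   # integer optimum 0.25, at x = 1 or x = 2
z_R = f(1.5)                           # relaxation optimum 0.0, at x = 1.5
assert z_R <= z_star                   # Proposition 1.2: z_R is a lower bound
```

Here the relaxation bound is strict, and its minimizer x = 1.5 is infeasible for the original problem, so Proposition 1.3 does not apply; this is exactly the situation that forces branching in branch-and-bound.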
The derivation of a relaxation is particularly useful if the problem associated with
the relaxation is substantially easier to solve than the original problem and the relaxation
value zR is close to z∗. Given that convex programs are typically easy to solve, it
makes sense to study how to construct convex relaxations of optimization problems. To
obtain the tightest possible relaxation bound, it is best to replace S by the smallest convex
set that contains S, called the convex hull of S. An alternate definition is as follows.
Definition 1.10 (Convex Hull). Let S ⊆ Rn. We refer to the set of all convex combina-
tions of points in S, which we denote by conv(S), as the convex hull of S.
When underestimating the objective function f(x), it is also clear that, to obtain
the tightest relaxation possible, we should replace f(x) with the tightest convex lower
approximation of f(x), which is commonly known as convex envelope; see Falk [49],
Rockafellar [102], Horst [66], and Horst and Tuy [69].
Definition 1.11 (Convex Envelope). Let S ⊆ Rn be convex and compact, and let
f : S → R be lower semi-continuous on S. A function convenv(f) : S → R is called the
convex envelope of f on S if it satisfies

1. convenv(f)(x) is convex on S,

2. convenv(f)(x) ≤ f(x), ∀x ∈ S,

3. there is no function g : S → R satisfying Conditions 1 and 2 with convenv(f)(x) < g(x)
for some x ∈ S.
Note that it is easily seen from Condition 3 that the convex envelope is uniquely
determined, if it exists.
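As a small worked example of our own: for a concave function on an interval, the convex envelope is the secant line through the endpoints, and the defining conditions can be checked numerically.

```python
# For the concave function f(x) = -x^2 on S = [0, 1], the convex envelope
# is the secant line through (0, f(0)) and (1, f(1)), i.e. convenv(f)(x) = -x.
def f(x):
    return -x * x

def convenv(x):       # secant line through the endpoints of [0, 1]
    return f(0.0) + (f(1.0) - f(0.0)) * x

pts = [i / 100.0 for i in range(101)]
# Condition 2: the envelope underestimates f everywhere on S
assert all(convenv(x) <= f(x) + 1e-12 for x in pts)
# the envelope touches f at the endpoints, where no tighter convex
# underestimator can do better
assert convenv(0.0) == f(0.0) and convenv(1.0) == f(1.0)
```

The same picture drives factorable relaxations: each concave intermediate term is replaced by a secant-type envelope, each convex term is kept as is.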
In theory, a very strong relaxation of the problem (P ) can be obtained by replacing
the feasible region S with conv(S) and the objective function f(x) with convenv(f(x)).
However, such a construction is typically not practical as deriving convex hulls of
nonconvex sets and convex envelopes of nonconvex functions is often difficult. Therefore,
simpler and weaker convex relaxations are typically derived. We call these relaxations
convexifications.
Definition 1.12 (Convexification). Given a nonconvex problem

(NCP )  z∗ = min f(x)  s.t. x ∈ S,

the problem

(CRP )  zR = min f̄(x)  s.t. x ∈ R,

is said to be a convexification of (NCP ) if

1. f̄(x) is a convex underestimator of f(x),

2. S ⊆ R and R is convex.
In Definition 1.12, we require a convexification to have both a convex objective
function and a convex feasible region. It therefore can be solved by a variety of algorithms.
Since LPs can be solved extremely efficiently, it is often helpful to require in Definition 1.12
that f̄(x) be linear and that R be polyhedral. If so, we refer to the resulting convexification
as a linearization. Since every convex set can be represented as an intersection of
(possibly infinitely many) half-spaces, a linearization can always be constructed from
a convexification. Linearizations have been typically preferred to convexifications in
commercial solvers (see Adjiman et al. [3], LINDO Systems Inc. [80], Sahinidis and
Tawarmalani [105], and Belotti et al. [26]) because they tend to be faster and algorithms
are more stable. An example of convexification for MILPs of the form
(MILP )  min cTx  s.t. Ax = b,  x ∈ Z+^|I| × R+^(n−|I|),

is the linear program

(RMILP )  min cTx  s.t. Ax = b,  x ∈ Rn+,
obtained by dropping the integrality restrictions on the variables xj for j ∈ I. This relaxation
is called the LP relaxation of (MILP ). We will describe general methods to generate
convexifications for MILPs and MINLPs in Chapter 2. For detailed discussions about
convexification techniques, we refer the interested reader to the book by Tawarmalani and
Sahinidis [121].
1.3 Branch-and-Cut in MINLP
For a nonconvex MINLP problem where f and/or gi are nonconvex, finding
globally optimal solutions is a challenging problem that has attracted much attention.
Branch-and-bound is one of the methods described in Section 1.1.2 that have been
proposed for solving this problem. Branch-and-bound methods are implicit enumeration
techniques based on the divide-and-conquer strategy and the concept of convexification.
A globally optimal solution of the convexification is first obtained. If it satisfies the
conditions of Proposition 1.3, it is optimal for the problem. Otherwise, the relaxed
solution only yields a lower bound on z∗. When this happens, the feasible region is divided
into non-overlapping subsets for which stronger convex relaxations can be built. An
optimal solution to the initial problem can then be obtained by selecting the best among
the globally optimal solutions of the subproblems. Since subproblems are likely to be
nonconvex problems, globally optimal solutions of these subproblems are obtained by
applying the procedure recursively. As a result, a tree of subproblems is created that is
called branch-and-bound tree. There are three cases when the branch-and-bound search of
a current node is stopped, an operation known as fathoming of a node:
1. the relaxation is infeasible,
2. the objective value of the current relaxation is larger than the value of a known
feasible solution,
3. the solution of the relaxation is globally optimal for subproblems; see Proposition 1.3.
The branch-and-bound process terminates when all nodes are fathomed (i.e., when the
lower bound zL is equal to the upper bound zU). In MILP, this process is finite (i.e.,
zL = zU occurs in a finite number of steps) and is convergent (i.e., zU − zL → 0) when
variables are bounded. In MINLP, for a given tolerance ε > 0, the search process typically
terminates when zU − zL ≤ ε. Provided that the convexification used in the tree is finitely
consistent, i.e., any unfathomed partition can be further refined at every iteration, the
branch-and-bound process can terminate after finitely many steps; see Horst and Tuy [69].
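The fathoming rules above can be made concrete with a minimal spatial branch-and-bound sketch for a univariate nonconvex problem. It uses an αBB-style convex underestimator f(x) − (α/2)(x − l)(u − x), where α is an assumed bound on |f''|; the test function, α value, and tolerances are illustrative choices, not taken from the thesis:

```python
import math

def f(x):                       # nonconvex objective to be minimized on [l, u]
    return math.sin(x) + math.sin(10.0 * x / 3.0)

ALPHA = 13.0                    # assumed upper bound on |f''| (here 1 + 100/9)

def underestimate(x, l, u):
    # alphaBB-style convex underestimator of f on [l, u]
    return f(x) - 0.5 * ALPHA * (x - l) * (u - x)

def min_convex(g, l, u, tol=1e-7):
    # ternary search: minimizes a univariate convex function on [l, u]
    while u - l > tol:
        m1, m2 = l + (u - l) / 3.0, u - (u - l) / 3.0
        if g(m1) < g(m2):
            u = m2
        else:
            l = m1
    return g(0.5 * (l + u))

def branch_and_bound(l, u, eps=1e-4):
    z_up = min(f(l), f(u))                  # incumbent (upper bound z_U)
    nodes = [(l, u)]
    while nodes:
        a, b = nodes.pop()
        z_up = min(z_up, f(0.5 * (a + b)))  # cheap feasible point
        z_lo = min_convex(lambda x: underestimate(x, a, b), a, b)
        if z_lo >= z_up - eps:              # fathom by bound
            continue
        m = 0.5 * (a + b)                   # bisection branching
        nodes += [(a, m), (m, b)]
    return z_up
```

On [2.7, 7.5] this returns the global minimum value of f to within the tolerance; partitions shrink until the underestimator gap falls below ε, matching the finite-consistency requirement above.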
Land and Doig [75] in 1960 introduced the first branch-and-bound algorithm for
pure integer linear programs. Dakin [43] and Driebeek [47] extended it to mixed-integer
linear programming problems. Since then, branch-and-bound has become a general
solution method in MILP that has been successfully implemented in commercial software
such as CPLEX [40]. In MILP problems, branch-and-bound proceeds by recursively
solving LP relaxations of the problem (see Section 1.2.1). Since LP relaxations can be
weak, new linear inequalities derived from the problem structure are typically added
to cut off fractional solutions. These additional valid inequalities are called cuts or
cutting planes. The use of cuts is known to be one of the most important ingredients in
the efficient solution of MILP with branch-and-bound. The addition of cuts inside the
branch-and-bound framework yields a family of methods called branch-and-cut ; see Martin
[84].
Falk and Soland [51] introduced nonlinear branch-and-bound for continuous
global optimization. For factorable nonconvex problems, McCormick [85] proposed
a convexification scheme for factorable problems under the assumption that tight
convex and concave envelopes are known for the underlying univariate functions.
Ryoo and Sahinidis [103] introduced a branch-and-reduce algorithm that uses domain
reduction techniques during the process. Androulakis et al. [5] developed the αBB
branch-and-bound method, which applies to twice-differentiable functions. Tawarmalani
and Sahinidis [122] introduced the idea of building and solving polyhedral-based
relaxations in branch-and-bound for global optimization and Tawarmalani and Sahinidis
[123] implemented this idea. Currently, nonlinear branch-and-bound methodologies have
been implemented in various global optimization software; see Adjiman et al. [3], Sahinidis
and Tawarmalani [105], LINDO Systems Inc. [80], and Belotti et al. [26].
Branch-and-cut is not a specific algorithm but a general framework since it relies
on four main components that can be adapted. These four components are: bounding
that obtains lower and upper bounds on the optimal value of relaxations, branching
that divides a problem into smaller subproblems, cutting that adds valid inequalities
to formulations, and domain reduction, also known as bound tightening, that reduces
the search region. A key component in the success of branch-and-cut algorithm is the
quality of bounds obtained from the relaxation. To obtain better bounds, it is necessary to
develop tighter convexifications. This is the ultimate goal of this thesis as we will discuss
more in Chapters 2 and 3. Next, we describe in more detail the branch-and-cut framework
to illustrate the setting in which our results are applied; see Figure 1-1. We discuss each of
its components in the following sections.
1.3.1 Bounding Scheme
In every branch-and-bound node, both lower and upper bounds on the optimal value
are computed and/or updated. Upper bounds are obtained from feasible solutions that are
found using some upper bounding procedures or heuristic algorithms. Lower bounds are
computed through the solution of a convexification of the problem.
1.3.2 Branching Scheme
For MILPs, dividing feasible regions into subproblems is simple. Assuming an LP
relaxation has been solved and the optimal solution x∗ is fractional, we can choose any
integer variable xi whose optimal value x∗i is fractional and then create two subproblems:
one obtained by adding the constraint xi ≤ bx∗i c and the other obtained by adding the
constraint xi ≥ bx∗i c+1. This scheme, called dichotomy branching, ensures that the current
LP solution does not survive in any of the subsequent convexifications of the subproblems
and therefore ensures that the branch-and-bound search progresses. We mention that
other branching schemes such as GUB branching or constraint branching can be used.
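The dichotomy rule can be sketched as follows, representing a subproblem only by its variable bounds (the data structure is an illustrative choice):

```python
import math

def dichotomy_branch(bounds, i, xi_star):
    # split on variable i at the fractional value xi_star: one child gets
    # x_i <= floor(xi_star), the other x_i >= floor(xi_star) + 1, so the
    # current fractional LP solution is infeasible in both children
    lo, hi = bounds[i]
    down, up = dict(bounds), dict(bounds)
    down[i] = (lo, math.floor(xi_star))
    up[i] = (math.floor(xi_star) + 1, hi)
    return down, up
```

For example, branching on x_0 = 3.5 with bounds [0, 10] yields children with bounds [0, 3] and [4, 10].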
Input: Problem (P) and set of integer variables I
Output: An optimal solution x* with optimal value z*

Initialization: L ← {P}, x* ← ∅, z* ← +∞;
while L ≠ ∅ do
    Check termination criteria and update the list L:
        if z_Pi ≥ z* for some Pi ∈ L then L ← L \ {Pi};
    Node Selection: select Pi ∈ L and let L ← L \ {Pi};
    Domain Reduction: tighten the bounds on the variables of Pi;
    Construct a convex relaxation RPi of Pi;
    while Cut Generation needs to be performed do
        Obtain z_RPi and x_RPi by solving RPi;
        Pruning: by infeasibility; by bounds if z_RPi ≥ z*; by global feasibility;
        Cut Generation: if there exists a violated cut then add it to the formulation;
    end
    Primal Heuristics;
    Branching: choose a variable x_j and a branching point x_j^b;
        create subproblems Pi− and Pi+; L ← L ∪ {Pi−, Pi+};
end
Figure 1-1. Branch-and-Cut framework
Observe that even when only applying dichotomy branching, algorithmic decisions must
be made about the selection of both the branching variable (i.e., which fractional variables
will be branched on) and the branching point; see Achterberg et al. [1], Linderoth and
Savelsbergh [79]. In MILP, while the latter is straightforward, the former is not and
different strategies might result in dramatically different trees.
Similar approaches can be used in MINLP. In the selection of branching variables,
integer variables typically take priority over continuous variables. Hence, if there are
integer variables with fractional values, then one of these variables is selected first for
branching. To select among several integer variables, standard MILP techniques are
used. Note that it could happen that x*_i is integral for all i ∈ I, but x* is not
feasible for the other relaxed constraints. Hence, a measure of infeasibility for solutions
is introduced in MINLP. To select a branching variable among continuous variables,
Tawarmalani and Sahinidis [122] propose to use violation transfer and Belotti et al. [26]
extend the reliability branching used in MILP. After the selection of branching variables,
the branching point can be chosen using several rules such as bisection rule, ω rule, or
other variants [103, 109, 116]. For bilinear programs, an alternative selection rule for the
branching point is provided in [116].
1.3.3 Cutting Scheme
Since the initial relaxation created at the root node is typically weak, it is important
to improve it by adding strong inequalities. In MILP, this can be done through the
addition of cutting planes that separate a fractional solution from the feasible region. We
will discuss strong valid inequalities for MILPs and will describe two well-known tools to
generate them in Section 2.2. Similarly, the performance of the branch-and-bound search
in MINLP can be improved if relaxations are tightened using strong inequalities. While
cuts must be linear inequalities in MILP, convex constraints can also be used in MINLP as
long as they are valid and improve bounds; see Tawarmalani and Sahinidis [123].
1.3.4 Domain Reduction
Domain reduction for a variable x is the process of reducing the interval [xl, xu]
where x is considered while guaranteeing that an optimal solution is not cut off. As the
search space is reduced through this procedure, relaxations obtained typically become
stronger. One such procedure is the optimality-based range deduction that uses the current
linearization to improve the bounds on variables; see Shectman and Sahinidis [109] and
Zamora and Grossmann [137]. It is typically used for auxiliary variables introduced in the
reformulation phase and applied only at the root node or up to a limited depth. On the
other hand, feasibility-based range deduction similar to interval propagation in Constraint
Programming is performed at all nodes of the tree; see Shectman and Sahinidis [109].
Domain reduction has also been widely used in MILP; see Savelsbergh [106]. Belotti et al.
[26] developed aggressive bounds tightening which is similar to probing techniques in MILP
[106, 120]. Reduced-cost bounds tightening, introduced for solving MILP problems [91], has
also been extended to MINLP by Ryoo and Sahinidis [103].
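Feasibility-based range reduction on a single constraint ∑_j a_j x_j ≤ b can be sketched as a minimal version of interval propagation (the helper is illustrative, not the implementation of any solver cited above):

```python
def tighten(a, b, lb, ub):
    # feasibility-based range reduction for one constraint sum_j a_j x_j <= b:
    # each term's smallest possible contribution is min(a_j*l_j, a_j*u_j);
    # the leftover budget bounds the remaining variable
    mins = [min(aj * lj, aj * uj) for aj, lj, uj in zip(a, lb, ub)]
    total = sum(mins)
    lb, ub = list(lb), list(ub)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        slack = b - (total - mins[i])   # budget left for the term a_i x_i
        if ai > 0:
            ub[i] = min(ub[i], slack / ai)
        else:
            lb[i] = max(lb[i], slack / ai)
    return lb, ub
```

For instance, propagating 2x + 3y ≤ 6 with x, y ∈ [0, 10] tightens the bounds to x ≤ 3 and y ≤ 2.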
1.4 Outline of the Dissertation
In this thesis, we introduce new tools to improve the convexifications used in MINLP.
In particular, we study nonlinear sets that appear as relaxations of MINLP problems. The
overall structure of the thesis is as follows.
In Chapter 2, we give an overview of techniques that are used in integer programming
and global optimization to produce convexifications of nonconvex sets. We focus on
factorable relaxation techniques since they are most related to our work. We also describe
how to generate strong cutting planes for general MILP problems using disjunctive
programming and lifting techniques in Sections 2.2.1 and 2.2.2.
In Chapter 3, we motivate the problems that are addressed in this thesis. Then, we
provide formal problem statements for the following chapters.
In Chapter 4, we propose a convexification tool that constructs the convex hulls of
orthogonal disjunctive sets using convex extensions and disjunctive programming; see
Chapter 2 for an introduction to these techniques. We discuss the technical assumptions
under which this convexification tool can be used. In particular, we provide sufficient
conditions for establishing the convex extension property. The convexification tool is
then applied to obtain explicit convex hulls of various bilinear covering sets over the
nonnegative orthant. It is, in general, widely applicable to problems where variables do
not have upper bounds.
In Chapter 5, we study 0−1 mixed-integer bilinear covering sets to investigate
how bounds on the variables affect the derivation of cuts. We derive large families of
facet-defining inequalities via sequence-independent lifting techniques; see Chapter 2 for
an introduction to lifting techniques. We show that these sets have polyhedral structures
that are similar to those of certain single-node flow sets. In particular, we prove that the
facet-defining inequalities we develop generalize well-known lifted flow cover inequalities
from the integer programming literature.
In Chapter 6, we present a computational study that evaluates the strength of
lifted inequalities derived in Chapter 5. We first generalize the lifted inequalities of
Chapter 5 to a more general form of bilinear covering sets that include linear terms on
variables. This extension is necessary to account for the linear terms introduced during the
branch-and-bound process. We discuss implementations details and experimental results.
In Chapter 7, we summarize the main results of this thesis and conclude with
directions for future research.
CHAPTER 2
CONVEX RELAXATIONS IN MILP AND MINLP
In this chapter, we describe methods to generate convex relaxations of MILPs and
MINLPs, focusing on the techniques that are most related to our work. In Section 2.1,
we describe how to build convex relaxations of nonconvex MINLP problems. Then, in
Section 2.2, we give an overview of how disjunctive programming and lifting techniques
can be used to generate improved formulations of MILPs. The tools described in
Sections 2.1 and 2.2 will be used in Chapters 4, 5, and 6.
2.1 Convexification Methods in MINLP
Constructing strong convex relaxations of nonconvex problems is a central problem in
developing branch-and-cut frameworks for nonconvex MINLPs. In this section, we describe
general convexification methods that are used in commercial global optimization solvers.
Note that, given a nonconvex problem of the form

    min  f(x)
    s.t. g_i(x) ≤ 0  ∀i ∈ M,

a simple convex relaxation can be obtained by relaxing each inequality into a convex
constraint and replacing f with a convex underestimator. In particular, if g_i^u(x) is a convex
underestimator of g_i(x) for each i ∈ M and f^u(x) is a convex underestimator of f(x), the
relaxation

    min  f^u(x)
    s.t. g_i^u(x) ≤ 0  ∀i ∈ M

is a convex optimization problem. Among convex underestimators, convex envelopes are the
strongest. Therefore, the ability to construct convex envelopes of nonlinear functions is
an essential ingredient in the derivation of strong convexifications of MINLPs.
2.1.1 Convex Envelopes and Convex Extensions
In the global optimization literature, convex envelopes have been developed for special
classes of functions over special polytopes. For detailed discussions, we refer the interested
reader to the books of Horst and Tuy [69] and Tawarmalani and Sahinidis [121].
First, we describe how convex envelopes of sums of functions can be obtained from
the sum of convex envelopes of the individual functions.
Theorem 2.1 (Al-Khayyal and Falk [4]). Let Q = ∏_{j=1}^r Q_j be the cartesian product of r
compact n_j-dimensional rectangles Q_j for j = 1, . . . , r satisfying ∑_{j=1}^r n_j = n. Assume that
f : Q → R is of the form f(x) = ∑_{j=1}^r f_j(x_j), where f_j : Q_j → R is lower semi-continuous
on Q_j for j = 1, . . . , r. Then, the convex envelope of f on Q is obtained as the sum of the
convex envelopes of f_j on Q_j, i.e.,

    convenv f(x) = ∑_{j=1}^r convenv f_j(x_j).
Next, we present two fundamental results developed by Falk and Hoffman [50] and
Horst [66].
Theorem 2.2. Let Q be a polytope with vertices v_1, . . . , v_k. Let f : Q → R be a concave
function on Q. Then, the convex envelope convenv(f) of f can be computed as

    convenv f(x) = min_λ { ∑_{j=1}^k λ_j f(v_j) : ∑_{j=1}^k λ_j v_j = x, ∑_{j=1}^k λ_j = 1,
                           λ_j ≥ 0, j = 1, . . . , k }.
The following result immediately follows.
Theorem 2.3. Let Q be an n-simplex generated by the vertices v_0, v_1, . . . , v_n, and let
f : Q → R be a concave function on Q. Then, the convex envelope of f is the affine
function

    φ(x) = a^T x + b,  where a ∈ R^n, b ∈ R,

that is uniquely determined by the system of linear equations

    f(v_i) = a^T v_i + b,  for i = 0, 1, . . . , n.
It follows from Theorem 2.3 that it is especially easy to construct the convex
envelopes of univariate concave functions f : R 7→ R over an interval [l, u]. This is
because the graph of the convex envelope is simply the line segment connecting the points
(l, f(l)) and (u, f(u)).
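For a univariate concave f on [l, u], this chord can be written down directly; the small helper below is added for illustration:

```python
def secant_envelope(f, l, u):
    # convex envelope of a concave function on [l, u]: the chord through
    # (l, f(l)) and (u, f(u)); it matches f at the endpoints and lies
    # below f everywhere in between
    fl, fu = f(l), f(u)
    slope = (fu - fl) / (u - l)
    return lambda x: fl + slope * (x - l)
```

For example, for the concave function f(x) = −(x − 1)^2 on [0, 3], the envelope is the line through (0, −1) and (3, −4), which underestimates f on the whole interval.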
Among the set of all multivariate functions, multilinear functions are of particular
importance as we will see in Section 2.1.2. Convex envelopes of multilinear functions were
studied by Crama [41] and Rikun [101]. We next give a formal definition of a multilinear
function.
Definition 2.1 (Multilinear). A function f(x_1, . . . , x_k) is said to be multilinear if, for each
i = 1, . . . , k, f(x_1, . . . , x_i, . . . , x_k) is a linear function of the vector x_i when the components
of the other k − 1 vectors are fixed, i.e., x_j = x̄_j for j ≠ i.
Rikun [101] studied multilinear functions f(x) of x = (x_1, . . . , x_k) defined on the
cartesian product of polytopes, where x ∈ Q = ∏_{j=1}^k Q_j and x_j ∈ Q_j ⊂ R^{n_j} for j = 1, . . . , k.

Definition 2.2 (Associated Affine Function). Let f(x) be a multilinear function defined
on ∏_{j=1}^k R^{n_j}. For the function f(x) and any given point ξ = (ξ_1, . . . , ξ_k), where ξ_j ∈ R^{n_j} for
j = 1, . . . , k, the associated affine function f_ξ(x) is defined as:

    f_ξ(x) = ∑_{j=1}^k f(ξ_1, . . . , ξ_{j−1}, x_j, ξ_{j+1}, . . . , ξ_k) − (k − 1) f(ξ).    (2–1)
Rikun [101] showed that the convex envelope of a multilinear function over the
cartesian product of polytopes is polyhedral.
Theorem 2.4 (Rikun [101]). Let f : Q → R be a multilinear function defined
on the cartesian product of polytopes Q = ∏_{j=1}^k Q_j, where x_j ∈ Q_j ⊂ R^{n_j} for j = 1, . . . , k.
Let ξ = (ξ_1, . . . , ξ_k) be a vertex of Q, i.e., ξ_j ∈ vert(Q_j), and let the associated affine function
(2–1) satisfy

    f_ξ(x) ≤ f(x),  ∀x ∈ vert(Q).

Then, the affine function f_ξ(x) is an element of the convex envelope of f(x).
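Definition 2.2 and the vertex condition of Theorem 2.4 are easy to check numerically for small examples. The sketch below (helper names are illustrative) builds f_ξ for a multilinear f with scalar blocks and verifies the vertex condition for the bilinear term x1·x2 on the unit box:

```python
from itertools import product

def associated_affine(f, xi):
    # f_xi(x) = sum_j f(xi_1,...,xi_{j-1}, x_j, xi_{j+1},...,xi_k) - (k-1) f(xi)
    k = len(xi)
    def f_xi(x):
        total = sum(f(*(xi[:j] + (x[j],) + xi[j + 1:])) for j in range(k))
        return total - (k - 1) * f(*xi)
    return f_xi

# bilinear term on [0,1]^2: at the vertex xi = (1, 1), f_xi(x) = x1 + x2 - 1,
# which underestimates x1*x2 on every vertex of the box
bilinear = lambda x1, x2: x1 * x2
f_xi = associated_affine(bilinear, (1.0, 1.0))
```

Since f_ξ here is dominated by f on vert([0, 1]^2), Theorem 2.4 says x1 + x2 − 1 is one piece of the (polyhedral) convex envelope of x1·x2.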
To facilitate the construction of convex envelopes of nonconvex functions, Tawarmalani
and Sahinidis [120] introduced the notion of convex extensions. This notion generalizes a
similar concept introduced by Crama [41].
Definition 2.3 (Convex Extensions). Let S be a convex set and X ⊆ S. A convex
extension of a function φ : X → R over S is defined as a convex function ψ : S → R such
that φ(x) = ψ(x) for all x ∈ X.
Note that convex extensions are neither always constructible nor unique. The
following result describes conditions under which a convex extension can be constructed.
Theorem 2.5 (Tawarmalani and Sahinidis [120]). A convex extension of a function
φ : X → R over a convex set S ⊇ X can be constructed if and only if

    φ(x) ≤ min { ∑_{j=1}^n λ_j φ(x^j) : ∑_{j=1}^n λ_j x^j = x, ∑_{j=1}^n λ_j = 1,
                 x^j ∈ X and λ_j ∈ [0, 1] for all j = 1, . . . , n }

for all x ∈ X.
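For a one-dimensional finite ground set X, the condition of Theorem 2.5 reduces to every point of the graph of φ lying on its lower convex hull, which the following illustrative sketch checks by comparing each point with the chord of its neighbors:

```python
def admits_convex_extension(points):
    # points: list of (x, phi(x)) pairs with distinct x-values; checks the
    # condition of Theorem 2.5 for a 1-D finite ground set: phi must not
    # exceed any convex combination of its other values, i.e. every point
    # must lie on the lower convex hull of the graph (slopes nondecreasing)
    pts = sorted(points)
    for i in range(1, len(pts) - 1):
        (x0, y0), (x1, y1), (x2, y2) = pts[i - 1], pts[i], pts[i + 1]
        lam = (x1 - x0) / (x2 - x0)
        if y1 > (1 - lam) * y0 + lam * y2 + 1e-12:
            return False
    return True
```

For example, the values of x^2 on {0, 1, 2, 3} admit a convex extension, while the values (0, 0), (1, 2), (2, 3) do not, since the middle point lies above the chord of its neighbors.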
Note that for complicated functions, finding convex envelopes might be difficult.
Next, we describe a general scheme that produces convex relaxations of factorable
functions.
2.1.2 Reformulation and Relaxation
Convexifications are often obtained in two steps: reformulation and relaxation. The
first step converts the original problem into an equivalent formulation that is easier to
study; the second step constructs a convex relaxation by relaxing nonconvex terms in the
reformulated problem.
First, we describe a general reformulation scheme for functions that are factorable; see
Definition 1.1. In fact, factorable functions can be reformulated by introducing auxiliary
variables using the recursive algorithms presented in Tawarmalani and Sahinidis [121].
To illustrate the idea, consider a factorable function f(x) given as the following sum of
products of univariate functions, i.e.,

    f(x) = ∑_{j=1}^2 ∏_{k=1}^2 h_jk(x).

In this case, we can reformulate f(x) by introducing auxiliary variables y_j to represent
each term of the summation and auxiliary variables y_jk to represent the factors of the
products respectively, i.e.,

    f(x) = y,
    y = ∑_{j=1}^2 y_j,                              (2–2)
    y_j = ∏_{k=1}^2 y_jk,   ∀j = 1, 2,              (2–3)
    y_jk = h_jk(x),         ∀j = 1, 2, ∀k = 1, 2.   (2–4)
Note that this reformulation lifts the original problem into a higher dimensional
space by introducing auxiliary variables. After the reformulation phase, we observe that
relaxation schemes are only needed for sums and products of two variables, appearing in
(2–2) and (2–3) respectively, as well as for univariate functions appearing in (2–4). For
all of the terms, convex relaxations can be constructed using factorable programming
techniques rooted in the work of McCormick [85].
Definition 2.4 (McCormick Relaxations [89]). The relaxations of a factorable function
that are formed via recursive application of rules for the relaxation of univariate composi-
tion, binary multiplication, and binary addition from convex and concave relaxations of the
univariate intrinsic functions, without the introduction of auxiliary variables, are said to be
McCormick Relaxations.
Since the sum of convex functions is convex, convex relaxations for the sum of two
functions can be easily constructed as follows.
Theorem 2.6 (Relaxation of Sums [89]). Let S ⊆ R^n be a nonempty convex set,
and let g, g_1, g_2 : S → R be such that g(x) = g_1(x) + g_2(x). Let g_1^u, g_1^o : S → R be a
convex underestimator and a concave overestimator of g_1 on S, respectively. Similarly, let
g_2^u, g_2^o : S → R be a convex underestimator and a concave overestimator of g_2 on S,
respectively. Then, g^u, g^o : S → R, defined as

    g^u(x) = g_1^u(x) + g_2^u(x),   g^o(x) = g_1^o(x) + g_2^o(x),

are a convex and a concave relaxation of g(x) on S, respectively.
However, relaxing products of two functions is not as straightforward, as
shown in the following result, which follows from the convex and concave
envelopes of a bilinear function developed by McCormick [85].

Theorem 2.7 (Relaxation of Products [89]). Let S ⊆ R^n be a nonempty convex set,
and let g, g_1, g_2 : S → R be such that g(x) = g_1(x) g_2(x). Let g_1^u, g_1^o : S → R be a convex
underestimator and a concave overestimator of g_1 on S, respectively. Similarly, let g_2^u, g_2^o :
S → R be a convex underestimator and a concave overestimator of g_2 on S, respectively.
Furthermore, let g_1^L, g_1^U, g_2^L, g_2^U ∈ R be such that

    g_1^L ≤ g_1(x) ≤ g_1^U  ∀x ∈ S   and   g_2^L ≤ g_2(x) ≤ g_2^U  ∀x ∈ S.

Consider the following intermediate functions α_1, α_2, β_1, β_2, γ_1, γ_2, δ_1, δ_2 : S → R:

    α_1(x) = min{g_2^L g_1^u(x), g_2^L g_1^o(x)},   α_2(x) = min{g_1^L g_2^u(x), g_1^L g_2^o(x)},
    β_1(x) = min{g_2^U g_1^u(x), g_2^U g_1^o(x)},   β_2(x) = min{g_1^U g_2^u(x), g_1^U g_2^o(x)},
    γ_1(x) = max{g_2^L g_1^u(x), g_2^L g_1^o(x)},   γ_2(x) = max{g_1^U g_2^u(x), g_1^U g_2^o(x)},
    δ_1(x) = max{g_2^U g_1^u(x), g_2^U g_1^o(x)},   δ_2(x) = max{g_1^L g_2^u(x), g_1^L g_2^o(x)}.

Then, α_1, α_2, β_1, and β_2 are convex on S, while γ_1, γ_2, δ_1, and δ_2 are concave on S.
Moreover, g^u, g^o : S → R, defined as

    g^u(x) = max{α_1(x) + α_2(x) − g_1^L g_2^L, β_1(x) + β_2(x) − g_1^U g_2^U},
    g^o(x) = min{γ_1(x) + γ_2(x) − g_1^U g_2^L, δ_1(x) + δ_2(x) − g_1^L g_2^U},

are convex and concave relaxations of g on S, respectively.
Al-Khayyal and Falk [4] proved that the McCormick relaxation constructs the convex and
concave envelopes of bilinear terms, as follows.

Theorem 2.8 (Al-Khayyal and Falk [4]). Consider a bilinear term y_i y_j over the hypercube
H_2 := [y_i^l, y_i^u] × [y_j^l, y_j^u]. Then,

    convenv(y_i y_j) = max{y_i^l y_j + y_j^l y_i − y_i^l y_j^l, y_i^u y_j + y_j^u y_i − y_i^u y_j^u}

and

    concenv(y_i y_j) = min{y_i^u y_j + y_j^l y_i − y_i^u y_j^l, y_i^l y_j + y_j^u y_i − y_i^l y_j^u}.
McCormick [86] showed that a tight relaxation of a composition of functions
h(g(x)) can be built using convex and concave envelopes as the underestimators and
overestimators of h(yg). Relaxation methods for multilinear functions over a hypercube
have been proposed by Rikun [101] and Ryoo and Sahinidis [104]. Different relaxation
schemes for the fractional functions are developed by Tawarmalani and Sahinidis [119]
and Tawarmalani et al. [114, 115]. For detailed specification of recursive reformulation
algorithms, we refer the interested reader to the book of Tawarmalani and Sahinidis [121].
Assuming that all variables are bounded, a univariate convex function f(x_j) with
x_j ∈ [x_j^l, x_j^u] is overestimated by the line connecting the points (x_j^l, f(x_j^l)) and
(x_j^u, f(x_j^u)), while f(x_j) is underestimated by the function itself. Hence, a convex
outer-approximation of any convex function can be constructed by combining these estimators.

If a univariate function f(x_j) is convex and differentiable over x_j ∈ [x_j^l, x_j^u], then for
any x̄ ∈ [x_j^l, x_j^u], a valid linear inequality can be obtained using the gradient. For a given
gradient ∂f/∂x_j(x̄) of f(x_j) at x̄, the gradient inequality

    y ≥ f(x̄) + ∂f/∂x_j(x̄) (x_j − x̄),    (2–5)

is valid for all x_j ∈ [x_j^l, x_j^u]. Therefore, we can build linear relaxations using outer-
approximations of differentiable univariate functions such as exp(x), log(x), sin(x), and
cos(x).
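A polyhedral underestimator built from gradient inequalities (2–5) at a few points can be sketched as follows for exp on [0, 2] (the tangent points are illustrative choices):

```python
import math

def gradient_cut(f, fprime, xbar):
    # tangent inequality (2-5): y >= f(xbar) + f'(xbar) * (x - xbar),
    # valid for a convex differentiable f
    return lambda x: f(xbar) + fprime(xbar) * (x - xbar)

# outer-approximate exp on [0, 2] with tangents at three points
# (exp is its own derivative)
cuts = [gradient_cut(math.exp, math.exp, t) for t in (0.0, 1.0, 2.0)]

def poly_under(x):
    # pointwise maximum of the tangent cuts: a piecewise-linear underestimator
    return max(cut(x) for cut in cuts)
```

The maximum of the cuts lies below exp on the whole interval and touches it at each tangent point; adding more tangent points tightens the linear relaxation.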
2.2 Cutting Plane Techniques for Mixed-Integer Linear Program (MILP)
For MILPs, we mentioned in Section 1.2.2 that LP relaxations are often used as
convexifications. In this section, we discuss techniques to improve LP relaxations of
MILPs. We consider mixed-integer linear programs of the form
(MILP )min cTx
s.t. x ∈ S
where I ⊆ {1, . . . , n} and
S :={x ∈ Z|I|
+ × Rn−|I|+
∣∣∣ Ax ≤ b}.
We first present a basic result about the convex hull of S.
Theorem 2.9 (Meyer [87]). The convex hull of S, where A ∈ Qm×n and b ∈ Qm, is a
polyhedron whose extreme points lie in S.
This result together with Theorem 1.2 implies that every MILP problem can be
reformulated as a linear program, provided that A and b are rational. This is particularly
interesting since LPs can be solved efficiently as we mentioned in Section 1.2.1. While the
linear program
    min{c^T x | x ∈ conv(S)}
always has an optimal solution that is optimal for (MILP ), it is typically difficult to
obtain a full linear description of conv(S). Nevertheless, we are interested in finding
partial descriptions of conv(S). Studying the polyhedron conv(S) requires a good
understanding of which inequalities a^i x ≤ b_i are most important in the description of
conv(S). This motivates the introduction of the following definitions.
Definition 2.5 (Valid Inequality). Let X ⊆ R^n. The inequality α^T x ≤ δ is said to be valid
for X if it is satisfied by all points of X, i.e., α^T x ≤ δ ∀x ∈ X.

Definition 2.6 (Face). If α^T x ≤ δ is a valid inequality for a polyhedron Q, then

    F = Q ∩ {x ∈ R^n | α^T x = δ}

is said to be a face of Q. We also say that α^T x ≤ δ represents or defines the face F.
In order for an inequality to be helpful in the description of a polyhedron, the face
it defines should be large. To measure the dimension of a polyhedron, we introduce the
following definitions.
Definition 2.7 (Affine Independence). Vectors x^1, . . . , x^k in R^n are said to be affinely
independent if the unique solution to the system ∑_{j=1}^k λ_j x^j = 0, ∑_{j=1}^k λ_j = 0 is λ_j = 0 for
all j = 1, . . . , k.
Definition 2.8 (Dimension). A polyhedron Q has dimension d, which we denote by
dim(Q) = d, if the maximum number of affinely independent points in Q is d+ 1.
Definition 2.9 (Facet). A face F of a polyhedron Q is said to be a facet of Q if dim(F ) =
dim(Q) − 1. A valid inequality α^T x ≤ δ that induces a facet of Q is called a facet-defining
inequality for Q, or a facet for short.
We mention that among all inequalities in the description of a full-dimensional
polyhedron, only those that define facets are necessary. We refer the interested reader to
Nemhauser and Wolsey [91] for a detailed exposition.
Proposition 2.1. Let Q be a full-dimensional polyhedron defined by (a^i)^T x ≤ b_i for
i ∈ M. Let M_F be the subset of M containing the indices of the facet-defining inequalities for
Q. Then,

    Q = {x ∈ R^n | (a^i)^T x ≤ b_i ∀i ∈ M_F}.
Therefore, when studying conv(S), it is sufficient to consider inequalities that are
facet-defining.
We will describe in Sections 2.2.1 and 2.2.2 techniques to construct valid and
facet-defining inequalities for MILPs. We note that, in practice, the question is not
only how to generate inequalities but also how to use them. In fact, the linear description
of conv(S) can have exponentially many inequalities. It is therefore typically impractical
to solve the corresponding linear programs. In order to overcome this difficulty, cutting
plane methods are typically used. The first cutting plane algorithm for solving MILPs was
described in 1958 by Gomory [57] for the case where |I| = n. This algorithm generalized
the more dedicated polyhedral approach devised by Dantzig et al. [45] for the Traveling
Salesman Problem.
In cutting plane algorithms, we solve a sequence of linear programs that differ from
each other by the addition of one or more valid inequalities. More precisely, we first solve
the LP relaxation of (MILP ) to global optimality. The corresponding optimal solution x0
is typically fractional since the LP relaxation does not impose integrality on the variables.
We obtain a tightened formulation by adding inequalities to the LP relaxation.
Definition 2.10 (Cutting Plane). An inequality α^T x ≤ δ, where α ∈ R^n and δ ∈ R, is
said to be a cutting plane for (MILP), or cut for short, if it is valid for S and there exists a
solution x^0 of the LP relaxation such that α^T x^0 > δ.
It is clear that, for a cut αTx ≤ δ to improve the current LP relaxation of an MILP,
it must cut x0 off, i.e., αTx0 > δ. Given a fractional solution x0 of the LP relaxation,
the problem of finding such a violated cut is known as the separation problem. It is typically
difficult to solve separation problems exactly since separation was shown to be as hard as
optimization; see Grötschel et al. [59]. Note that the proof relies on the ellipsoid algorithm
described earlier in Section 1.2.1. As a result, heuristics are often used for separation. If a
cut is found, it is added and the process is iterated. Otherwise, the process is terminated.
The basic structure of cutting plane algorithms is described in Figure 2-1. For detailed
textbook descriptions, we refer the interested readers to Nemhauser and Wolsey [91] and
Schrijver [108].
Input: Problem (MILP)
Output: An optimal solution x*

Initialization: i = 0, Q_0 ← LP relaxation of (MILP);
Obtain x^0 by solving the LP min{c^T x | x ∈ Q_0};
while x^i is fractional do
    Separation;
    if there exists a cutting plane α_i^T x ≤ δ_i separating x^i from S then
        Q_{i+1} ← Q_i ∩ {x | α_i^T x ≤ δ_i};
        i ← i + 1;
    else
        terminate
    end
    Obtain x^i by solving the linear program min{c^T x | x ∈ Q_i};
end
x* ← x^i;

Figure 2-1. Cutting plane algorithm
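As a concrete example of a cut family for the pure-integer case (|I| = n) treated by Gomory [57], Chvátal-Gomory rounding takes a nonnegative combination of valid inequalities for x ≥ 0 integer and rounds down both the coefficients and the right-hand side. The sketch below (the data representation is an illustrative choice) derives x_1 + x_2 ≤ 1 from 2x_1 + 2x_2 ≤ 3 with multiplier 1/2, a cut violated by the fractional point (3/4, 3/4):

```python
import math
from fractions import Fraction

def cg_cut(rows, multipliers):
    # rows: list of (a, b) pairs encoding valid inequalities a^T x <= b over
    # nonnegative integer x; multipliers: u >= 0. Returns the rounded
    # inequality floor(u^T A) x <= floor(u^T b), valid for all integer points
    # since integer x >= 0 keeps floor(u^T A) x <= u^T A x <= u^T b.
    n = len(rows[0][0])
    a = [sum(Fraction(u) * Fraction(row[0][j])
             for u, row in zip(multipliers, rows)) for j in range(n)]
    b = sum(Fraction(u) * Fraction(row[1]) for u, row in zip(multipliers, rows))
    return [math.floor(aj) for aj in a], math.floor(b)
```

Exact rational arithmetic avoids the rounding pitfalls that floating-point coefficients would introduce when taking floors.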
Although the algorithm of Figure 2-1 can terminate without finding an integer
optimal solution for (MILP ), the formulation Qi obtained after the addition of cuts
provides a strengthened formulation for which branch-and-bound is likely to be more
efficient. In practice, there are many tradeoffs to consider between the running time of a
separation procedure and the quality of the cutting planes it produces when designing a
cutting plane algorithm.
2.2.1 Disjunctive Programming
In this section, we give an overview of disjunctive programming techniques and how
they can be used to generate strong cuts for MILPs. Disjunctive programming can be
succinctly described as the study of optimization problems defined over unions of sets,
typically polyhedra. Even when the sets are convex, their union is typically not. One of
the main focuses of disjunctive programming is to study the convex hull of such unions.
The foundations of disjunctive programming were laid by Balas in a technical report in
1974. This report was published 24 years later in Balas [17].
Disjunctive programming is directly applicable to MILPs since fixing integer variables
to all values they can take transforms these MILPs into disjunctive programs. As a
result, disjunctive programming techniques have been used to derive strong relaxations
and cutting planes for various problems; see Balas [15, 16]. In particular, Balas et al.
[20] implemented disjunctive programming techniques for mixed 0−1 programs in a
branch-and-cut framework. They specialize generic disjunctive programming techniques to
show how to generate lift-and-project cuts through the solution of a cut generation linear
program (CGLP), and develop strengthened disjunctive cuts.
Stubbs and Mehrotra [113] generalized the disjunctive programming techniques of
Balas et al. [20] to 0−1 mixed convex programming problems inside of a branch-and-cut
framework. Ceria and Soares [31] also provided algebraic representations and solution
procedures for disjunctive convex programming.
Next, we describe some important results in disjunctive programming. We limit our
presentation to unions of polyhedra. We first review the basic concept of projection that
will be used to relate convex hulls of sets in the space of their original variables and their
higher dimensional representations obtained by disjunctive programming. We refer to
Balas [18] and Cornuejols [37] for more detailed discussions.
Definition 2.11 (Projection). Given a polyhedron Q ⊆ R^n × R^r, the projection of Q onto
the subspace of R^n defined by the x variables is defined as

    proj_x Q = {x ∈ R^n | (x, y) ∈ Q for some y ∈ R^r}.
The projection of a polyhedron Q can be obtained using Fourier-Motzkin Elimination;
see Fourier [54]. This method recursively eliminates the variables yi one at a time, as
presented in the following proposition.
Proposition 2.2 (Fourier-Motzkin Elimination). Given a polyhedron

    Q = {(x, y) ∈ R^n × R | ∑_{j=1}^n a_ij x_j + b_i y ≤ d_i, ∀i = 1, . . . , m},

the projection of Q onto x satisfies

    proj_x Q = {x ∈ R^n | (d_k − ∑_{j=1}^n a_kj x_j)/b_k ≤ (d_l − ∑_{j=1}^n a_lj x_j)/b_l,
                          ∀k ∈ M−, ∀l ∈ M+,
                          ∑_{j=1}^n a_ij x_j ≤ d_i, ∀i ∈ M_0},

where M+ = {i ∈ M | b_i > 0}, M− = {i ∈ M | b_i < 0}, and M_0 = {i ∈ M | b_i = 0}.

The projection can also be obtained using the concept of a projection cone, as described
below.
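Proposition 2.2 is mechanical enough to code directly: every lower bound on y (a row with b_i < 0) is paired with every upper bound (b_i > 0), and cross-multiplying avoids divisions. The sketch below is our own illustration, not part of the dissertation; it performs one elimination step with exact rational arithmetic:

```python
from fractions import Fraction

def fm_eliminate(rows):
    """One step of Fourier-Motzkin elimination.
    Each row (a, b, d) encodes sum_j a[j]*x[j] + b*y <= d.
    Returns rows (a', d') describing the projection onto x."""
    pos = [r for r in rows if r[1] > 0]             # upper bounds on y (l in M+)
    neg = [r for r in rows if r[1] < 0]             # lower bounds on y (k in M-)
    out = [(a, d) for (a, b, d) in rows if b == 0]  # rows in M0 survive as-is
    for (ak, bk, dk) in neg:
        for (al, bl, dl) in pos:
            # b_l*(a_k.x) - b_k*(a_l.x) <= b_l*d_k - b_k*d_l  (y cancels)
            a = tuple(bl * ak[j] - bk * al[j] for j in range(len(ak)))
            out.append((a, bl * dk - bk * dl))
    return out

# project Q = {(x, y) : -x + y <= 0, -y <= 0, x + y <= 4} onto x
rows = [((Fraction(-1),), Fraction(1), Fraction(0)),
        ((Fraction(0),), Fraction(-1), Fraction(0)),
        ((Fraction(1),), Fraction(1), Fraction(4))]
proj = fm_eliminate(rows)
print(proj)  # encodes -x <= 0 and x <= 4, i.e. 0 <= x <= 4
```

Repeating the step once per y_i projects out several variables; in practice redundant rows should be pruned, since the row count can grow quadratically at each step.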
Proposition 2.3 (Cornuejols [37]). Let

Q = {(x, y) ∈ R^n × R^r | Ax + By ≤ d},

where the system has m constraints. Then,

proj_x Q = {x ∈ R^n | (u^T A)x ≤ u^T d, ∀u ∈ E},

where E is the set of extreme rays of the projection cone

C := {u ∈ R^m | u^T B = 0, u ≥ 0}.
Definition 2.12 (Disjunctive sets). Given polyhedra Q_i = {x ∈ R^n | A^i x ≥ b^i} for i ∈ M,
we define the disjunctive set ∪_{i∈M} Q_i as

Q = {x ∈ R^n | ∨_{i∈M} (A^i x ≥ b^i)}. (2–6)
Expression (2–6) is known as the disjunctive normal form of the disjunctive program.
Using operations described in Balas [17], the disjunctive set Q can also be expressed as

Q = {x ∈ R^n | Ax ≥ b, ∨_{h∈M_j} (d^h x ≥ d_h0), ∀j = 1, . . . , t}, (2–7)

which is called the conjunctive normal form.
Balas [17] describes how to obtain the convex hull of a disjunctive set. We present
this result in the following theorem.

Theorem 2.10 (Balas [17]). Given polyhedra Q_i = {x ∈ R^n | A^i x ≥ b^i} ≠ ∅, ∀i ∈ M,
define

Q := {(x, (y^i, y^i_0)_{i∈M}) | x − Σ_{i∈M} y^i = 0; A^i y^i − b^i y^i_0 ≥ 0 and y^i_0 ≥ 0, ∀i ∈ M; Σ_{i∈M} y^i_0 = 1},

where (y^i, y^i_0) ∈ R^{n+1} for i ∈ M. Then,

Q_M := cl conv(∪_{i∈M} Q_i) = proj_x Q.

Further,

1. if x* is an extreme point of Q_M, then (x, (y^i, y^i_0)_{i∈M}) is an extreme point of Q, where x = x*, (y^k, y^k_0) = (x*, 1) for some k ∈ M, and (y^i, y^i_0) = (0, 0) for all i ∈ M \ {k};

2. if (x, (y^i, y^i_0)_{i∈M}) is an extreme point of Q, then y^k = x = x* and y^k_0 = 1 for some k ∈ M, and x* is an extreme point of Q_M.
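The lifting in Theorem 2.10 can be exercised numerically: a convex combination x = λx¹ + (1−λ)x² of points x¹ ∈ Q1 and x² ∈ Q2 lifts to a point of Q by setting y¹ = λx¹, y¹_0 = λ, y² = (1−λ)x², y²_0 = 1−λ. A sketch with two small polytopes of our own choosing (not from the text):

```python
import numpy as np

# Q1 = {x : x >= 0, x1 + x2 <= 1}, Q2 = {x : 2 <= x1 <= 3, 0 <= x2 <= 1},
# both written as A^i x >= b^i, as in Theorem 2.10.
A1 = np.array([[1., 0.], [0., 1.], [-1., -1.]]);           b1 = np.array([0., 0., -1.])
A2 = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]]); b2 = np.array([2., 0., -3., -1.])
V1 = np.array([[0., 0.], [1., 0.], [0., 1.]])              # vertices of Q1
V2 = np.array([[2., 0.], [3., 0.], [2., 1.], [3., 1.]])    # vertices of Q2

rng = np.random.default_rng(0)
for _ in range(100):
    x1 = rng.dirichlet(np.ones(3)) @ V1       # random point of Q1
    x2 = rng.dirichlet(np.ones(4)) @ V2       # random point of Q2
    lam = rng.uniform()
    y1, y10, y2, y20 = lam * x1, lam, (1 - lam) * x2, 1 - lam
    x = y1 + y2                               # x = lam*x1 + (1-lam)*x2
    assert np.all(A1 @ y1 - b1 * y10 >= -1e-9)   # A^1 y^1 - b^1 y^1_0 >= 0
    assert np.all(A2 @ y2 - b2 * y20 >= -1e-9)   # A^2 y^2 - b^2 y^2_0 >= 0
    assert abs(y10 + y20 - 1.0) < 1e-12          # y^1_0 + y^2_0 = 1
print("every convex combination lifts to a feasible point of Q")
```

The key point, visible in the construction, is that the nonlinear products λx¹ and (1−λ)x² become the new variables y^i, which is what linearizes the convex-combination description.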
Theorem 2.10 gives a description of the convex hull of ∪_{i∈M} Q_i in a higher dimensional
space. In order to obtain the convex hull Q_M in the original space of variables of the
Q_i's, we must project Q onto the x space. Theorem 2.11 describes how this projection is
obtained. This result follows from Proposition 2.3.
Theorem 2.11 (Balas [17]). proj_x(Q) = {x ∈ R^n | αx ≥ β, ∀(α, β) ∈ W_0}, where

W_0 = {(α, β) ∈ R^{n+1} | α = u^i A^i and β ≤ u^i b^i for some u^i ≥ 0, ∀i ∈ M}.
The higher dimensional representation also allows the derivation of facets of QM as
described in the following theorem.
Theorem 2.12 (Balas [17]). Assume that QM is full-dimensional. The inequality αx ≥ β
defines a facet of QM if and only if (α, β) is an extreme ray of the cone W0.
Given a point x̄ ∉ Q_M, it is often necessary to derive a disjunctive cut αx ≥ β valid
for Q_M that cuts off x̄. This problem is equivalent to choosing coefficients (α, β, u) in W_0
that minimize αx̄ − β. This gives rise to Problem (2–8), commonly known as the cut
generating LP. Note that in (2–8) we added the normalization constraint Σ_{i∈M} e^T u^i = 1
to make the problem bounded:

min  αx̄ − β
s.t. α = u^i A^i,  ∀i ∈ M,
     β ≤ u^i b^i,  ∀i ∈ M,          (2–8)
     u^i ≥ 0,  ∀i ∈ M,
     Σ_{i∈M} e^T u^i = 1.
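The CGLP (2–8) is an ordinary LP and can be solved with any LP solver. The sketch below is our own toy instance (the data and variable layout are assumptions, not from the dissertation): the disjunction x1 ≤ 0 ∨ x1 ≥ 1 over the triangle P = {x ∈ R²: x2 ≤ 2x1, x2 ≤ 2(1 − x1), x2 ≥ 0}, separating the fractional point x̄ = (0.5, 1):

```python
import numpy as np
from scipy.optimize import linprog

# P = {x in R^2 : x2 <= 2*x1, x2 <= 2*(1 - x1), x2 >= 0}, in A x >= b form.
A = np.array([[2., -1.], [-2., -1.], [0., 1.]]); b = np.array([0., -2., 0.])
A1 = np.vstack([A, [-1., 0.]]); b1 = np.append(b, 0.)   # side 1 adds -x1 >= 0
A2 = np.vstack([A, [ 1., 0.]]); b2 = np.append(b, 1.)   # side 2 adds  x1 >= 1
xbar = np.array([0.5, 1.0])          # fractional point to separate
m = 4                                # rows per side
# CGLP variables: alpha (2), beta (1), u1 (4), u2 (4)
cost = np.concatenate([xbar, [-1.], np.zeros(2 * m)])   # min alpha.xbar - beta
A_eq = np.zeros((5, 11)); b_eq = np.array([0., 0., 0., 0., 1.])
A_eq[0:2, 0:2] = np.eye(2); A_eq[0:2, 3:7]  = -A1.T     # alpha = u1 A1
A_eq[2:4, 0:2] = np.eye(2); A_eq[2:4, 7:11] = -A2.T     # alpha = u2 A2
A_eq[4, 3:] = 1.                                        # normalization sum(u) = 1
A_ub = np.zeros((2, 11)); A_ub[:, 2] = 1.               # beta <= u_i b_i
A_ub[0, 3:7] = -b1; A_ub[1, 7:11] = -b2
res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2), A_eq=A_eq, b_eq=b_eq,
              bounds=[(None, None)] * 3 + [(0, None)] * (2 * m))
alpha, beta = res.x[:2], res.x[2]
print(alpha @ xbar - beta)   # negative: the cut alpha.x >= beta cuts off xbar
```

By construction the returned cut is valid for both sides of the disjunction, hence for their convex hull, while being violated at x̄.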
A disjunctive set is called facial if every inequality in (2–7) defines a face of Q, the
polyhedron defined by the constraints Ax ≥ b. An interesting feature of facial disjunctive
programs is that they can be sequentially convexified as described next.
Theorem 2.13 (Balas [17]). Let

D := {x ∈ R^n | Ax ≥ b, ∨_{h∈M_j} (d^h x ≥ d_h0), j = 1, . . . , t},

where |M_j| ≥ 1 for j = 1, . . . , t and D is facial. Define

Q_0 (= Q) := {x ∈ R^n | Ax ≥ b},

and, for j = 1, . . . , t,

Q_j := conv(Q_{j−1} ∩ {x | ∨_{h∈M_j} (d^h x ≥ d_h0)}). (2–9)

Then, Q_t = cl conv(D).
Theorem 2.13 shows that, in some cases, it is sufficient to consider the disjunctions
sequentially rather than simultaneously to obtain the convex hulls.
To illustrate that disjunctive programming techniques can be helpful in creating
good convexifications in integer programming, we describe their application to 0−1 integer
programming. A thorough description, including relations to Lovasz and Schrijver [82]
and Sherali and Adams [110], is given in Balas et al. [20]. This variant of disjunctive
programming is commonly referred to as lift-and-project; see Balas et al. [20]. For each
variable x_j for j = 1, . . . , n, the current formulation is lifted into a higher dimensional
space where it is tightened. Then, this strengthened formulation is projected back onto
the original space R^{n+r}, thus defining an improved formulation for S. After the last
variable is considered, the convex hull is obtained.
More precisely, consider the problem

(BMILP)  min c^T x
         s.t. Ax ≥ b,
              x_j ∈ {0, 1}, ∀j = 1, . . . , n,
              x_j ∈ R_+, ∀j = n + 1, . . . , n + r,

where integer variables can only take the values 0 or 1. We define

Q := {x ∈ R^{n+r}_+ | Ax ≥ b}

and denote the set of feasible solutions of (BMILP) by

S := {x ∈ {0, 1}^n × R^r_+ | Ax ≥ b}.
We assume that the system Ax ≥ b includes the constraints −x_j ≥ −1 for j = 1, . . . , n,
but does not include the constraints x_j ≥ 0 for j = 1, . . . , n. Clearly, the set S can be
reformulated as

S := {x ∈ R^{n+r}_+ | Ax ≥ b, (−x_j ≥ 0) ∨ (x_j ≥ 1), ∀j = 1, . . . , n},
which shows its relation to disjunctive programming. Since this problem is facial, its
convex hull can be obtained using Theorem 2.13. In particular, in this case, the jth step
(2–9) can be obtained as

Q_j = proj_x {(x, x^0, x^1, y^0, y^1) ∈ R^{3n}_+ × R^2_+ |
      A^{j−1} x^0 ≥ b^{j−1} y^0,
      −x^0_j ≥ 0,
      A^{j−1} x^1 ≥ b^{j−1} y^1,
      x^1_j ≥ y^1,
      x^0 + x^1 = x,
      y^0 + y^1 = 1,
      x, x^0, x^1, y^0, y^1 ≥ 0},
where Q_{j−1} = {x | A^{j−1} x ≥ b^{j−1}}. Denote the jth unit vector by e_j. Using the projection
cone approach described in Proposition 2.3, we obtain that Q_j is defined by the inequalities
αx ≥ β, where (α, β, u, u_0, v, v_0) are feasible solutions to

α − uA^{j−1} + u_0 e_j ≥ 0,
α − vA^{j−1} − v_0 e_j ≥ 0,
β − ub^{j−1} ≤ 0,                  (2–10)
β − vb^{j−1} − v_0 ≤ 0,
u, u_0, v, v_0 ≥ 0,

which is an expression of Theorem 2.11. The inequality αx ≥ β is called a lift-and-project
inequality. Note that lift-and-project inequalities are a special type of split inequality [36],
derived from the split disjunction x_j ≤ 0 or x_j ≥ 1. For details, see Balas et al. [20, 21].
2.2.2 Lifting
In this section, we describe a technique known as lifting and review how it has been
used to generate strong valid inequalities for MILPs. Deriving facet-defining inequalities
for the convex hull of feasible solutions to a MILP with many variables is typically
difficult. However, when a subset of variables is fixed to some values such as their lower
or upper bounds, it might be easier to derive a strong valid inequality. We refer to a
nontrivial inequality valid for a restricted set as a seed inequality. Lifting is the process
of constructing progressively, from a seed inequality valid for a lower dimensional set, an
inequality valid for a higher dimensional set. Gomory [58] first introduced the concept of
lifting in the context of the group problem. The technique was refined by Padberg [95] and
Wolsey [130]; see also Balas [14], Hammer et al. [63], Padberg [96], Wolsey [131], Zemel
[138], and Balas and Zemel [23].
Lifting is generally performed sequentially. Crowder et al. [42] and Gu et al. [60]
successfully used sequential lifting in a branch-and-cut framework for solving 0−1 integer
programs with cover inequalities. For 0−1 integer programs, Wolsey [132] proved that, if
the lifting function is superadditive, lifting coefficients are independent of the lifting order;
see Section 2.2.2.2. Gu et al. [62] applied sequence-independent lifting to mixed-integer
programs. Marchand and Wolsey [83] also used superadditive lifting for 0−1 knapsack
problems with a single continuous variable and Richard et al. [98] developed a general
lifting theory for continuous variables. Recently, lifting has also been used to obtain
inequalities for special-purpose global optimization problems; see de Farias et al. [46],
Vandenbussche and Nemhauser [128], and Atamturk and Narayanan [10]. A general
lifting theory for nonlinear programming is described in Richard and Tawarmalani [100].
However, the application of lifting techniques in MINLPs remains limited.
2.2.2.1 Sequential lifting
Although lifting can be used for general MILPs and for nonlinear programs, we
describe it only for the 0−1 knapsack polytope

K = {x ∈ {0, 1}^n | Σ_{j∈N} a_j x_j ≤ d},

where |N| = n, since the ideas extend to more general settings. Let N′ ⊆ N and
v ∈ {0, 1}^n. To represent the restricted set where some of the variables x_j are fixed to 0 or
1, we define

K(N′, v) = {x ∈ K | x_j = v_j, ∀j ∈ N′}.
By selecting N ′ to be a larger and larger subset of N , we can change conv(K(N ′,v)) into
a polyhedron whose dimension is as small as we want. We note that one might think of
fixing the variables to some values between lower and upper bounds. In this case, however,
Atamturk [8] and Richard et al. [98] show that it is typically not possible to perform
lifting.
Often, it is easy to find a facet-defining inequality for low-dimensional polyhedra.
Assume therefore that

Σ_{j∈N\N′} α_j x_j ≤ δ   (2–11)

is a valid inequality for K(N′, v). Assume without loss of generality that N′ = {1, . . . , p},
where p ≤ n. Taking (2–11) as the seed inequality, we convert (2–11) into an inequality
globally valid for conv(K) by lifting the variables x_j that were fixed to v_j for j ∈ N′. We can
perform lifting one variable x_j at a time in some predefined order such as j = 1, . . . , p.
This approach is known as sequential lifting and is the most commonly used form of
lifting. We mention, however, that it can sometimes be beneficial to lift several variables
x_j for j ∈ N′ at the same time; see Zemel [138] and Gu et al. [60]. This variant of
lifting is called simultaneous lifting.
Assume that the variables x_1, . . . , x_{i−1} have already been lifted and

Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j ≤ δ   (2–12)

is valid for K(N′ \ {1, . . . , i − 1}, v). Lifting the variable x_i for i ∈ N′ in the inequality
(2–12) amounts to deriving a coefficient α_i for which the lifted inequality

α_i(x_i − v_i) + Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j ≤ δ   (2–13)

is valid for K(N′ \ {1, . . . , i}, v). To find α_i, we define the lifting function

Φ_i(a) = δ − max  Σ_{j=1}^{i−1} α_j(x_j − v_j) + Σ_{j∈N\N′} α_j x_j
         s.t.  Σ_{j=1}^{i−1} a_j(x_j − v_j) + Σ_{j∈N\N′} a_j x_j ≤ d − a,   (2–14)
               x_j ∈ {0, 1}, ∀j ∈ {1, . . . , i − 1} ∪ (N \ N′),

associated with the inequality (2–12).
Theorem 2.14 (Wolsey [130]). Assume that the optimization problem defining Φ_i(a_i) is
feasible. Inequality (2–13) is valid for K(N′ \ {1, . . . , i}, v) if α_i ≤ Φ_i(a_i) when v_i = 0, or
α_i ≥ −Φ_i(−a_i) when v_i = 1. Moreover, if

1. (2–12) defines a face of conv(K(N′ \ {1, . . . , i − 1}, v)) of dimension k, and

2. α_i = Φ_i(a_i) when v_i = 0 or α_i = −Φ_i(−a_i) when v_i = 1,

then (2–13) defines a face of conv(K(N′ \ {1, . . . , i}, v)) of dimension at least k + 1.
Theorem 2.14 describes how to sequentially lift binary variables inside of 0−1
knapsack constraints. Lifting for general integer variables was used in Ceria et al. [30].
Lifting for continuous variables was first used by Marchand and Wolsey [83] where the
authors lift a single continuous variable without upper bounds inside a 0−1 mixed-integer
knapsack set. Richard et al. [98] proposed a general theory for the lifting of multiple
continuous variables with bounds.
We observe in Theorem 2.14 that a different lifting function Φi(a) must be computed
to determine the lifting coefficient of each lifted variable. In general, computing the lifting
function (2–14) even at a single point can be computationally time-consuming. Some of
these difficulties disappear when the lifting function is well-structured.
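For small instances, the lifting function (2–14) can be evaluated by plain enumeration. Below is a toy instance of our own (the data are assumptions, not from the dissertation): the knapsack 5x1 + 5x2 + 5x3 + 3x4 ≤ 10 with seed cover inequality x1 + x2 + x3 ≤ 2 valid for the restriction x4 = 0; this is the v_i = 0 case with no previously lifted variables, so the (x_j − v_j) terms vanish:

```python
from itertools import product

def lifting_coefficient(a, d, free, alpha, delta, a_i):
    """Evaluate Phi_i(a_i) = delta - max{ sum_j alpha_j x_j :
    sum_j a_j x_j <= d - a_i, x binary } by enumeration, as in (2-14).
    `free` lists the indices appearing in the seed inequality."""
    best = float("-inf")
    for x in product((0, 1), repeat=len(free)):
        if sum(a[j] * x[k] for k, j in enumerate(free)) <= d - a_i:
            best = max(best, sum(alpha[j] * x[k] for k, j in enumerate(free)))
    return delta - best

# toy knapsack K = {x in {0,1}^4 : 5x1 + 5x2 + 5x3 + 3x4 <= 10};
# the cover inequality x1 + x2 + x3 <= 2 is valid when x4 is fixed to 0
a = {1: 5, 2: 5, 3: 5, 4: 3}
alpha = {1: 1, 2: 1, 3: 1}
phi = lifting_coefficient(a, 10, [1, 2, 3], alpha, 2, a[4])
print(phi)   # 1 -> lifted inequality x1 + x2 + x3 + x4 <= 2
```

The lifted inequality x1 + x2 + x3 + x4 ≤ 2 is valid for the full knapsack set, as can be confirmed by enumerating all 16 binary points; of course, such enumeration is exactly the computational burden that superadditive lifting (next section) is designed to avoid.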
2.2.2.2 Sequence-independent lifting
To improve computational efficiency of sequential lifting, Wolsey [132] introduced the
concept of sequence-independent lifting. This method reduces the computational burden
associated with lifting by identifying conditions under which the lifting function does not
change during the various stages of lifting.
Definition 2.13 (Superadditive). Let Φ : W ⊆ R → R. The function Φ is superadditive
over W if

Φ(w_1) + Φ(w_2) ≤ Φ(w_1 + w_2) for all w_1, w_2, w_1 + w_2 ∈ W.
For 0−1 integer programs, Wolsey [132] proved that, if a lifting function is superadditive,
lifting coefficients are independent of the lifting order. Gu et al. [62] generalized the
concept of sequence-independent lifting to 0−1 mixed-integer programs. Atamturk [8]
generalized these results to general mixed-integer programs.
Theorem 2.15 (Gu et al. [62]). If the lifting function Φi(w) is superadditive over R, then
Φi(w) = Φi+1(w).
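Superadditivity (Definition 2.13) can be probed numerically on a finite grid; this is only a sanity check of ours, since a grid cannot certify superadditivity over all of W. The floor function, a prototypical superadditive function on R_+, passes, while w ↦ min(w, 1) does not:

```python
import math

def is_superadditive(phi, ws, tol=1e-9):
    """Check phi(w1) + phi(w2) <= phi(w1 + w2) for all sampled pairs
    whose sum is also a sample point."""
    return all(phi(w1) + phi(w2) <= phi(w1 + w2) + tol
               for w1 in ws for w2 in ws if (w1 + w2) in ws)

ws = [k / 4 for k in range(41)]   # grid on [0, 10]; quarters are exact in binary
print(is_superadditive(math.floor, ws))              # True
print(is_superadditive(lambda w: min(w, 1.0), ws))   # False (fails at w1=w2=1)
```

A failed check immediately disproves superadditivity; a passed check only motivates an analytical proof, which is what the superadditive approximations of Gu et al. [62] provide in the cases where the exact lifting function fails.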
A superadditive lifting function is useful for deriving lifted inequalities efficiently.
Unfortunately, lifting functions are not always superadditive. For these situations, Gu
et al. [62] proposed to use superadditive approximations of the lifting function. Further,
they identify validity, dominance, and maximality to be common properties of good
superadditive approximations. Sequence-independent lifting has been used to derive strong
valid inequalities for various problems; see Marchand and Wolsey [83], Gu et al. [61],
Atamturk and Rajan [11], and Atamturk [7].
To lift multiple bounded continuous variables, Richard et al. [99] introduced the
concept of superlinear lifting that is a natural counterpart to superadditive lifting for
integer variables. We refer the interested reader to Richard et al. [99].
CHAPTER 3
MOTIVATION AND RESEARCH STATEMENTS
3.1 Motivation
When comparing state-of-the-art solvers, it can be readily observed that solving
MINLPs to global optimality requires more computational time than solving MILPs.
This is because traditional convexification methods do not always construct strong convex
relaxations. As discussed in Chapter 2, currently prevalent convexification techniques
derive convex relaxations of nonconvex MINLP problems by relaxing inequalities of the
form g(x) ≥ r into ḡ(x) ≥ r, where ḡ(x) is a concave overestimator of the function g(x).
Tawarmalani and Sahinidis [121] discuss how tight overestimators for various kinds of
functions can be constructed to produce such relaxations. However, the derived relaxation
can be weak because these methods do not use right-hand-side information during the
construction of the convex relaxations.
As an illustrative example, consider the simple set S defined as

S = {(x, y, z) ∈ R^3_+ | xy + z ≥ r},

where r > 0. It can be easily seen that S is not a convex set since both (√r, √r, 0) and
(0, 0, r) belong to S while their convex combination with a weight of 1/2 on each point does
not. The feasible region of S for r = 2 is represented in Figure 3-1 (a), where it can be
observed to be nonconvex.
First, we consider the set S where there are no upper bounds on the variables x, y,
and z. In this case, we can verify that the concave envelope of g(x, y, z) = xy + z is infinite
if both x and y have non-zero values. As a result, the convex relaxation of S obtained by
replacing g(x, y, z) ≥ r with the requirement that the concave envelope of g be at least r
is given by

R(S) = {(x, y, z) ∈ R^3_+ | x > 0, y > 0} ∪ {(x, y, z) ∈ R^3_+ | z ≥ r, xy = 0}.
This set is not closed and is therefore unlikely to be used as a relaxation. Its closure can
be observed to be R3+. Therefore, the above relaxation scheme corresponds in essence to
dropping the original constraint in the relaxed problem. We observe in Figure 3-1 (b) that
this is clearly not the best convex relaxation. In fact, we will establish in Chapter 4 that
the convex hull of S can be expressed as
conv(S) = {(x, y, z) ∈ R^3_+ | √(xy/r) + z/r ≥ 1}.

Note that, in the above expression, the right-hand-side plays a different role in each term.
It therefore cannot be naturally obtained from an inequality of the form ḡ(x, y, z) ≥ r.
Figure 3-1. Geometric illustration of S, conv(S), S1 and S2: (a) S; (b) conv(S); (c) S1 and S2.
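Both claims are easy to verify numerically for r = 2 (an illustrative check of ours, not part of the dissertation): the midpoint of (√r, √r, 0) and (0, 0, r) violates xy + z ≥ r, yet it satisfies the convex-hull inequality √(xy/r) + z/r ≥ 1:

```python
import math

r = 2.0
in_S    = lambda x, y, z: min(x, y, z) >= 0 and x * y + z >= r
in_conv = lambda x, y, z: math.sqrt(x * y / r) + z / r >= 1 - 1e-12

p, q = (math.sqrt(r), math.sqrt(r), 0.0), (0.0, 0.0, r)
mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
print(in_S(*p), in_S(*q), in_S(*mid))  # True True False: S is nonconvex
print(in_conv(*mid))                   # True: the midpoint lies in conv(S)
```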
Next, consider the set SB where the variables have upper bounds, and assume r = 2
for simplicity, i.e.,

SB = {(x, y, z) ∈ R^3_+ | xy + z ≥ 2, x ≤ 4, y ≤ 4, z ≤ 3}.
Considering bounds on variables is typically necessary when constructing convexifications
inside of the branch-and-bound tree, where the feasible region has been partitioned
into smaller subsets by branching on variables. In this case, since the concave envelope
of g(x, y, z) is polyhedral, we obtain the following convex relaxation using factorable
relaxation techniques:

FR(SB) = {(x, y, z) ∈ R^3_+ | 3(x + y) + 8z − 24 ≤ 0, 4x + z ≥ 2, 4y + z ≥ 2, x ≤ 4, y ≤ 4, z ≤ 3}.
This relaxation is not the convex hull of SB since conv(SB) is not polyhedral. Therefore,
even in the case of bounded variables, better relaxations than those currently used can be
found.
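The constraints 4x + z ≥ 2 and 4y + z ≥ 2 of FR(SB) arise from the concave (McCormick) envelope of the bilinear term xy over the box [0, 4] × [0, 4], which reduces to min(4x, 4y) when the lower bounds are zero. A quick illustrative check of ours that this envelope overestimates xy on the box:

```python
def mccormick_over(x, y, xL, xU, yL, yU):
    """Concave (McCormick) overestimator of the bilinear term x*y on a box."""
    return min(xU * y + yL * x - xU * yL, xL * y + yU * x - xL * yU)

# on [0,4] x [0,4] the envelope is min(4y, 4x); it dominates x*y everywhere
grid = [i * 0.5 for i in range(9)]                 # 0, 0.5, ..., 4
assert all(mccormick_over(x, y, 0, 4, 0, 4) >= x * y for x in grid for y in grid)
# relaxing xy + z >= 2 with min(4x, 4y) + z >= 2 yields exactly the two
# linear constraints 4x + z >= 2 and 4y + z >= 2 appearing in FR(SB)
print("min(4x, 4y) overestimates xy on [0,4]^2")
```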
It is common to derive convexifications from single constraints of the problem
rather than by considering multiple constraints simultaneously. As we discussed in
Section 2.1, multilinear and bilinear constraints are important in the derivation of tight
convexifications as they are common inside of nonlinear programs and their factorable
reformulations. Therefore, in this thesis, we study bilinear covering sets defined by a single
nonlinear inequality of the form

Σ_{j∈N} a_j x_j y_j ≥ d,   (3–1)

where a_j > 0, x ∈ X, and y ∈ Y. This set arises in many practical problems and
theoretical studies. In particular, for the case where X ⊆ Z^n and Y ⊆ Z^n, bilinear
covering constraints (3–1) can be found in Harjunkoski et al. [64], as we will discuss in
Chapter 4. For the case where X ⊆ {0, 1}^n and Y ⊆ [0, 1]^n, (3–1) is shown in Chapter 5 to
yield a relaxation of certain single-node flow models that have been studied in the integer
programming literature.
3.2 Problem Statements
The premise of this thesis is that it is possible to build tighter convex relaxations of
MINLPs by considering right-hand-side information. Constructing tight convexifications
of MINLPs is an important practical problem because these relaxations can improve the
efficiency of branch-and-bound methods for MINLPs. This would, in turn, help increase
the performance of current state-of-the-art solvers. Although considering right-hand-side
information is not common in MINLP, techniques for improving relaxations that use such
information are common in MILP, where they have been shown to significantly improve
bounds and to reduce the size of branch-and-bound trees. In this thesis, we are therefore interested
in investigating how such methods can be translated to MINLP.
3.2.1 Strong Valid Inequalities for Orthogonal Disjunctions and Bilinear Covering Sets
We can see in Figure 3-1 (c) that the restricted orthogonal subsets,
S1 := S ∩ {z = 0} = {(x, y, 0) | xy ≥ r}
and
S2 := S ∩ {x = 0, y = 0} = {(0, 0, z) | z ≥ r}
completely determine the convex hull of S. Based on this observation, we investigate how
to obtain a closed form expression for conv(S) and many other similar sets in Chapter 4.
In particular, using disjunctive programming, we develop a new convexification tool for
nonlinear sets. Our tool characterizes the convex hull of orthogonal disjunctive sets in
closed-form under some technical conditions. The results differ from current approaches
in that the resulting expressions do not contain exogenous variables. We then show that,
similar to Figure 3-1 (c), the convex hull of many nonlinear sets is completely dictated by
their restrictions over orthogonal subspaces. We provide sufficient conditions to check this
particular type of convex extension property. We conclude by illustrating how our tools
can be used to obtain the convex hulls of certain nonlinear sets.
The convexification tool we develop is useful since it provides a closed-form expression
of convex hulls of many nonlinear sets. However, it is not completely general as it typically
requires that variables are not bounded above.
3.2.2 Lifted Inequalities for 0−1 Mixed-Integer Bilinear Covering Sets with Bounded Variables
In Chapters 5 and 6, we study 0−1 mixed-integer bilinear covering sets since they
are among the simplest mixed-integer nonlinear sets that have upper bounds on both integer
and continuous variables. We investigate in Chapter 5 how to apply lifting techniques for
these sets. Using sequence-independent lifting, we derive strong valid inequalities for the
convex hull of these sets. We also show that the bilinear covering sets are similar to the
single-node flow models with respect to their polyhedral structure. As a result, we prove
that our results yield generalizations of the classical lifted flow cover inequalities in integer
programming. We then test the practical impacts of these results through a computational
study in Chapter 6.
CHAPTER 4
STRONG VALID INEQUALITIES FOR ORTHOGONAL DISJUNCTIONS AND BILINEAR COVERING SETS 1
4.1 Introduction
In Chapter 3, when considering the set

S = {(x, y, z) ∈ R^3_+ | xy + z ≥ r},

we discussed that traditional techniques for relaxing the inequality of S would simply
drop the constraint to produce R^3_+, a relaxation that does not consider right-hand-side
information. In this chapter, we propose a scheme that produces tighter convex relaxations
by considering the right-hand-side of the constraint. In particular, for the set S presented
above, our scheme produces the following convex relaxation

RS = {(x, y, z) ∈ R^3_+ | √(xy/r) + z/r ≥ 1},
which is a much tighter approximation than R3+. Considering this simple example, we
can make three observations. First, the relaxation, RS, is nonlinear. This is in contrast
to current implementations of nonlinear branch-and-bound that typically construct linear
relaxations for multivariate terms; see Tawarmalani and Sahinidis [123]. Second, the form
of the nonlinear cut is surprising as it applies different functions to the different terms
of the initial inequality. For S, the first term is modified using a square-root after being
divided by r, while the second is simply divided by r. Third, RS is not only a convex
relaxation of S, but it is in fact (as will be shown later) the convex hull of S. These
observations generalize to many polynomial covering sets; see Tawarmalani et al. [117].
Surprisingly, the convex hull for these sets can be expressed in a simple form without
1 The material of this chapter is based on [118].
introducing new variables while developing the concave envelope of the corresponding
polynomial can be much harder.
The convex hull representation for bilinear covering sets arises from a general theory
of orthogonal disjunctions that we develop in this chapter. To provide an example,
consider the set S again. We will show that the convex hull of S is determined by the
points of S that either belong to the half-plane (x, y, 0), where (x, y) ∈ R2+ or to the
half-line (0, 0, z), where z ∈ R+. In other words, the set S satisfies the convex extension
property (see Tawarmalani and Sahinidis [120]) and the important subsets of S belong
to orthogonal subspaces. Because the convex extension property holds, it is natural to
expect that one could build a higher dimensional description of the convex hull of S using
disjunctive programming arguments; see Rockafellar [102] and Balas [17]. Disjunctive
programming has been used to develop tight relaxations and cutting planes in integer,
nonlinear, and robust optimization; see [9, 13, 22, 31, 107, 111, 113, 119]. Unlike our
result, the literature on disjunctive programming formulations mostly focuses on naturally
disjunctive sets. Cutting planes based on disjunctive formulations, are typically linear
and derived by solving separation problems over extended formulations; see Cornuejols
and Lemarechal [38]. One interesting observation in this chapter is that, as long as the
disjunctive terms are orthogonal and a few technical conditions are satisfied, there is no
need to introduce additional variables. Furthermore, the convex hull of S can be easily
expressed in closed-form using the representations of the convex hull of S in each of the
two orthogonal subspaces, namely √(xy/r) ≥ 1 and z/r ≥ 1. We establish a much more
general set of conditions under which the argument evoked above is correct, allowing the
use of both right-hand-side and left-hand-side information in the derivation of convex
relaxations for nonlinear programming. Our results rely on the ability to prove that a
convex extension property holds over orthogonal disjunctions and the ability to derive
closed form expressions of convex hulls (possibly in a higher dimensional space) over
each of the subspaces. Our techniques are applicable to large families of problems and
yield convex approximations that are stronger than those currently used in nonlinear
branch-and-bound solvers; see Tawarmalani et al. [117].
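Validity of the nonlinear inequality over all of S has a one-line argument: with a = xy/r and b = z/r we have a + b ≥ 1, and √a ≥ a whenever 0 ≤ a ≤ 1, so √a + b ≥ a + b ≥ 1 (the case a ≥ 1 is immediate). A randomized sanity check of ours:

```python
import math, random

r = 2.0
rng = random.Random(0)
for _ in range(10000):
    x, y, z = (rng.uniform(0.0, 5.0) for _ in range(3))
    if x * y + z >= r:                      # (x, y, z) lies in S
        # sqrt(a) >= a on [0,1], hence sqrt(a) + b >= a + b >= 1
        assert math.sqrt(x * y / r) + z / r >= 1.0 - 1e-9
print("the cut sqrt(xy/r) + z/r >= 1 holds at every sampled point of S")
```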
In Section 4.2, we describe a tool to obtain the convex hull of orthogonal disjunctive
sets. The result can be invoked under certain technical conditions. We provide tools to
verify these assumptions. We also provide counterexamples to show the need for the
assumptions. The split cut for mixed-integer polyhedral sets is shown to be a special case
of our general convexification tool. In Section 4.3, we illustrate the application of the
tool in nonlinear integer programming by convexifying bilinear pure/mixed-integer sets.
Nonconvex inequalities in continuous variables are not naturally disjunctive. For such
inequalities, we establish sufficient conditions under which the convex extension property
holds over the non-negative orthant. We show that these sufficient conditions are satisfied
by continuous bilinear covering sets and develop their convex hulls over the non-negative
orthant. We summarize the contributions of this work in Section 4.4 and conclude with
remarks and directions for future research.
4.2 Convexification of Orthogonal Disjunctive Sets
In this section, we first introduce and prove a general result that exposes the
closed-form convex hull inequality description of the disjunctive union of a finite number
of sets defined over subspaces that are orthogonal to each other. This result also applies
to non-disjunctive sets provided that their convex hulls are entirely defined by their
restrictions over a finite number of orthogonal subspaces. We then illustrate the utility of
this result in finding convex hull descriptions. We discuss the need for certain seemingly
technical assumptions in the statement of the result. In particular, we discuss each one
of the four assumptions of the theorem and describe, with examples, situations where
they are satisfied. For some of the assumptions, we establish sufficient conditions that
are simple to verify. We then show that the cuts that yield the convex hull, under the
specified technical conditions, continue to produce valid inequalities even when some
of the conditions are not satisfied. Throughout, we demonstrate the generality and
applicability of our convexification result by deriving new convex hull descriptions of
various continuous, mixed, and pure-integer bilinear covering sets, and providing an
alternate derivation of the classic split cut in mixed-integer programming.
In the following, given a set S, we represent its convex hull by conv(S), its closure by
cl(S), and its projection onto the space of z variables by proj_z S. For a closed convex set
S, we denote the set of its recession directions by 0+(S). When we display equations, we
sometimes write min(f(z); g(z)) with stacked arguments to denote min{f(z), g(z)}.
While convexifying a given set S, we will often consider its orthogonal restrictions,
which we will denote by S_i for i ∈ {1, . . . , n} and define as S_i = {z | z = (z_1, . . . , z_n) ∈
S, z_j = 0 ∀j ≠ i}. To simplify the forthcoming discussions and proofs, we next introduce
notations that help in converting descriptions of points in the sets S_i, for i ∈ {1, . . . , n},
to descriptions of points in S and vice-versa. Let (z_1, . . . , z_n) ∈ R^{Σ_{i=1}^n d_i}, z_i ∈ R^{d_i},
N = {1, . . . , n}, and A = {i_1, . . . , i_p} ⊆ N. Then, z_A denotes (z_i)_{i∈A} ∈ R^{Σ_{i∈A} d_i}, i.e.,
A provides the index set of subspaces into which z is projected. Conversely, z_A may be
injected into the original space by setting the missing coordinates to zero. To succinctly
express this operation, we introduce the following notation. For each j ∈ {1, . . . , m}, let
z^j = (z^j_i)_{i=1}^n, where z^j_i ∈ R^{d^j_i}. Then, given a^j ∈ R^{Σ_{k=1}^p d^j_{i_k}}, we denote by L(A; a^1, . . . , a^m)
the vector (z^1, . . . , z^m), where for all j, z^j_A = a^j and z^j_{N\A} = 0. When A is a singleton {i},
we write it as i itself. For each j, the above notation injects a^j into the space of the z^j
variables by setting the coordinates indexed by A according to their corresponding values
in a^j, and the remaining coordinates to zero. For example, L({1, 3}; (z_1, z_3), (u_1, u_3))
equals (z_1, 0, z_3, 0, . . . , 0; u_1, 0, u_3, 0, . . . , 0), where the semi-colon is used to demarcate the
z and u vectors. Throughout the text, we will mostly use the following two expressions:
L(i, (z_i, u_i)) to denote the vector (0, 0, . . . , 0, 0, z_i, u_i, 0, 0, . . . , 0, 0) and L(i; z_i, u_i) to denote
the vector (0, . . . , 0, z_i, 0, . . . , 0; 0, . . . , 0, u_i, 0, . . . , 0), where the semi-colon delineates the
vector z = (z_1, . . . , z_n) from the vector u = (u_1, . . . , u_n).
We next introduce a notation to express a generic set described via inequalities.
Consider functions t_j : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for j ∈ J, v_k : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for
k ∈ K, and w_l : R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i} → R for l ∈ L. Let (z, u) ∈ R^{Σ_{i=1}^n d_i} × R^{Σ_{i=1}^n d′_i}. Then,
we denote by A(t_J, v_K, w_L) the following set:

A(t_J, v_K, w_L) := {(z, u) | t_j(z, u) ≥ 1, ∀j ∈ J; v_k(z, u) ≥ −1, ∀k ∈ K; w_l(z, u) ≥ 0, ∀l ∈ L},

where J, K, and L are the index sets of inequalities with right-hand-sides 1, −1, and 0,
respectively. Note that there is no loss of generality in assuming that the right-hand-sides
are 1, −1, and 0 since the defining functions can be scaled by a positive multiplier to
satisfy this condition. We will often be interested in the set where the right-hand-sides of
the above inequalities are replaced with 0. This set is denoted by C(t_J, v_K, w_L) and is
defined formally as follows:

C(t_J, v_K, w_L) := {(z, u) | t_j(z, u) ≥ 0, ∀j ∈ J; v_k(z, u) ≥ 0, ∀k ∈ K; w_l(z, u) ≥ 0, ∀l ∈ L}.
Among alternative inequality descriptions of sets, we will see that it is often beneficial
to choose those that involve positively homogeneous functions. We present this concept in
the following definition; see §4 in [102] for details.
Definition 4.1. A function f : R^n → [−∞, ∞] is said to be positively homogeneous if,
for all λ > 0, f(λz) = λf(z).
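Both functions used earlier to describe the convex hull of S, namely √(xy/r) and z/r, are positively homogeneous (scaling (x, y) by λ scales √(xy/r) by λ). An illustrative numeric check of ours, for r = 2:

```python
import math, random

t = lambda x, y: math.sqrt(x * y / 2.0)   # the function sqrt(xy/r) with r = 2
rng = random.Random(1)
for _ in range(1000):
    x, y = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
    lam = rng.uniform(0.1, 10.0)
    # positive homogeneity: t(lam*x, lam*y) = lam * t(x, y)
    assert abs(t(lam * x, lam * y) - lam * t(x, y)) < 1e-8
print("sqrt(xy/r) is positively homogeneous")
```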
We now describe our main convexification result.
Theorem 4.1. Let S ⊆ R^{Σ_i d_i} and let the points z of S be written as z = (z_1, . . . , z_i, . . . , z_n) ∈
S, where z_i ∈ R^{d_i}. For i ∈ N = {1, . . . , n}, let S_i ⊆ S. Assume that:

(A1) if (z_1, . . . , z_i, . . . , z_n) ∈ S_i, then z_j = 0 for all j ≠ i,

(A2) conv(S) = conv(∪_{i=1}^n S_i),

(A3) there exist, for i ∈ N, positively-homogeneous functions t^j_i for j ∈ J_i, v^k_i for
k ∈ K_i, and w^l_i for l ∈ L_i such that conv(S_i) ⊆ proj_z A_i ⊆ cl(conv(S_i)), where²

A_i = {L(i, (z_i, u_i)) | (z_i, u_i) ∈ A(t^{J_i}_i, v^{K_i}_i, w^{L_i}_i)},   (4–1)

(A4) proj_z C_i, where C_i = {L(i, (z_i, u_i)) | (z_i, u_i) ∈ C(t^{J_i}_i, v^{K_i}_i, w^{L_i}_i)}, is a subset of the
recession cone of cl conv(∪_{i=1}^n S_i), i.e., for all i,

proj_z C_i ⊆ 0+(cl conv(∪_{i=1}^n S_i)).

Then, conv(S) ⊆ proj_z X ⊆ cl conv(S), where³

X = {(z, u) | Σ_{i∈N} t^{j_i}_i(z_i, u_i) ≥ 1, ∀(j_i)_{i∈N} ∈ ∏_{i∈N} J_i;
              Σ_{i∈I} v^{k_i}_i(z_i, u_i) ≥ −1, ∀I ⊆ N, ∀(k_i)_{i∈I} ∈ ∏_{i∈I} K_i;
              t^{j_i}_i(z_i, u_i) + v^{k_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀j_i ∈ J_i, ∀k_i ∈ K_i;
              t^{j_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀j_i ∈ J_i;
              w^{l_i}_i(z_i, u_i) ≥ 0, ∀i ∈ N, ∀l_i ∈ L_i}.   (4–2)
Before proving Theorem 4.1, we briefly comment on its assumptions, its practical
importance, and its applicability. In Assumption (A2), we impose that any point in
S can be expressed as a convex combination of points in some of the sets Si. This
implies that only the subsets Si, for i = 1, . . . , n are needed when computing the
convex hull of S. In Assumption (A1), we require that these subsets belong to linear
subspaces that are orthogonal to each other. In Assumption (A3), we require that
an inequality description of the convex hull of each one of the sets Si is known. Note
that this inequality description might make use of an extended formulation (using the
² As defined above, L(i, (z_i, u_i)) denotes (0, 0, . . . , 0, 0, z_i, u_i, 0, 0, . . . , 0, 0).

³ Here, and onwards, ∏_{i∈N} J_i denotes J_1 × · · · × J_n.
additional variables ui). Note also that in Theorem 4.1, we require that all inequalities
are defined using positively-homogeneous functions. We will show later that weaker
assumptions are sufficient to establish the validity of the cuts derived in Theorem 4.1
and that positive-homogeneity guarantees that the inequalities produced are strong. In
Assumption (A4), we impose, in essence, that the recession directions of each one of the
sets Ai are also recession directions for the closure convex hull of the union of the sets Si.
Under these four assumptions, we show that an inequality description of the convex
hull of S can be obtained by combining in a systematic way the inequalities arising in the
convex hull descriptions of the subsets Si, for i = 1, . . . , n. Note however that, for reasons
that will be described later, this inequality description might describe a superset of the
desired convex hull. However, the superset will never be larger than the closure convex
hull of S, which is sufficient for all practical purposes. This result bears some resemblance
to the work of Balas et al. [19] where the authors derive a closed-form representation
of the convex hull of certain orthogonal bounded linear polytopes using specialized
arguments. Theorem 4.1 generalizes this result as it allows the convexification of nonlinear
and possibly unbounded orthogonal disjunctive sets and therefore extends its applicability
to global optimization.
To prove Theorem 4.1, we introduce some notation. For T ⊆ N and λ_T ∈ R_+, we define:

    R_T(λ_T) = { (z_T, u_T) |  Σ_{i∈T} t_i^{j_i}(z_i, u_i) ≥ λ_T,              ∀(j_i)_{i∈T} ∈ ∏_{i∈T} J_i
                               Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −λ_T,             ∀I ⊆ T, ∀(k_i)_{i∈I} ∈ ∏_{i∈I} K_i
                               t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0,  ∀i ∈ T, ∀j_i ∈ J_i, ∀k_i ∈ K_i
                               t_i^{j_i}(z_i, u_i) ≥ 0,                        ∀i ∈ T, ∀j_i ∈ J_i
                               w_i^{l_i}(z_i, u_i) ≥ 0,                        ∀i ∈ T, ∀l_i ∈ L_i }.
In particular, note that R_N(1) = X. Whenever T is a singleton, say {i}, we denote R_{{i}}(λ_{{i}}) as R_i(λ_i). Further, we let

    Q = { (λ, z, u) |  λ_i ≥ 0, ∀i ∈ N;  (z_i, u_i) ∈ R_i(λ_i), ∀i ∈ N;  Σ_{i=1}^n λ_i = 1 }.    (4–3)
The proof of Theorem 4.1 will be carried out in two steps: (i) Lemma 4.1, which
exploits the disjunctive structure of the convex hull of S implied by Assumption (A2) to
construct a higher-dimensional representation of conv(S), see set Q defined in (4–3); and
(ii) Lemma 4.2, which projects this higher-dimensional representation to the space of the
original variables, see set X defined in (4–2). We now carry out the first step of the proof
in Lemma 4.1. In particular, we use Assumptions (A2) and (A3) to identify a set Q whose
projection in the z space is included in cl conv(S) and includes conv(S). The subsequent
lemma will then project out the λ variables added in the definition of Q to derive X.
Lemma 4.1. For S as defined in Theorem 4.1, conv(S) ⊆ projz Q ⊆ cl conv(S).
Proof. We first show that if z ∈ conv(∪_{i=1}^n S_i), it can be extended to a point that belongs to Q by suitably defining (λ, u). By Assumption (A2), this proves the first inclusion. If z ∈ conv(∪_{i=1}^n S_i), then, by Assumption (A1), there exist λ_i and z′_i such that

    z = (z_1, . . . , z_i, . . . , z_n) = Σ_{i=1}^n λ_i L(i, z′_i),

where, for each i, λ_i ≥ 0, L(i, z′_i) ∈ conv(S_i), and Σ_{i=1}^n λ_i = 1. By Assumption (A3), the points L(i, z′_i) ∈ conv(S_i) can be extended to L(i, (z′_i, u′_i)) ∈ A_i. Let u = Σ_{i=1}^n λ_i L(i, u′_i) so that (z, u) = Σ_{i=1}^n λ_i L(i, (z′_i, u′_i)). We reindex the S_i so that the sets containing the points associated with non-zero multipliers are indexed from 1 to t. Then, λ_i > 0 for i = 1, . . . , t and Σ_{i=1}^t λ_i = 1. Observe that λ_i z′_i = z_i and λ_i u′_i = u_i. Since R_i(1) = proj_{(z_i,u_i)} A_i, it follows that (z′_i, u′_i) ∈ R_i(1) for each i ∈ {1, . . . , t}, and, therefore,

    t_i^{j_i}(z′_i, u′_i) ≥ 1                          ∀j_i ∈ J_i
    v_i^{k_i}(z′_i, u′_i) ≥ −1                         ∀k_i ∈ K_i
    t_i^{j_i}(z′_i, u′_i) + v_i^{k_i}(z′_i, u′_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    t_i^{j_i}(z′_i, u′_i) ≥ 0                          ∀j_i ∈ J_i
    w_i^{l_i}(z′_i, u′_i) ≥ 0                          ∀l_i ∈ L_i.
After substituting (z′_i, u′_i) = (z_i/λ_i, u_i/λ_i) for each i ∈ {1, . . . , t} and multiplying both sides of the inequalities by the positive value λ_i, we obtain:

    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ λ_i                                  ∀j_i ∈ J_i
    λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ −λ_i                                 ∀k_i ∈ K_i
    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) + λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ 0                                    ∀j_i ∈ J_i
    λ_i w_i^{l_i}(z_i/λ_i, u_i/λ_i) ≥ 0                                    ∀l_i ∈ L_i.
The above argument can be used to express the convex hull of any disjunctive collection of convex sets by introducing the λ variables; see Theorem 1 of Ceria and Soares [31]. However, because t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are positively-homogeneous by Assumption (A3) and λ_i > 0, the above system of inequalities can be rewritten as:

    t_i^{j_i}(z_i, u_i) ≥ λ_i                      ∀j_i ∈ J_i
    v_i^{k_i}(z_i, u_i) ≥ −λ_i                     ∀k_i ∈ K_i
    t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0  ∀j_i ∈ J_i, ∀k_i ∈ K_i
    t_i^{j_i}(z_i, u_i) ≥ 0                        ∀j_i ∈ J_i
    w_i^{l_i}(z_i, u_i) ≥ 0                        ∀l_i ∈ L_i,
which implies that (z_i, u_i) ∈ R_i(λ_i). Therefore, it follows that, for each i ∈ {1, . . . , t}, (λ_i, z_i, u_i) is such that λ_i > 0 and (z_i, u_i) ∈ R_i(λ_i). Additionally, we set (z_i, u_i) = 0 for t < i ≤ n. Since t_i^{j_i}(0, 0) = λ t_i^{j_i}(0/λ, 0/λ) for λ > 0, it follows that t_i^{j_i}(0, 0) = 0. Similarly, for all i, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, t_i^{j_i}(0, 0) = w_i^{l_i}(0, 0) = v_i^{k_i}(0, 0) = 0. It follows that (0, 0) ∈ R_i(0). In other words, for each i ∈ N, (λ_i, z_i, u_i) is such that λ_i ≥ 0 and (z_i, u_i) ∈ R_i(λ_i). Therefore, (λ, z, u) ∈ Q.

Now, we show that if (λ, z, u) ∈ Q then z ∈ cl conv(∪_{i=1}^n S_i). Again by Assumption (A2), this proves the second inclusion. Clearly, if (λ, z, u) ∈ Q and λ_i > 0, then by positive homogeneity of t_i^{j_i}, v_i^{k_i}, and w_i^{l_i}, it follows that (z_i/λ_i, u_i/λ_i) ∈ R_i(1). As before, then (1/λ_i) L(i, (z_i, u_i)) ∈ A_i. Assume without loss of generality, by reindexing the S_i if necessary, that λ_i > 0 for i = 1, . . . , t and λ_i = 0 for i = t + 1, . . . , n. Then, it follows easily that L({1, . . . , t}, (z_1, u_1, . . . , z_t, u_t)) ∈ conv(∪_{i=1}^n A_i) since it can be expressed as a convex combination of points in ∪_{i=1}^t A_i. Since proj_z conv(∪_{i=1}^n A_i) ⊆ conv(∪_{i=1}^n proj_z A_i) and, by Assumption (A3), proj_z A_i ⊆ cl conv(S_i), it follows that L({1, . . . , t}, (z_1, . . . , z_t)) ∈ conv(∪_{i=1}^n cl conv(S_i)) ⊆ cl conv(∪_{i=1}^n S_i). Now, since λ_{t+1} = 0, then by Assumption (A4), it follows that L(t + 1, z_{t+1}) ∈ 0⁺(cl conv(∪_{i=1}^n S_i)). Therefore, L({1, . . . , t + 1}, (z_1, . . . , z_{t+1})) ∈ cl conv(∪_{i=1}^n S_i). By induction, z ∈ cl conv(∪_{i=1}^n S_i).
Lemma 4.1 deals with disjunctive sets and is inspired by the work in disjunctive
programming. We next describe the differences in our approach, which, although subtle,
play a significant role in obtaining our results. First observe that a significant emphasis
in the disjunctive programming literature is on facial disjunctive programs, see §6 in
Balas [15], since mixed 0−1 programs can be expressed in this form. It should be noted
that the disjunctive problem defined in Theorem 4.1 is not necessarily facial. In fact, the
disjunctions Si may lie in the interior of the convex hull (see Example 4.1 and Figure
4-1(b)). Nevertheless, this first step resembles Theorem 3.3 in Balas [16] for linear
disjunctive sets or Theorem 1 in Ceria and Soares [31] for convex disjunctive sets. We
however emphasize that the first step also exploits Assumption (A3) in which we assume
that we know positively homogeneous inequality representations of the sets Si to produce
a simplified high-dimensional representation of the convex hull; see (4–3).
Now, we carry out the second step of the proof in Lemma 4.2. In particular, we
prove that the projection of Q onto the space of (z, u) variables is X, whose closed-form
expression was already provided in (4–2).
Lemma 4.2. X = proj_{z,u} Q.
Proof. The proof proceeds by induction. Given two disjoint subsets A and B of N, we consider

    W = { (λ_A, λ_B, λ_{A∪B}, z_A, u_A, z_B, u_B) |  λ_A ≥ 0, (z_A, u_A) ∈ R_A(λ_A)
                                                     λ_B ≥ 0, (z_B, u_B) ∈ R_B(λ_B)
                                                     λ_A + λ_B = λ_{A∪B} },

and

    P = { (λ_{A∪B}, z_{A∪B}, u_{A∪B}) | λ_{A∪B} ≥ 0, (z_{A∪B}, u_{A∪B}) ∈ R_{A∪B}(λ_{A∪B}) }.

We first show that if z_{A∪B} = (z_A, z_B) and u_{A∪B} = (u_A, u_B), then P is the set obtained when λ_A and λ_B are projected out from W. Note that since A and B are disjoint and z_{A∪B} ∈ R^{Σ_{i∈A} d_i + Σ_{i∈B} d_i} = R^{Σ_{i∈A} d_i} × R^{Σ_{i∈B} d_i}, the definitions of z_{A∪B} and, similarly, u_{A∪B} are dimensionally consistent. We first substitute λ_B = λ_{A∪B} − λ_A and then project λ_A out using Fourier-Motzkin elimination. Note that λ_A appears linearly in all the inequalities defining W. Therefore, we are able to use a procedure similar to Theorem 1.4 in [141]. We substitute λ_B = λ_{A∪B} − λ_A in W to obtain:

    λ_A ≥ 0
    (z_A, u_A) ∈ R_A(λ_A)
    λ_{A∪B} − λ_A ≥ 0
    (z_B, u_B) ∈ R_B(λ_{A∪B} − λ_A).
On the one hand, note that the inequalities

    t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0  ∀i ∈ A ∪ B, ∀j_i ∈ J_i, ∀k_i ∈ K_i    (4–4)
    t_i^{j_i}(z_i, u_i) ≥ 0                        ∀i ∈ A ∪ B, ∀j_i ∈ J_i                (4–5)
    w_i^{l_i}(z_i, u_i) ≥ 0                        ∀i ∈ A ∪ B, ∀l_i ∈ L_i                (4–6)

remain untouched during projection since they are independent of λ_A. On the other hand, the inequalities containing λ_A can be rewritten as:

    min{ Σ_{i∈A} t_i^{j_i}(z_i, u_i),  λ_{A∪B} + min_{B′⊆B} Σ_{i∈B′} v_i^{k_i}(z_i, u_i) }
        ≥ λ_A ≥
    max{ λ_{A∪B} − Σ_{i∈B} t_i^{j_i}(z_i, u_i),  − min_{A′⊆A} Σ_{i∈A′} v_i^{k_i}(z_i, u_i) }

so that Fourier-Motzkin elimination is simple to perform. Observe that the constraints λ_{A∪B} − λ_A ≥ 0 and λ_A ≥ 0 are represented in the above system respectively when A′ = ∅ and B′ = ∅. Projecting λ_A out of the system, we obtain:

    Σ_{i∈A∪B} t_i^{j_i}(z_i, u_i) ≥ λ_{A∪B}    ∀(j_i)_{i∈A∪B} ∈ ∏_{i∈A∪B} J_i    (4–7)
    Σ_{i∈A} t_i^{j_i}(z_i, u_i) + Σ_{i∈A′} v_i^{k_i}(z_i, u_i) ≥ 0    ∀A′ ⊆ A, ∀(j_i)_{i∈A} ∈ ∏_{i∈A} J_i, ∀(k_i)_{i∈A′} ∈ ∏_{i∈A′} K_i    (4–8)
    Σ_{i∈B} t_i^{j_i}(z_i, u_i) + Σ_{i∈B′} v_i^{k_i}(z_i, u_i) ≥ 0    ∀B′ ⊆ B, ∀(j_i)_{i∈B} ∈ ∏_{i∈B} J_i, ∀(k_i)_{i∈B′} ∈ ∏_{i∈B′} K_i    (4–9)
    Σ_{i∈A′∪B′} v_i^{k_i}(z_i, u_i) ≥ −λ_{A∪B}    ∀A′ ⊆ A, ∀B′ ⊆ B, ∀(k_i)_{i∈A′∪B′} ∈ ∏_{i∈A′∪B′} K_i.    (4–10)
Inequalities (4–4) for i ∈ A′ and (4–5) for i ∈ A \ A′ imply that (4–8) is redundant. Similarly, (4–9) can be shown to be redundant. Observe that λ_{A∪B} ≥ 0 is represented in (4–10) by selecting A′ = B′ = ∅. Therefore, the set obtained by projecting λ_A and λ_B out of W is given by (4–4), (4–5), (4–6), (4–7), and (4–10), which is exactly the definition of P. By applying this result sequentially with A = {1, . . . , i′} and B = {i′ + 1} and increasing i′ from 1 to n − 1, we obtain that proj_{z,u} Q = R_N(1) = X.
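The projection step above is ordinary Fourier-Motzkin elimination once λ_B has been substituted out. As a sanity check, the following sketch eliminates λ from the linear system W associated with a small two-block instance (S_1 = {(z_1, 0) | 1 ≤ z_1 ≤ 2}, S_2 = {(0, z_2) | z_2 ≥ 1}, with λ_{A∪B} fixed to 1); the encoding of inequalities as coefficient/right-hand-side pairs and the function name fm_eliminate are implementation choices, not part of the text.

```python
from fractions import Fraction
from itertools import product

def fm_eliminate(ineqs, k):
    """Eliminate x_k from linear inequalities a.x <= b (pairs (a, b)) by
    Fourier-Motzkin: keep inequalities with a[k] == 0 and combine every
    pair with opposite signs on x_k so that the variable cancels."""
    zero, pos, neg = [], [], []
    for a, b in ineqs:
        (zero if a[k] == 0 else pos if a[k] > 0 else neg).append((a, b))
    out = list(zero)
    for (ap, bp), (an, bn) in product(pos, neg):
        lp, ln = Fraction(1, ap[k]), Fraction(-1, an[k])  # positive scalings
        out.append(([lp * x + ln * y for x, y in zip(ap, an)],
                    lp * bp + ln * bn))
    return out

# W for the two-block instance, variables x = (lam, z1, z2), lam_{A u B} = 1:
half = Fraction(1, 2)
W = [([1, -1, 0], 0),     # lam <= z1          (t-constraint of block 1)
     ([-1, half, 0], 0),  # -z1/2 >= -lam      (v-constraint of block 1)
     ([-1, 0, -1], -1),   # z2 >= 1 - lam      (t-constraint of block 2)
     ([-1, 0, 0], 0),     # lam >= 0
     ([1, 0, 0], 1),      # lam <= 1
     ([0, -1, 0], 0),     # z1 >= 0
     ([0, 0, -1], 0)]     # z2 >= 0
proj = fm_eliminate(W, 0)
for a, b in proj:
    print([str(c) for c in a], str(b))
```

Among redundant rows, the projection contains [0, −1, −1]·x ≤ −1 (i.e., z_1 + z_2 ≥ 1) and [0, 1/2, 0]·x ≤ 1 (i.e., z_1 ≤ 2), exactly the non-trivial inequalities the theorem produces for this instance.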
Lemma 4.2 projects a higher-dimensional representation of the convex hull to the
space of the original variables. For linear systems, such a projection can be obtained
algorithmically using the wrapping procedure, see Fukuda et al. [55]; the Fourier-Motzkin
procedure, see Ziegler [141]; or the extreme-ray characterization of the projection cone, see
Sections 1, 2 and 5 of Balas [18] for a discussion of projection in the context of disjunctive
programming. However, the projected set is rarely described in closed-form. Using
Assumption (A1) in which we assume that the sets we convexify are orthogonal, we show
that the projection can be obtained in closed-form, despite the fact that S is nonlinear; see
(4–2).
The proof of Theorem 4.1 is now straightforward.
Proof of Theorem 4.1. It suffices to show that conv(S) ⊆ projz Q = projz X ⊆ cl conv(S)
where the inclusions follow from Lemma 4.1 and the equality follows from Lemma 4.2.
The proof exposes some differences between Theorem 4.1 and the results of Balas [15].
Although it is clear in Balas [15], Balas et al. [20], and Balas [18] that valid inequalities
for the convex hull of the disjunctive union of polyhedral sets can be obtained by
projecting down its high-dimensional representation onto the initial space of variables, this
projection is usually not performed explicitly. Instead, with orthogonal disjunctions and
positively-homogeneous functions, we show in Lemma 4.2 and the proof of Theorem 4.1
that Fourier-Motzkin elimination can be used to obtain a closed-form expression of the
convex hull in the space of the original variables. Further, earlier studies recommend
solving a cut generation linear program to generate valid inequalities for separating
solutions that do not belong to the convex hull of the disjunctive union. In contrast, it is
straightforward to find an inequality that separates X from a point that does not belong
to X.
Next, we illustrate the use of Theorem 4.1 in deriving the convex hulls of several simple orthogonal disjunctive sets. In particular, we describe a situation where there exists an i′ ∈ N such that J_{i′} = ∅. Then, it follows by Assumption (A3) that 0 ∈ cl conv(S). In other words, X cannot include any inequality of the form Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1. Indeed, since J_{i′} = ∅, it follows that ∏_{i=1}^n J_i = ∅.

Example 4.1. Consider first an instance where, for each i, J_i ≠ ∅. In particular, consider S ⊆ R²_+, defined as S = S_1 ∪ S_2, where S_1 = {(z_1, 0) | 1 ≤ z_1 ≤ 2} and S_2 = {(0, z_2) | z_2 ≥ 1}. It can be easily verified that

    conv(S) = { (z_1, z_2) | z_1 + z_2 ≥ 1, z_1 ≥ 0, z_1 < 2, z_2 ≥ 0 } ∪ {(2, 0)}.
We now apply the convexification tool of Theorem 4.1 to S and derive a set X that contains conv(S) but is no larger than cl conv(S). First, we verify that the set S satisfies the assumptions of Theorem 4.1. Clearly, Assumptions (A1) and (A2) hold by the definition of S. Next, it is easy to verify that conv(S_1) = { (z_1, 0) | z_1 ≥ 1, −(1/2)z_1 ≥ −1 } and conv(S_2) = { (0, z_2) | z_2 ≥ 1 }. Since z_1, −(1/2)z_1, and z_2 are linear, and therefore positively-homogeneous, Assumption (A3) clearly holds. Finally, since C_1 = {(0, 0)} ⊆ 0⁺(cl conv(S)) and C_2 = { (0, z_2) | z_2 ≥ 0 } ⊆ 0⁺(cl conv(S_2)) ⊆ 0⁺(cl conv(S)), Assumption (A4) also holds. Applying Theorem 4.1, we obtain that

    X = { (z_1, z_2) | z_1 + z_2 ≥ 1, z_1 ≤ 2, z_1 ≥ 0, z_2 ≥ 0 }.
In fact, it is apparent for this example that X = cl conv(S); see Figure 4-1(a).

Consider now instances where J_2 = ∅. In particular, let the set S′ ⊆ R² be defined as S′ = S′_1 ∪ S′_2, where S′_1 = {(z_1, 0) | z_1 ≥ 1} and S′_2 = {(0, z_2) | z_2 ≥ −1}. Then, it follows easily that conv(S′) = { (z_1, z_2) | z_1 ≥ 0, z_2 > −1 } ∪ {(0, −1)}; see Figure 4-1(b). Theorem 4.1 yields X′ = { (z_1, z_2) | z_1 ≥ 0, z_2 ≥ −1 }, which is cl conv(S′). Similarly, now consider S′′ = S′′_1 ∪ S′′_2, where S′′_1 = {(z_1, 0) | z_1 ≥ −1} and S′′_2 = {(0, z_2) | z_2 ≥ −1}. Then,

    conv(S′′) = { (z_1, z_2) | z_1 + z_2 ≥ −1, z_1 > −1, z_2 > −1 } ∪ {(−1, 0)} ∪ {(0, −1)},
see Figure 4-1(c). In this case, Theorem 4.1 yields

    X′′ = { (z_1, z_2) | z_1 ≥ −1, z_2 ≥ −1, z_1 + z_2 ≥ −1 },

which is cl conv(S′′).
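A quick numerical check (a sketch; the sampling ranges and the helper name in_Xpp are arbitrary choices) that X′′ contains every convex combination of points from S′′_1 and S′′_2, and that the inequality z_1 + z_2 ≥ −1 cuts off points that the variable bounds alone would allow:

```python
import random

def in_Xpp(z1, z2, tol=1e-9):
    """Membership in X'' = {z1 >= -1, z2 >= -1, z1 + z2 >= -1}."""
    return z1 >= -1 - tol and z2 >= -1 - tol and z1 + z2 >= -1 - tol

random.seed(1)
for _ in range(1000):
    z1 = random.uniform(-1, 100)   # (z1, 0) is a point of S''_1
    z2 = random.uniform(-1, 100)   # (0, z2) is a point of S''_2
    lam = random.random()
    # every convex combination lam*(z1,0) + (1-lam)*(0,z2) lies in X''
    assert in_Xpp(lam * z1, (1 - lam) * z2)

# the inequality z1 + z2 >= -1 cuts off points allowed by the bounds alone:
print(in_Xpp(-0.6, -0.6))  # False
```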
Figure 4-1. Illustration of Theorem 4.1 with (a) J_1 ≠ ∅, J_2 ≠ ∅, (b) J_2 = ∅, (c) J_1 = J_2 = ∅. [Each panel plots z_2 against z_1 and shows S_1, S_2, and the corresponding convex hull.]
Example 4.1 shows different instances where conv(S) ⊊ proj_z X. In Example 4.2, we illustrate that, in some cases, proj_z X might be strictly contained in cl conv(S). Together, these examples show that proj_z X can be different from conv(S) and cl conv(S) and, in that sense, the result of Theorem 4.1 is as tight as possible.
Example 4.2. Consider the set S = ∪_{i=1}^n S_i, where

    S_i = proj_z { L(i, (z_i, u_i)) ∈ R^{2n}_+ | √(z_i u_i) ≥ 1 } = { L(i, z_i) | z_i > 0 }.

Clearly, Assumptions (A1) and (A2) hold by the definition of S. Since √(z_i u_i) is positively-homogeneous, Assumption (A3) is also satisfied. Here,

    proj_z C_i = proj_z { L(i, (z_i, u_i)) ∈ R^{2n}_+ | √(z_i u_i) ≥ 0 } = { L(i, z_i) | z_i ≥ 0 } ⊆ 0⁺(cl conv(S)).
Therefore, Assumption (A4) holds. Applying Theorem 4.1, we obtain that

    X = { (z, u) ∈ R^{2n}_+ | Σ_{i=1}^n √(z_i u_i) ≥ 1 }.

If, for any i, z_i > 0, then there exists u such that (z, u) ∈ X. Further, for all u, it is easy to see that (0, u) ∉ X. Therefore, proj_z X = { z ∈ R^n_+ | Σ_{i=1}^n z_i > 0 }. This example illustrates that if proj_z A_i is not closed then proj_z X may not be closed either and that, in some cases, proj_z X ⊊ cl conv(S).
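The non-closedness claim can be checked mechanically. In the sketch below (the helper name certify is illustrative), a point z with some z_i > 0 is certified to lie in proj_z X by choosing u_i = 1/z_i, while z = 0 admits no certificate:

```python
import math

def certify(z):
    """Return a u with (z, u) in X = {sum_i sqrt(z_i u_i) >= 1}, or None
    if z = 0: picking u_i = 1/z_i for any z_i > 0 gives sqrt(z_i u_i) = 1."""
    for i, zi in enumerate(z):
        if zi > 0:
            u = [0.0] * len(z)
            u[i] = 1.0 / zi
            return u
    return None

z = (0.25, 0.0)
u = certify(z)
print(u, sum(math.sqrt(a * b) for a, b in zip(z, u)) >= 1)  # [4.0, 0.0] True
print(certify((0.0, 0.0)))  # None: z = 0 is not in proj_z X
```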
In the above example, we exploit the fact that the sets proj_z A_i are not closed to show that proj_z X may not be closed either. Instead, if, for all i, the sets proj_z A_i are closed then, as shown in the following corollary, proj_z X is closed as long as recession directions are well-behaved.

Corollary 4.1. If, in addition to the assumptions of Theorem 4.1, proj_z A_i is a closed set and proj_z C_i = 0⁺(cl conv(S_i)) for all i ∈ N, then proj_z X = cl conv(S).
Proof. Since the sets S_i are orthogonal, there do not exist vectors ψ_i = L(i, z_i) ∈ proj_z C_i, not all zero, such that Σ_{i=1}^n ψ_i = 0. Define T_i(λ_i) = λ_i cl conv(S_i) for λ_i > 0 and T_i(0) = 0⁺(cl conv(S_i)). Then, by Theorem 9.8 in [102], it follows that the union, over all λ ≥ 0 with Σ_{i=1}^n λ_i = 1, of { z | z_i ∈ T_i(λ_i) }, denoted hereafter as T, equals cl conv(S). If z ∈ T, then there exists a λ such that z_i ∈ T_i(λ_i). If λ_i > 0, then z_i/λ_i ∈ cl conv(S_i), and therefore, there exists a u_i such that (z_i/λ_i, u_i/λ_i) ∈ A_i. On the other hand, if λ_i = 0, there exists a u_i such that (z_i, u_i) ∈ C_i. Since A_i and C_i (restricted to the space of the z_i and u_i variables) are R_i(1) and R_i(0), respectively, it follows that (λ, z, u) ∈ Q and so z ∈ proj_z X and cl conv(S) ⊆ proj_z X. However, we have already shown in Theorem 4.1 that proj_z X ⊆ cl conv(S) and, therefore, proj_z X equals cl conv(S).
In Corollary 4.1, orthogonality plays a key role in identifying the closed convex hull. In fact, orthogonality implies that adding the recession directions of cl conv(S_i) to cl conv(S_j) for j ≠ i does not yield sets that are not closed; see Corollary 9.1.1 in Rockafellar [102]. This fact is exploited in the proof of the corollary. In the absence of orthogonality or polyhedrality, closedness can only be established under various technical conditions; see Ceria and Soares [31].
The definition of X as in (4–2) provides a simple and complete description of cl conv(S) in many practical situations. However, in certain cases, some of the inequalities in (4–2) may be redundant. To illustrate this observation, we consider a situation where the sets A′_i = proj_{(z_i,u_i)} A_i are completely described by a finite number of linear inequalities. We then show that when Theorem 4.1 is used to derive inequalities for X using facet-defining inequalities for the sets A′_i, the resulting inequalities are not always facet-defining for X. More precisely, let z_i ∈ R^{d_i} and u_i ∈ R^{d′_i}. Assume that the A′_i are full-dimensional sets in R^{d_i+d′_i}. If, for each i and j_i (resp. k_i), the inequalities t_i^{j_i}(z_i, u_i) ≥ 1 (resp. v_i^{k_i}(z_i, u_i) ≥ −1) are facet-defining for A′_i, then Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1 (resp. Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −1 with I = N) is facet-defining for X. Similarly, if, for some i and l_i, w_i^{l_i}(z_i, u_i) ≥ 0 is facet-defining for A′_i, then it is also facet-defining for X. However, the inequalities Σ_{i∈I} v_i^{k_i}(z_i, u_i) ≥ −1 for I ⊊ N, t_i^{j_i}(z_i, u_i) + v_i^{k_i}(z_i, u_i) ≥ 0, and t_i^{j_i}(z_i, u_i) ≥ 0 are not necessarily facet-defining. For example, consider

    S_1 = { (x, 0, 0) | x ≥ 0, −(1/2)x ≥ −1 }

and

    S_2 = { (0, y, z) | y + z ≥ 1, −(1/2)y ≥ −1, −(1/2)z ≥ −1, y ≥ 0, z ≥ 0 }.

Then, the inequalities y + z − (1/2)y ≥ 0 and y + z ≥ 0 are not facet-defining since they are implied by y ≥ 0 and z ≥ 0. Similarly, the inequality −(1/2)y ≥ −1 is not facet-defining since it is implied by −(1/2)x − (1/2)y ≥ −1 and x ≥ 0.
We now discuss each of the assumptions of Theorem 4.1. We first turn our attention to Assumption (A1). This assumption requires that the sets S_i belong to linear subspaces that are orthogonal to each other. A weaker assumption however suffices to prove the theorem. Consider L_i, for i ∈ {1, . . . , n}, to be linear subspaces of R^{Σ_{i=1}^n d_i}, where L_i has dimension d_i. Further, assume that a vector z_i ∈ L_i cannot be expressed as a linear combination of vectors in {L_1, . . . , L_{i−1}, L_{i+1}, . . . , L_n}. In this case, it is possible to construct a matrix B whose columns form a basis for R^{Σ_{i=1}^n d_i}, where the columns indexed from 1 + Σ_{i=1}^{j−1} d_i to Σ_{i=1}^j d_i form a basis for L_j. Then, define new variables s such that s = B^{−1}z. If z ∈ S_j ⊆ L_j, it follows that s_k ≠ 0 only if 1 + Σ_{i=1}^{j−1} d_i ≤ k ≤ Σ_{i=1}^j d_i. Therefore, Theorem 4.1 now applies in the transformed space of s variables. This observation leads to the following simple derivation of the split cut in mixed-integer programming.
Example 4.3. Consider a polyhedral cone P = {x | Ax ≤ b}, where A ∈ R^{n×n} is an invertible matrix. Let X be the set of points that satisfy the disjunction

    π^T x ≤ π_0^1  ∨  π^T x ≥ π_0^2,

where π_0^1 < π_0^2. We are interested in deriving the convex hull of P ∩ X. Observe that this setting can be used to derive all split cuts; see Balas [12]. Introducing the slack variables μ and defining γ = π^T A^{−1}, γ_0^1 = γb − π_0^2, and γ_0^2 = γb − π_0^1, we reduce the above problem into one involving the convexification of

    M = { μ | μ ≥ 0, γμ ≤ γ_0^1 ∨ γμ ≥ γ_0^2 }.

We assume without loss of generality that, for each i, γ_i ≠ 0. The reformulation of the problem in the space of the slack variables, after suitable translation, is an example of the orthogonalization discussed above. Here, μ corresponds to −s and x corresponds to z. The matrix B equals A^{−1} and its columns are the extreme rays of P. If μ = 0 is feasible to M, then conv(M) = {μ | μ ≥ 0} since μ ≥ 0 is the recession cone for M whenever M contains a feasible point. Instead, if μ = 0 is not feasible to M, then γ_0^1 < 0 and γ_0^2 > 0. Define p_i = γ_0^1/γ_i and q_i = γ_0^2/γ_i. It follows that, for each i, exactly one of p_i or q_i is greater than 0. Since μ_i ≥ 0 is a recession direction for conv(M) and the extreme points of M have at most one non-zero component, it follows that:

    conv(M) = conv( ∪_{i=1}^n { L(i, μ_i) | μ_i ≥ max{p_i, q_i} } ).

Now, applying Theorem 4.1, it follows that:

    conv(M) = { μ | Σ_{i=1}^n μ_i / max{p_i, q_i} ≥ 1, μ ≥ 0 }.

Substituting back μ, p_i, and q_i in the above, we obtain:

    conv(M) = { x | Σ_{i=1}^n (b − Ax)_i / max{ (π^T A^{−1} b − π_0^2)/(π^T A^{−1}_{·i}), (π^T A^{−1} b − π_0^1)/(π^T A^{−1}_{·i}) } ≥ 1, Ax ≤ b }.
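The closed-form split cut above can be evaluated numerically. The sketch below builds an assumed 2×2 instance (A the identity, b = 0, split x_1 + x_2 ≤ −1 ∨ x_1 + x_2 ≥ 1, chosen so that μ = 0 is infeasible to M) and checks the resulting cut Σ_i μ_i / max{p_i, q_i} ≥ 1; the instance data and the helper name split_cut_satisfied are illustrative choices.

```python
# Assumed 2x2 instance: A = I, b = 0, so P = {x <= 0} is a cone with apex 0,
# mu = b - Ax = -x, and gamma = pi^T A^{-1} = pi.  The split pi^T x <= pi01
# or pi^T x >= pi02 uses pi01 < 0 < pi02 so that mu = 0 is infeasible to M.
pi = [1.0, 1.0]
pi01, pi02 = -1.0, 1.0             # split: x1 + x2 <= -1  or  x1 + x2 >= 1

gamma = pi[:]                      # since A is the identity
gb = 0.0                           # gamma . b with b = 0
g01, g02 = gb - pi02, gb - pi01    # g01 = -1 < 0 < 1 = g02
coef = [max(g01 / gi, g02 / gi) for gi in gamma]   # max{p_i, q_i}, here (1, 1)

def split_cut_satisfied(x, tol=1e-9):
    """Check Ax <= b and the split cut sum_i mu_i / max{p_i, q_i} >= 1."""
    mu = [-xi for xi in x]         # slacks mu = b - Ax
    return (all(m >= -tol for m in mu)
            and sum(m / c for m, c in zip(mu, coef)) >= 1 - tol)

print(split_cut_satisfied([-1.0, 0.0]))     # True: lies on pi^T x = -1
print(split_cut_satisfied([-0.25, -0.25]))  # False: cut off by the split cut
```

Here the cut reduces to −x_1 − x_2 ≥ 1, i.e., x_1 + x_2 ≤ −1, which is exactly the convex hull of the cone intersected with the disjunction in this instance.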
We next discuss Assumption (A3). This assumption requires that the convex hulls of the sets S_i be known, possibly in a higher-dimensional space, and that the functions t_i^{j_i}, for all j_i ∈ J_i, v_i^{k_i}, for all k_i ∈ K_i, and w_i^{l_i}, for all l_i ∈ L_i, used in the description of the convex hulls be positively-homogeneous. In the ensuing example, we show that a simple transformation might suffice to convert the natural inequality description of conv(S_i) into one that uses positively-homogeneous functions. We also demonstrate that if the defining functions are not positively-homogeneous, then (4–2) does not necessarily contain conv(S).

Example 4.4. Let S = ∪_{i=1}^n S_i, where S_i = { L(i, x_i, y_i) ∈ R^{2n}_+ | x_i y_i ≥ r } and r > 0. Clearly, Assumptions (A1) and (A2) hold by the definition of S. Since S_i is already closed and convex, cl conv(S_i) = S_i, i.e.,

    cl conv(S_i) = { L(i, x_i, y_i) ∈ R^{2n}_+ | (1/r) x_i y_i ≥ 1 }.

The above representation of cl conv(S_i) does not directly satisfy Assumption (A3) since (1/r) x_i y_i is not a positively-homogeneous function of (x_i, y_i). However, cl conv(S_i) may be rewritten as

    cl conv(S_i) = { L(i, x_i, y_i) ∈ R^{2n}_+ | √((1/r) x_i y_i) ≥ 1 },

an expression that uses the function √((1/r) x_i y_i), which is positively-homogeneous in (x_i, y_i). With this representation, Assumption (A3) is satisfied. Since

    C_i = { L(i, x_i, y_i) ∈ R^{2n}_+ | √(x_i y_i) ≥ 0 } = 0⁺(cl conv(S_i)),

Assumption (A4) is satisfied. Therefore, Theorem 4.1 implies that

    X = cl conv(S) = { (x, y) ∈ R^{2n}_+ | Σ_{i=1}^n √(x_i y_i) ≥ √r }.

Observe finally that the transformation to positively-homogeneous functions is necessary and not an artifact of the proof technique. In fact, if we use the original definition of cl conv(S_i) when applying Theorem 4.1 and disregard the lack of positive homogeneity, the resulting set would be X′ = { (x, y) ∈ R^{2n}_+ | Σ_{i=1}^n x_i y_i ≥ r }. The set X′ is non-convex and does not even contain conv(S). To see this, let r = 1 and n = 2. Note that (x_1, y_1, x_2, y_2) = (0.5, 0.5, 0.5, 0.5) is expressible as a convex combination with equal weights of (1, 1, 0, 0) ∈ S_1 and (0, 0, 1, 1) ∈ S_2. Therefore, (0.5, 0.5, 0.5, 0.5) belongs to conv(S). However, it does not satisfy the defining inequality of X′ whereas it does satisfy the defining inequality of X.
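The final claim of Example 4.4 is easy to verify numerically; the sketch below simply evaluates the two defining inequalities at the midpoint for r = 1 and n = 2 (the variable names are illustrative):

```python
import math

# Midpoint of (1,1,0,0) in S1 and (0,0,1,1) in S2, with r = 1 and n = 2
x1, y1, x2, y2 = 0.5, 0.5, 0.5, 0.5
in_X = math.sqrt(x1 * y1) + math.sqrt(x2 * y2) >= 1   # homogeneous description
in_Xprime = x1 * y1 + x2 * y2 >= 1                    # non-homogeneous description
print(in_X, in_Xprime)  # True False
```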
If λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≤ t_i^{j_i}(z_i, u_i) for all λ_i ∈ (0, 1], then X still outer-approximates cl conv(S). An intuitive explanation for this result is that, when performing Fourier-Motzkin elimination, λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≤ t_i^{j_i}(z_i, u_i) ensures that X contains the closure convex hull of the disjunctive union of the S_i, whereas λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ t_i^{j_i}(z_i, u_i) guarantees that X is contained in cl conv(∪_{i=1}^n S_i). Similar statements can be made about v_i^{k_i}(z_i, u_i) and w_i^{l_i}(z_i, u_i). The latter of these conditions will be explored further in Proposition 4.8 to derive sufficient conditions that help verify a relaxed version of Assumption (A2).
We now turn our attention to Assumption (A4). At first glance, this assumption might appear technical and difficult to verify in practice. However, this is not the case. We show next that by simply requiring that the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave, in addition to being positively-homogeneous, Assumption (A4) is automatically satisfied.

Proposition 4.1. If, for all i, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i}, as defined in Theorem 4.1, are concave in addition to being positively-homogeneous, and the sets S_i are not empty, then proj_z C_i ⊆ 0⁺(cl conv(∪_{i=1}^n S_i)), i.e., Assumption (A4) is satisfied.
Proof. Let L(i, z_i) ∈ S_i. By Assumption (A3), there exists u_i such that L(i, (z_i, u_i)) ∈ A_i. Consider L(i, (z′_i, u′_i)) ∈ C_i and α > 0. Then, by positive homogeneity and concavity of t_i^{j_i}, it follows that

    t_i^{j_i}(z_i + αz′_i, u_i + αu′_i) ≥ t_i^{j_i}(z_i, u_i) + t_i^{j_i}(αz′_i, αu′_i) = t_i^{j_i}(z_i, u_i) + α t_i^{j_i}(z′_i, u′_i) ≥ t_i^{j_i}(z_i, u_i) ≥ 1.

The first inequality holds because of Theorem 4.7 in [102], the equality because the t_i^{j_i} are positively-homogeneous, the second inequality because L(i, (z′_i, u′_i)) ∈ C_i and α > 0, and the last inequality because L(i, (z_i, u_i)) ∈ A_i. Similarly, v_i^{k_i}(z_i + αz′_i, u_i + αu′_i) ≥ −1 and w_i^{l_i}(z_i + αz′_i, u_i + αu′_i) ≥ 0. Therefore, (z_i + αz′_i, u_i + αu′_i) ∈ A_i and so, for all α > 0, L(i, z_i + αz′_i) ∈ cl conv(S_i) ⊆ cl conv(∪_{i=1}^n S_i). Since L(i, z_i) ∈ cl conv(∪_{i=1}^n S_i), it follows by Theorem 8.3 in [102] that (0, z′_i, 0) ∈ 0⁺(cl conv(∪_{i=1}^n S_i)).
The assumption that S_i is not empty plays an important role in Proposition 4.1. Consider for example

    S_1 = { (z_1, z_2, 0) | z_1 − z_2 ≥ 1, −z_1 + z_2 ≥ 1, z_1 ≥ 0, z_2 ≥ 0 }

and

    S_2 = { (0, 0, z_3) | z_3 ≥ 1 }.

Then, S_1 is empty but C_1 = { (z_1, z_2, 0) | z_1 = z_2, z_1 ≥ 0, z_2 ≥ 0 } ≠ ∅ = 0⁺(cl conv(S_1)). Clearly, in this case, Theorem 4.1 does not apply since Assumption (A4) does not hold. Here,

    X = { (z_1, z_2, z_3) | z_1 − z_2 + z_3 ≥ 1, −z_1 + z_2 + z_3 ≥ 1, z_1 = z_2, z_1 ≥ 0, z_2 ≥ 0, z_3 ≥ 0 },

which contains the ray (a, a, 1), where a > 0, whereas (a, a, 1) ∉ S_2 = cl conv(S_1 ∪ S_2).
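A direct check (a sketch; the helper names are illustrative) that every point of the ray (a, a, 1), a > 0, satisfies all constraints defining X while lying outside S_2 = cl conv(S_1 ∪ S_2):

```python
def in_X(z1, z2, z3):
    """Constraints of X from the empty-S1 example above."""
    return (z1 - z2 + z3 >= 1 and -z1 + z2 + z3 >= 1
            and z1 == z2 and z1 >= 0 and z2 >= 0 and z3 >= 0)

def in_S2(z1, z2, z3):
    return z1 == 0 and z2 == 0 and z3 >= 1

for a in (1.0, 10.0, 100.0):
    print(in_X(a, a, 1.0), in_S2(a, a, 1.0))  # True False for every a > 0
```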
We now show that concavity of t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} is not a severe restriction since the convexity of a positively-homogeneous function's upper-level set implies concavity over the region of interest.

Proposition 4.2. If the upper-level set of a positively-homogeneous function is convex, then the function is concave wherever it is positive. More precisely, if W = { (z, u) | t(z, u) ≥ 1 } is convex and t(z, u) is positively-homogeneous, then D = { (z, u) | t(z, u) > 0 } is convex and t(z, u) is concave over D. If, in addition, cl(D) is locally simplicial or, more specifically, polyhedral, and t(z, u) is continuous, then t(z, u) is concave over cl(D).

Proof. If W is convex, then

    W_K = { (λ, x) | λ > 0, x = λ(z, u), t(z, u) ≥ 1 }

is the smallest convex cone containing { (1, x) | x ∈ W }. Exploiting the positive homogeneity of t, we may rewrite W_K as:

    W_K = { (λ, x) | λ > 0, t(x) ≥ λ }.

Now, D is the projection of W_K in the space of x and is therefore convex. Further, the hypograph of t(z, u) over D is { (r, x) | r ≤ t(x), x ∈ D } = { (r, x) | r ≤ λ ≤ t(x), λ > 0 }, which is convex if W_K is convex. The last statement of the proposition follows from Theorems 10.3 and 20.5 in [102].
Even when some of the technical assumptions of Theorem 4.1 are not satisfied, it is often the case that X yields an outer-approximation of conv(S). To see this, observe that Proposition 4.2 shows that the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave if they are positively-homogeneous, as is assumed in Theorem 4.1, and their upper-level sets are convex. However, if concavity of these functions is known, then the outer-approximation of conv(S) by proj_z X can be shown under relatively mild assumptions.

Proposition 4.3. Let S ⊆ R^{Σ_{i=1}^n d_i} and, for all i ∈ N, let S_i ⊆ S. Also, suppose that Assumption (A1) of Theorem 4.1 holds. Further, assume that proj_z A_i, where A_i is as defined in (4–1), yields an outer-approximation of conv(S_i) and that, for all i ∈ N, j_i ∈ J_i, k_i ∈ K_i, and l_i ∈ L_i, t_i^{j_i}(0, 0), v_i^{k_i}(0, 0), and w_i^{l_i}(0, 0) are non-negative. Then, proj_z X, where X is as defined in (4–2), outer-approximates ∪_{i=1}^n S_i. If, in addition, Assumption (A2) of Theorem 4.1 holds and X is convex (for example, if the functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave), then proj_z X ⊇ conv(S).

Proof. If Assumption (A1) is satisfied, then the sets S_i, for i ∈ N, are orthogonal. It can be easily verified that, if t_i^{j_i}(0, 0), v_i^{k_i}(0, 0), and w_i^{l_i}(0, 0) are non-negative, then every constraint defining X is valid for each S_i, where i ∈ N. Therefore, proj_z X ⊇ ∪_{i=1}^n S_i. If (A2) is satisfied, conv(S) = conv(∪_{i=1}^n S_i). Further, if X is convex, so is proj_z X. Since proj_z X ⊇ ∪_{i=1}^n S_i, it follows that proj_z X ⊇ conv(∪_{i=1}^n S_i) = conv(S).
When the constituent functions t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are concave, the result of Proposition 4.3 could also be derived using disjunctive programming. We verify Proposition 4.3 using this approach, since it more clearly reveals the source of the difference between the outer-approximation of Proposition 4.3 and the convex hull identified in Theorem 4.1. For example, one can assert that Σ_{i∈N} t_i^{j_i}(z_i, u_i) ≥ 1 by simply noticing that if λ_i > 0 for i ∈ {1, . . . , t} then:

    1 = Σ_{i=1}^t λ_i
      ≤ Σ_{i=1}^t λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) + Σ_{i=t+1}^n t_i^{j_i}(z_i, u_i)
      ≤ Σ_{i=1}^t λ_i ( t_i^{j_i}(z_i/λ_i, u_i/λ_i) + Σ_{i′∈N, i′≠i} t_{i′}^{j_{i′}}(0/λ_i, 0/λ_i) ) + Σ_{i=t+1}^n t_i^{j_i}(z_i, u_i)
      ≤ Σ_{i=1}^n t_i^{j_i}(z_i, u_i),    (4–11)

where the first inequality follows by summing the inequalities λ_i ≤ λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) for i ∈ {1, . . . , t} and t_i^{j_i}(z_i, u_i) ≥ 0 for i ∈ {t + 1, . . . , n}, the second inequality follows since t_{i′}^{j_{i′}}(0, 0) ≥ 0, and the third inequality follows from the concavity of Σ_{i=1}^t t_i^{j_i}(z_i, u_i). Similarly, it can be shown that Σ_{i=1}^t v_i^{k_i}(z_i, u_i) ≥ −1 since −Σ_{i=1}^t λ_i ≥ −1.
Proposition 4.3 provides a simple proof of the validity of the constraints defining X
for conv(S). The proof of Proposition 4.3 is similar to that of Theorem 3.1 and Remark
3.1.1 in Balas [15], although it is applied here to nonlinear inequalities. The main idea in
either case is that one can establish validity of a cut by establishing its validity for each
of the disjunctions. In fact, if the primary purpose of deriving X is to develop a convex
outer-approximation, then Proposition 4.3 can often replace Theorem 4.1. For example,
the convex hull description for the bilinear covering sets (derived in Proposition 4.9) can
be shown to yield a convex outer-approximation, if Proposition 4.3 is invoked instead of
Theorem 4.1 in the proof of the result. Nevertheless, the insights gained from Theorem 4.1
are very useful. For example, we illustrate next that the search for a representation of
conv(Si) using positively-homogeneous functions can substantially improve the relaxation.
This insight will play an important role in deriving strong relaxations for the bilinear
covering set.
Example 4.5. Consider S =⋃n
i=1 Si, where, for each i ∈ {1, . . . , n}, let Si ={L(i, zi) ∈ Rn
+
∣∣ √zi ≥ 1}. Proposition 4.3 shows that
X ′ =
{(z1, . . . , zn) ∈ Rn
+
∣∣∣∣∣n∑
i=1
√zi ≥ 1
}
is a convex outer-approximation of conv(S). Note that the square-root function used in
expressing Si is concave, but not positively-homogeneous. Instead, if Sis are represented
equivalently as
Si ={L(i, zi) ∈ Rn
+
∣∣ zi ≥ 1},
79
then Theorem 4.1 yields the convex hull of S, which is
X =
{(z1, . . . , zn) ∈ Rn
+
∣∣∣∣∣n∑
i=1
zi ≥ 1
}.
Clearly, by construction, X = conv(S) ⊆ X ′. Further, it can be seen that X ( X ′ when
n > 1 as the point(
1n2 , . . . ,
1n2
)belongs to X ′ but not to X. In this particular example,
the inclusion of X in X ′ can also be verified using the subadditivity of the square-root
function for non-negative variables. This example illustrates that it often helps to find
representations of convex hulls of Si using positively-homogeneous functions, even when
equivalent representations exist using concave functions.
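The gap between X and X′ can be spot-checked numerically. The following sketch (with n = 3 as an arbitrary choice, and a small tolerance for floating-point comparisons) confirms that the point (1/n², . . . , 1/n²) satisfies the square-root cut but violates the convex-hull cut:

```python
import math

def in_X_prime(z):
    # Outer-approximation from the concave square-root representation.
    return sum(math.sqrt(zi) for zi in z) >= 1 - 1e-9

def in_X(z):
    # Convex hull from the positively-homogeneous representation.
    return sum(z) >= 1 - 1e-9

n = 3
p = [1.0 / n**2] * n           # the point (1/n^2, ..., 1/n^2)
print(in_X_prime(p), in_X(p))  # True False
```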
As discussed in Example 4.5, if one can find a description of conv(Si) that uses
positively-homogeneous functions then one can apply Theorem 4.1 to identify the convex
hull of the orthogonal disjunctions, thus deriving a superior relaxation. Although natural
formulations of convex hulls for the orthogonal disjunctions might not use positively
homogeneous functions, the associated functions can often be transformed to satisfy this
property. Consider, for example, the case where an inequality describing the convex hull of
a restriction uses a function that is positively-homogeneous of pth order, i.e., an inequality
of the form t_i^{j_i}(z_i, u_i) ≥ 1 where, for any λ_i > 0, t_i^{j_i}(λ_i z_i, λ_i u_i) = λ_i^p t_i^{j_i}(z_i, u_i). Such an inequality can be rewritten as sign(t_i^{j_i}(z_i, u_i)) |t_i^{j_i}(z_i, u_i)|^{1/p} ≥ 1, where sign(x) is 1 if x > 0,
is 0 if x = 0, and is −1 otherwise. Note that, in the modified form, the inequality only
uses a positively-homogeneous function and can therefore be used in the construction of
the convex hull.
More generally, a positively-homogeneous description can be obtained by adding one homogenizing variable for each orthogonal disjunction and expressing A_i using the inequalities

λ_i t_i^{j_i}(z_i/λ_i, u_i/λ_i) ≥ 1, ∀j_i ∈ J_i,
λ_i v_i^{k_i}(z_i/λ_i, u_i/λ_i) ≥ −1, ∀k_i ∈ K_i,
λ_i w_i^{l_i}(z_i/λ_i, u_i/λ_i) ≥ 1, ∀l_i ∈ L_i
along with the inequalities, λi ≥ 1 and −λi ≥ −1. However, this process suffers from the
drawback that it introduces new variables in the relaxation. Instead, it is possible to find a
separating inequality without increasing the problem dimension and, thereby, circumvent
the need to introduce new variables. First, we show that a separating inequality of X
can be found easily. Consider, for simplicity, the case of Theorem 4.1 where Ai is not an
extended formulation, i.e., it does not use the additional ui variables. The case where
Ai contains ui variables can be handled similarly. Now, consider a point z′ that does not
belong to cl conv(S). If it is possible to find, for all i, a function t_i(z_i) such that, for all z_i, t_i(z_i) ≥ inf{t_i^j(z_i) | j ∈ J_i} but t_i(z′_i) = inf{t_i^j(z′_i) | j ∈ J_i}, a function v_i(z_i) such that, for all z_i, v_i(z_i) ≥ inf{v_i^k(z_i) | k ∈ K_i} but v_i(z′_i) = inf{v_i^k(z′_i) | k ∈ K_i}, and a function w_i(z_i) such that, for all z_i, w_i(z_i) ≥ inf{w_i^l(z_i) | l ∈ L_i} but w_i(z′_i) = inf{w_i^l(z′_i) | l ∈ L_i}, then using the closed-form expression of X in (4–2), one can identify an inequality that separates z′ from
X. Observe that, when J_i, K_i and L_i are finitely sized, the functions t_i, v_i and w_i can be found by choosing an index j′_i ∈ J_i such that t_i(z_i) = t_i^{j′_i}(z_i), an index k′_i ∈ K_i such that v_i(z_i) = v_i^{k′_i}(z_i), and an index l′_i ∈ L_i such that w_i(z_i) = w_i^{l′_i}(z_i). Then, if an inequality of the form ∑_{i∈N} t_i^{j_i}(z_i) ≥ 1 is violated by z′, i.e., ∑_{i∈N} t_i^{j_i}(z′_i) < 1, then ∑_{i∈N} t_i(z′_i) < 1 as well, since, by the definition of t_i, t_i(z′_i) ≤ t_i^{j_i}(z′_i) for all i. The inequality ∑_{i∈N} t_i(z_i) ≥ 1 is valid since t_i(z_i) ≥ 1 is valid for each S_i. This is because if t_i(z_i) < 1 for some z_i then there exists a j_i ∈ J_i such that t_i^{j_i}(z_i) < 1. Now, observe that as long as a representation of each
Si uses positively-homogeneous functions (even if this representation requires infinitely
many inequalities), then the separation procedure described above can be used to develop
cl conv(⋃_{i=1}^{n} S_i).
Consider Example 4.5 for a concrete demonstration of these ideas and, in particular, the point (1/n², . . . , 1/n²). If n > 1, this point does not belong to the convex hull, which as shown in Example 4.5 is given by ∑_{i=1}^{n} z_i ≥ 1. Assume that we are not aware that each orthogonal
disjunction can be defined equivalently using zi ≥ 1, a representation that allows us to
easily identify the convex hull of the set. In the absence of such knowledge, we construct
the inequality ∑_{i=1}^{n} √z_i ≥ 1. Now, consider the linearization f(z_i) = 1/n + (n/2)(z_i − 1/n²) ≥ 1 of √z_i ≥ 1 at z_i = 1/n². Since f(z_i) ≥ 1 is a relaxation of √z_i ≥ 1, we can use this inequality to construct the outer-approximation using Theorem 4.1. The primary difference here is that f(z_i) ≥ 1, being linear, can be easily rewritten as (n/2)z_i ≥ 1 − 1/(2n), where the left-hand side is a positively-homogeneous function. Then, Theorem 4.1 constructs the inequality ∑_{i=1}^{n} (n/2)z_i ≥ 1 − 1/(2n). If n > 1, even though ∑_{i=1}^{n} √z_i ≥ 1 does not chop off (1/n², . . . , 1/n²),
this point is cut off by the linearized inequality. Therefore, the first step of relaxation
helps tighten the inequality by exploiting positive homogeneity in the convexification step.
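The aggregated tangent cut ∑_i (n/2) z_i ≥ 1 − 1/(2n) can be checked numerically at the point (1/n², . . . , 1/n²); the sketch below uses n = 4 as an arbitrary choice:

```python
n = 4
p = [1.0 / n**2] * n
# Aggregated tangent cut from Theorem 4.1: sum_i (n/2) z_i >= 1 - 1/(2n).
lhs = sum((n / 2) * zi for zi in p)
rhs = 1 - 1 / (2 * n)
print(lhs, rhs, lhs < rhs)  # 0.5 0.875 True
```

The left-hand side equals 1/2 for every n, while the right-hand side exceeds 1/2 whenever n > 1, so the point is separated.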
Observe that, in the preceding discussion, we did not find the separating inequality by minimizing t_i^j(z_i) among all linearizations t_i^j(z_i) ≥ 1 of √z_i ≥ 1. We now carry out this procedure. Writing z̄_i for the linearization point, we can rewrite {z_i | √z_i ≥ 1} as

{ z_i | (1/(2√z̄_i − z̄_i)) z_i ≥ 1, ∀z̄_i ∈ (0, 4),
        (1/(z̄_i − 2√z̄_i)) z_i ≥ −1, ∀z̄_i ∈ (4, +∞),
        z_i ≥ 0 },

where (1/(2√z̄_i − z̄_i)) z_i ≥ 1 is the linearization of √z_i ≥ 1 at z̄_i for z̄_i ∈ (0, 4) and (1/(z̄_i − 2√z̄_i)) z_i ≥ −1 is the linearization at z̄_i for z̄_i ∈ (4, ∞). Then, the tightest inequality is found by noticing that 1/(2√z̄_i − z̄_i) is minimized at z̄_i = 1 for z̄_i ∈ (0, 4) and that 1/(z̄_i − 2√z̄_i) ↓ 0 (is non-negative and approaches 0) as z̄_i approaches +∞. Therefore, we can set t_i(z_i) = z_i and v_i(z_i) = 0. Then, we apply Theorem 2.2 to recover the convex hull of S, i.e.,

{ z | ∑_{i=1}^{n} z_i ≥ 1, z_i ≥ 0 ∀i }.
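The claim that the tangent coefficient is minimized at the linearization point z̄ = 1 can be verified with a crude grid search (a sketch; the grid spacing is an arbitrary choice):

```python
import math

def coeff(zb):
    # Coefficient of z in the tangent cut z / (2*sqrt(zb) - zb) >= 1, for zb in (0, 4).
    return 1.0 / (2 * math.sqrt(zb) - zb)

grid = [0.05 * k for k in range(1, 80)]  # sample linearization points in (0, 4)
best = min(grid, key=coeff)
print(round(best, 2), round(coeff(best), 6))  # minimizer near zb = 1, coefficient near 1
```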
Now, we discuss another technique that can be used to find representations of the convex hull of each S_i that use positively-homogeneous functions but do not require additional variables. The main idea is that one can homogenize the inequality using an
extra variable and then maximize the resulting function over the introduced variable to
derive a positively-homogeneous function describing the set. We illustrate this idea by
deriving a positively-homogeneous function that describes the following bilinear covering
set:
Q = {(x, y) ∈ R²_+ | axy + bx + cy ≥ r}, (4–12)

where a, b, and c are assumed to be non-negative. We assume without loss of generality that r > 0. Otherwise, Q = R²_+. We may also assume without loss of generality that c ≥ b and, consequently, assume that at least one of a and c is strictly positive. Then, for any feasible (x, y), it follows that ax + c > 0. Therefore, Q = {(x, y) ∈ R²_+ | y ≥ (r − bx)/(ax + c)}. First, we verify that the inequality is convex. Let f(x) = (r − bx)/(ax + c). Since

∂²f/∂x² = 2a(bc + ar)/(ax + c)³
is nonnegative if x ≥ 0, Q is expressed as the intersection of the epigraph of a convex
function with the non-negative orthant. Therefore, Q is convex. Also, note that the
defining inequality of Q is not positively-homogeneous. We show how the above inequality
can be homogenized without introducing new variables in the formulation. To carry out
this transformation, we first homogenize the defining inequality, axy+bx+cy ≥ r, using an
additional variable h, that is restricted to be positive. This is accomplished by rewriting
the defining inequality of Q as axy/h + bx + cy ≥ rh. Since h is positive, we can multiply throughout by h, and express the above inequality as axy + bxh + cyh ≥ rh². Consider
Q′ = {(x, y, 1) | (x, y) ∈ Q}. The above positively-homogeneous inequality defines the
smallest closed convex cone that contains Q′ if h is restricted to be non-negative. Further,
if (x, y, h) satisfies the above inequality for some h ≥ 0, then for any h′ ∈ [0, h], (x, y, h′)
satisfies it as well. Therefore, Q can be described by the projection of the following set

{(x, y, h) | axy + bxh + cyh ≥ rh² and h ≥ 1}

in the space of the (x, y) variables. In order for (x, y, h) to satisfy the first inequality above, h must be such that:

(bx + cy − √((bx + cy)² + 4arxy))/(2r) ≤ h ≤ (bx + cy + √((bx + cy)² + 4arxy))/(2r).
It can be easily verified that the functions bounding h are positively-homogeneous. In fact,
since the bounding functions on h are obtained from a positively-homogeneous constraint,
these functions must be positively-homogeneous. This follows because for each (x, y, h)
that satisfies a positively-homogeneous constraint and an arbitrary λ > 0, it must be that
(λx, λy, λh) satisfies the constraint as well. The lower bounding function is nonpositive.
Therefore, the set Q can be rewritten as:

η(x, y) = (1/2)(bx + cy + √((bx + cy)² + 4arxy)) ≥ r. (4–13)
We have thus expressed Q as the upper-level set of a positively-homogeneous function
without introducing new variables. In fact, since Proposition 4.2 asserts that a positively-homogeneous function whose upper-level sets are convex is concave, it follows from the
convexity of Q that η(x, y) must be concave over the non-negative quadrant. In other
words, we have established the following result.
Proposition 4.4. Let Q = {(x, y) ∈ R²_+ | axy + bx + cy ≥ r}, where a, b, c are
non-negative, and r is strictly positive. Then, Q has a convex description (upper level set
of a concave function) that uses positively-homogeneous functions. In particular,
Q = {(x, y) ∈ R²_+ | η(x, y) ≥ r},
where η(x, y) is as defined in (4–13).
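The two properties of η used here, degree-one positive homogeneity and agreement with the original covering inequality on the non-negative quadrant, can be spot-checked numerically; the values a = 1, b = 2, c = 3, r = 6 below are arbitrary illustrative choices:

```python
import math

def eta(x, y, a, b, c, r):
    # Positively-homogeneous description of Q = {(x, y) >= 0 : a*x*y + b*x + c*y >= r}.
    s = b * x + c * y
    return 0.5 * (s + math.sqrt(s * s + 4 * a * r * x * y))

a, b, c, r = 1.0, 2.0, 3.0, 6.0
x, y, lam = 1.5, 0.7, 3.0
# Degree-1 homogeneity: eta(lam*x, lam*y) = lam * eta(x, y) for lam > 0.
assert abs(eta(lam * x, lam * y, a, b, c, r) - lam * eta(x, y, a, b, c, r)) < 1e-9
# Membership agreement with the defining inequality of Q.
for px, py in [(0.5, 0.5), (2.0, 1.0), (0.1, 0.2), (0.0, 2.0)]:
    assert (a*px*py + b*px + c*py >= r) == (eta(px, py, a, b, c, r) >= r)
print("checks passed")
```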
4.3 Convex Extension Property
In this section, we study in more detail the convex extension property which forms
the basis for Assumption (A2) in Theorem 4.1. The convex extension property of interest
here is that the convex hull of S is determined by its restriction to certain orthogonal
spaces. This property clearly holds when S is defined as the union of orthogonal sets, Si,
for i ∈ {1, . . . , n}. But, it can often be established for sets that are not already defined
as such. In this section, we explore the occurrence of the convex extensions property in
these more surprising situations. For example, we show that the convex extension property
holds for mixed, pure, and continuous bilinear sets. In fact, we derive a fairly general set
of conditions that are sufficient to establish the convex extensions property. We show in
[117] that these conditions apply to large classes of polynomial covering sets and can be
used to better exploit variable bounds while constructing relaxations. We first formally
define the notion of a convex extension for orthogonal disjunctive sets. This definition is
adapted from Tawarmalani and Sahinidis [120].
Definition 4.2. Let Si ⊆ S for i ∈ N = {1, . . . , n}. We say that S has the convex
extension property for disjunctive sets Si if Assumption (A1) holds and if every point
z in S can be expressed as a convex combination of points χi in cl conv(Si) and a conic
combination of rays ψi in 0+(cl conv(Si)), i.e., for i ∈ I ⊆ N , there exist λi ≥ 0 and
µ_i ≥ 0, that satisfy ∑_{i∈I} λ_i = 1, such that

z = ∑_{i∈I} λ_i χ_i + ∑_{i∈I} µ_i ψ_i. (4–14)
The convex extension property in Definition 4.2 is more general than Assumption
(A2) in Theorem 4.1, in that it allows the use of non-negative multiples of recession
directions in the expression of z. Since χ_i + (µ_i/λ_i)ψ_i ∈ cl conv(S_i), it may seem that
the recession directions in (4–14) are not necessary. However, this is not true since λi
may be zero even when µi is not. This technicality is often important in applying our
result. Fortunately, it can be observed that even if Assumption (A2) is replaced with
(4–14), Theorem 4.1 holds with only slight modifications, as discussed below. Instead of conv(S) = conv(⋃_{i=1}^{n} S_i), we can only establish that (4–14) implies

cl conv(S) = cl conv(⋃_{i=1}^{n} S_i). (4–15)
In fact, (4–15) is equivalent to (4–14). On the one hand, since, for each i ∈ {1, . . . , n}, S_i ⊆ S, it follows that cl conv(⋃_{i=1}^{n} S_i) ⊆ cl conv(S). On the other hand, since the sets S_i are orthogonal, by Theorem 9.8 in [102],

cl conv(⋃_{i=1}^{n} S_i) = ⋃ {λ_1 cl conv(S_1) + · · · + λ_n cl conv(S_n) | λ_i ≥ 0+, ∑_{i=1}^{n} λ_i = 1}, (4–16)

where the notation λ_i ≥ 0+ means that λ_i cl conv(S_i) is taken to be 0+(cl conv(S_i)) rather than {0} when λ_i = 0. Observe that (4–14) is another way to represent the set on the right-hand-side of (4–16) since if λ_i > 0 then χ_i + (µ_i/λ_i)ψ_i ∈ cl conv(S_i).
Otherwise, ψi ∈ 0+ (cl conv(Si)). Now, if we assume (4–14), or equivalently, (4–15),
the proof of Theorem 4.1 shows that cl proj_z X = cl conv(⋃_{i=1}^{n} S_i), and, therefore, by (4–15), cl proj_z X = cl conv(S). In this case, Corollary 4.1 can often be used to establish closedness of proj_z X. Note that proj_z A_i is closed whenever conv(S_i) is closed. Therefore, if conv(S_i) is closed and proj_z C_i = 0+(cl conv S_i), it follows that proj_z X = cl conv(S). Since most practical situations only require the derivation of
cl conv(S), it suffices to establish (4–14) instead of Assumption (A2) in Theorem 4.1.
Similarly, if Assumption (A2) is replaced with (4–14) in Proposition 4.3, it can be easily
established that cl conv(S) ⊆ cl proj_z X. This is because cl conv(S) = cl conv(⋃_{i=1}^{n} S_i) ⊆ cl conv(proj_z X) = cl proj_z X, where the first equality follows from the equivalence of (4–14) and (4–15), the first containment since ⋃_{i=1}^{n} S_i ⊆ proj_z X, and the last equality
since projz X is convex.
We next present a nontrivial set for which it can be proven from first principles that
the convex extension property holds for orthogonal disjunctive sets. This set appears in
a nonconvex formulation of the trim-loss problem proposed by Harjunkoski et al. [64].
The model is designed to determine the best way to cut a finite number of large rolls of
a raw material into smaller products using a certain number of cutting patterns. Let I
be the index set of products and J be the index set of the cutting patterns that are to be
chosen. The demand for a product i is known a priori and is denoted by ni,order. For each
(i, j) ∈ I × J , let nij ∈ Z+ be the decision variable that specifies the number of products
of type i produced in cutting pattern j and, for each j ∈ J , let mj ∈ Z+ be the number
of times the cutting pattern j is used. The following bilinear constraints model that the
demand for each product is met:
∑_{j∈J} m_j n_{ij} ≥ n_{i,order}, for i ∈ I. (4–17)
In Proposition 4.5, we show that the bilinear integer sets defined by the constraint (4–17)
satisfy the convex extension property for orthogonal disjunctive sets. We use this result
along with Theorem 4.1 to obtain the convex hull of integer bilinear covering sets in
Proposition 4.6.
Proposition 4.5. Consider a bilinear integer covering set
B^I = {(x_1, y_1, x_2, y_2) ∈ Z²_+ × Z²_+ | x_1y_1 + x_2y_2 ≥ r},
where r > 0. Then, BI has the convex extension property (4–14) with respect to the
orthogonal disjunctive sets
B^I_1 = {(x_1, y_1, 0, 0) ∈ Z²_+ × Z²_+ | x_1y_1 ≥ r},
B^I_2 = {(0, 0, x_2, y_2) ∈ Z²_+ × Z²_+ | x_2y_2 ≥ r}.
Proof. Let (x_1, y_1, x_2, y_2) ∈ B^I. We show that there exist (i) certain subsets I and I′ of {1, 2}, (ii) for each i ∈ I, a finite j_i, (iii) for each i ∈ I and j ∈ {1, . . . , j_i}, a point χ_{i,j} ∈ B^I_i, and (iv) for each i ∈ I′, a ray ψ_i of B^I_i, such that

(x_1, y_1, x_2, y_2) = ∑_{i∈I} ∑_{j=1}^{j_i} λ_{i,j} χ_{i,j} + ∑_{i∈I′} µ_i ψ_i, (4–18)

where the multipliers are such that (a) ∑_{i∈I} ∑_{j=1}^{j_i} λ_{i,j} = 1, (b) for each i ∈ I and j ∈ {1, . . . , j_i}, λ_{i,j} ≥ 0, and (c) for each i ∈ I′, µ_i ≥ 0.
We assume without loss of generality that x1 ≤ y1 ≤ y2 and x2 ≤ y2 since the
variables x1, y1, x2, and y2 can be renamed such that the largest variable is called y2 and
the largest variable in the other pair is called y1. Note first that if x1 = 0, it suffices to
choose I = {2}, I ′ = {1}, j2 = 1 with χ2,1 = (0, 0, x2, y2) and ψ1 = (0, 1, 0, 0) to show
that (4–14) holds. Therefore, we assume in the remainder of this proof that x1 ≥ 1 and,
consequently, x1y1 ≥ 1. We consider two cases.
Case 1: x_2 ≥ x_1y_1. In this case, we choose I = {1, 2}, I′ = {2}, and j_1 = j_2 = 1. Consider the points χ_{1,1} = ((y_2 + 1)x_1, (y_2 + 1)y_1, 0, 0) and χ_{2,1} = (0, 0, x_2, y_2 + 1), and the ray ψ_2 = (0, 0, 1, 0). Clearly, χ_{1,1} ∈ B^I_1, since (y_2 + 1)²x_1y_1 ≥ x_1y_1 + y_2²x_1y_1 ≥ x_1y_1 + y_2² ≥ x_1y_1 + x_2y_2 ≥ r. Similarly, χ_{2,1} ∈ B^I_2, since x_2(y_2 + 1) ≥ x_2y_2 + x_2 ≥ x_2y_2 + x_1y_1 ≥ r. It is easily verified that

(x_1, y_1, x_2, y_2) = (1/(y_2 + 1)) χ_{1,1} + (y_2/(y_2 + 1)) χ_{2,1} + (x_2/(y_2 + 1)) ψ_2,

which shows that (4–18) is satisfied.
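The Case 1 decomposition can be verified exactly with rational arithmetic; the data below are hypothetical values satisfying the case assumptions (x_1 ≤ y_1 ≤ y_2, x_2 ≤ y_2, and x_2 ≥ x_1y_1):

```python
from fractions import Fraction as F

r = 10
x1, y1, x2, y2 = 1, 2, 3, 4   # x1*y1 = 2 <= x2 = 3 and x1*y1 + x2*y2 = 14 >= r
chi11 = [(y2 + 1) * x1, (y2 + 1) * y1, 0, 0]
chi21 = [0, 0, x2, y2 + 1]
psi2 = [0, 0, 1, 0]
lam1, lam2, mu2 = F(1, y2 + 1), F(y2, y2 + 1), F(x2, y2 + 1)
z = [lam1 * a + lam2 * b + mu2 * c for a, b, c in zip(chi11, chi21, psi2)]
print(z == [x1, y1, x2, y2])  # True
# Both chi points satisfy their one-term covering inequality:
assert chi11[0] * chi11[1] >= r and chi21[2] * chi21[3] >= r
```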
Case 2: x_2 ≤ x_1y_1 − 1. In this case, we choose I = {1, 2}, I′ = {1, 2}, j_1 = 2, and j_2 = 1. Consider the points χ_{1,1} = (x_1 + α, y_1, 0, 0), χ_{1,2} = (x_1, y_1 + β, 0, 0), χ_{2,1} = (0, 0, x_2, y_2 + δ), and the rays ψ_1 = (0, 1, 0, 0), ψ_2 = (0, 0, 1, 0), where α = ⌈x_2y_2/y_1⌉, β = ⌈x_2y_2/x_1⌉, and δ = ⌈x_1y_1/x_2⌉. It follows from the way α, β, and δ are defined that χ_{1,1} and χ_{1,2} belong to B^I_1 whereas χ_{2,1} belongs to B^I_2. Further, define

λ_{1,1} = x_1y_2/(α(y_2 + δ)),  λ_{1,2} = (αδ − x_1y_2)/(α(y_2 + δ)),  λ_{2,1} = y_2/(y_2 + δ),
µ_1 = (αy_1y_2 + βx_1y_2 − αβδ)/((y_2 + δ)α),  µ_2 = δx_2/(y_2 + δ).
The above multipliers were computed by projecting out λ1,2, λ2,1, µ1, and µ2 from
(4–18) and the inequalities that these multipliers must satisfy. Then, λ1,1 was set
to its lowest admissible value. Instead of following this approach we show, as is
sufficient, that setting the multipliers at the above values establishes the convex
extensions property. It is easy to verify that (4–18) is satisfied. Since it is clear that
λ_{1,1} + λ_{1,2} + λ_{2,1} = 1, we only need to check that all the multipliers are non-negative. Clearly, λ_{1,1} ≥ 0, λ_{2,1} ≥ 0 and µ_2 ≥ 0. Since αδ = ⌈x_2y_2/y_1⌉⌈x_1y_1/x_2⌉ ≥ x_1y_2, it follows that λ_{1,2} ≥ 0. Observe that µ_1 ≥ 0 if and only if αβδ ≤ αy_1y_2 + βx_1y_2. Hence, it suffices to prove that αβδ ≤ αy_1y_2 + βx_1y_2.
We consider two cases:
Case 2.1: x_2 = 1. In this case, α = ⌈y_2/y_1⌉, β = ⌈y_2/x_1⌉, and δ = x_1y_1. There exist f_α, f_β ∈ [0, 1) such that α = y_2/y_1 + f_α and β = y_2/x_1 + f_β. We observe that

αβδ = (y_2/y_1 + f_α)(y_2/x_1 + f_β) x_1y_1
    = y_1y_2 (y_2/y_1 + f_α) + x_1y_2 ((y_1/y_2) f_α f_β + f_β)
    ≤ y_1y_2 (y_2/y_1 + f_α) + x_1y_2 (y_2/x_1 + f_β)
    = αy_1y_2 + βx_1y_2,

where the inequality holds because x_1 ≤ y_1 ≤ y_2 implies x_1y_1 f_α f_β ≤ x_1y_1 ≤ y_2².
Case 2.2: x_2 ≥ 2. For (u, v) ∈ Z²_+, we define l(u, v) = v − l where l is the only integer in the interval {0, . . . , v − 1} that is such that u = qv + l for some q ∈ Z_+, i.e., l is the remainder when u is divided by v. Using this notation, it is easy to verify that α = (x_2y_2 + l(x_2y_2, y_1))/y_1, β = (x_2y_2 + l(x_2y_2, x_1))/x_1, and δ = (x_1y_1 + l(x_1y_1, x_2))/x_2.
Now observe that:

δ/y_2 = (x_1y_1 + l(x_1y_1, x_2))/(x_2y_2)
     ≤ (x_1y_1 + x_2 − 1)/(x_2y_2)
     = (x_1y_1/(x_2y_2)) (1 + (x_2 − 1)/(x_1y_1))
     ≤ (x_1y_1/(x_2y_2)) (1 + (x_2 − 1)/(x_2 + 1))
     = (1/(x_2y_2)) ( x_1y_1/(1 + 1/x_2) + x_1y_1/(1 + 1/x_2) )
     ≤ (1/(x_2y_2)) ( x_1y_1/(1 + (y_1 − 1)/(x_2y_2)) + x_1y_1/(1 + (x_1 − 1)/(x_2y_2)) )
     ≤ x_1y_1/(x_2y_2 + l(x_2y_2, y_1)) + x_1y_1/(x_2y_2 + l(x_2y_2, x_1))
     = x_1/α + y_1/β,

where the first inequality holds because l(x_1y_1, x_2) ≤ x_2 − 1, the second inequality because x_2 ≤ x_1y_1 − 1, the third inequality holds since y_1 ≤ y_2 implies (y_1 − 1)/y_2 ≤ 1 and x_1 ≤ y_2 implies that (x_1 − 1)/y_2 ≤ 1, and the fourth inequality holds since y_1 − 1 ≥ l(x_2y_2, y_1) and x_1 − 1 ≥ l(x_2y_2, x_1). Therefore, αβδ ≤ αy_1y_2 + βx_1y_2.
For (x1, y1, x2, y2) ∈ BI , (4–18) is satisfied, and, therefore, (4–14) holds for BI .
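The Case 2 multipliers can likewise be verified exactly; the data below are hypothetical values with x_1 ≤ y_1 ≤ y_2, x_2 ≤ y_2, and x_2 ≤ x_1y_1 − 1:

```python
from fractions import Fraction as F

def ceil_div(a, b):
    # Integer ceiling of a / b for positive integers.
    return -(-a // b)

r = 10
x1, y1, x2, y2 = 2, 3, 4, 5   # x1*y1 = 6 >= x2 + 1 = 5, and 6 + 20 >= r
alpha, beta, delta = ceil_div(x2*y2, y1), ceil_div(x2*y2, x1), ceil_div(x1*y1, x2)
chi11, chi12 = [x1 + alpha, y1, 0, 0], [x1, y1 + beta, 0, 0]
chi21 = [0, 0, x2, y2 + delta]
psi1, psi2 = [0, 1, 0, 0], [0, 0, 1, 0]
l11 = F(x1*y2, alpha*(y2 + delta))
l12 = F(alpha*delta - x1*y2, alpha*(y2 + delta))
l21 = F(y2, y2 + delta)
m1 = F(alpha*y1*y2 + beta*x1*y2 - alpha*beta*delta, (y2 + delta)*alpha)
m2 = F(delta*x2, y2 + delta)
assert min(l11, l12, l21, m1, m2) >= 0 and l11 + l12 + l21 == 1
z = [l11*a + l12*b + l21*c + m1*d + m2*e
     for a, b, c, d, e in zip(chi11, chi12, chi21, psi1, psi2)]
print(z == [x1, y1, x2, y2])  # True
```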
We now apply the result of Proposition 4.5 in conjunction with Theorem 4.1 to obtain
the following result that describes the convex hull of (4–17).
Proposition 4.6. Let

B^I = {(x, y) ∈ Z^n_+ × Z^n_+ | ∑_{i=1}^{n} x_iy_i ≥ r}, (4–19)

where r > 0 and, for each i ∈ {1, . . . , n}, define:

B^I_i = {(x, y) ∈ B^I | (x_j, y_j) = (0, 0), ∀j ≠ i}.

Let the convex hull of B^I_i be represented by:

conv(B^I_i) = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 1, ∀j ∈ J},

where l^j(x_i, y_i) is a linear function of (x_i, y_i). Then,

conv(B^I) = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J}. (4–20)
Proof. We prove this result by applying Theorem 4.1. Let zi = (xi, yi). Assumption
(A1) holds by the definition of BIi . The convex extension property, (4–14), follows from a
sequential application of Proposition 4.5. Assumption (A3) is satisfied since the functions
l^j(x_i, y_i) are positively-homogeneous. Further, since 0+(cl conv(B^I_i)) = R^n_+ × R^n_+, it follows that

C_i = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 0, ∀j ∈ J} ⊆ 0+(cl conv(B^I_i)).
Therefore, Assumption (A4) holds. Now, by Theorem 4.1 and the discussion following Definition 4.2, it follows that

cl conv(B^I) = X = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J},

where the closure operation is not needed on X since it is a closed set, being an intersection of closed half-spaces. In fact, X is polyhedral, since there are only finitely many half-spaces in its expression. Now, consider the closed sets

B^{I′}_i = {(x, y) ∈ Z^{2n}_+ | x_iy_i ≥ r}.
Observe that B^I_i ⊆ B^{I′}_i ⊆ B^I. Now, by Corollary 9.8.1 in [102], conv(⋃_{i=1}^{n} B^{I′}_i) is closed. Since

conv(B^I) ⊆ cl conv(B^I) ⊆ cl conv(⋃_{i=1}^{n} B^{I′}_i) = conv(⋃_{i=1}^{n} B^{I′}_i) ⊆ conv(B^I),

where the second containment holds since B^I_i ⊆ B^{I′}_i and because the discussion following Definition 4.2 argues that cl conv(B^I) = cl conv(⋃_{i=1}^{n} B^I_i), the first equality since conv(⋃_{i=1}^{n} B^{I′}_i) is closed, and the third containment since B^{I′}_i ⊆ B^I, the equality holds throughout, and the result follows.
Observe that, even though conv(B^I) is closed, conv(⋃_{i=1}^{n} B^I_i) is not closed. Observe also that, if each inequality l^j(x_i, y_i) ≥ 1 is facet-defining for conv(B^I_i), then all the inequalities of the form ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1 are facet-defining for conv(B^I). This is because, if L(i′, x′_{i′}, y′_{i′}) is tight for l^{j_{i′}}(x_{i′}, y_{i′}) ≥ 1, then it is also tight for ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1. In other words, the inequality ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1 has two tight points for each i′ ∈ {1, . . . , n}, yielding a total of 2n tight points. Since these points belong to orthogonal subspaces and the origin does not satisfy l^{j_i}(x_i, y_i) ≥ 1, they are affinely independent points, showing that the inequality is facet-defining. Consequently, Proposition 4.6 shows that conv(B^I) has exponentially many facets. In particular, if conv(B^I_i) has |J| facets, there are |J|^n inequalities in the description of conv(B^I). We note, however, that separation is not difficult to perform as the coefficients of each pair of variables can be determined independently. Since there is an obvious pseudo-polynomial algorithm to compute the facets of conv(B^I_i), it is clearly possible to separate the facets of conv(B^I) in pseudo-polynomial time.
Example 4.6. Consider the set

B^I = {(x, y) ∈ Z²_+ × Z²_+ | x_1y_1 + x_2y_2 ≥ 10}. (4–21)

It is easily verified that for both i ∈ {1, 2}

conv(B^I_i) = {L(i, x_i, y_i) ∈ R⁴_+ | y_i ≥ 1, 10x_i + 2y_i ≥ 30, x_i + y_i ≥ 7, 2x_i + 10y_i ≥ 30, x_i ≥ 1},
as presented in Figure 4-2.
It follows from Proposition 4.6 and the ensuing discussion that the convex hull of B^I has 25 nontrivial facet-defining inequalities and is represented by

conv(B^I) = {(x, y) ∈ R²_+ × R²_+ | l¹(x_1, y_1) + l²(x_2, y_2) ≥ 1, for every choice of

l¹(x_1, y_1) ∈ { y_1, (5/15)x_1 + (1/15)y_1, (1/7)x_1 + (1/7)y_1, (1/15)x_1 + (5/15)y_1, x_1 },
l²(x_2, y_2) ∈ { y_2, (5/15)x_2 + (1/15)y_2, (1/7)x_2 + (1/7)y_2, (1/15)x_2 + (5/15)y_2, x_2 }}, (4–22)

where each pair of coefficients for (x_1, y_1) is matched with each pair of coefficients for (x_2, y_2).
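A sketch that enumerates all 25 aggregated inequalities of this example and tests membership exactly (the helper name in_hull is ours, not from the text):

```python
from fractions import Fraction as F

# Normalized facet coefficients (a, b) of conv(B^I_i): a*x_i + b*y_i >= 1.
coeffs = [(F(0), F(1)), (F(5, 15), F(1, 15)), (F(1, 7), F(1, 7)),
          (F(1, 15), F(5, 15)), (F(1), F(0))]

def in_hull(x1, y1, x2, y2):
    # A point lies in conv(B^I) iff it satisfies all 25 combined inequalities (4-22).
    return all(a1*x1 + b1*y1 + a2*x2 + b2*y2 >= 1
               for a1, b1 in coeffs for a2, b2 in coeffs)

print(in_hull(2, 5, 0, 0))  # True: (2, 5, 0, 0) lies in B^I_1
print(in_hull(1, 1, 1, 1))  # False: cut off, e.g. by (5/15)x1+(1/15)y1+(5/15)x2+(1/15)y2 >= 1
```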
[Figure: the region x_iy_i ≥ 10 and the facets of conv(B^I_i) in the (x_i, y_i) plane.]

Figure 4-2. Facet-defining inequalities for conv(B^I_i)
Similarly, the convex hull characterization for a variety of bilinear sets can be
obtained using the result of Theorem 4.1. In particular, we study now the mixed integer
variant. We will study the continuous version in Proposition 4.9.
Proposition 4.7. Let

B^M = {(x, y) ∈ Z^n_+ × R^n_+ | ∑_{i=1}^{n} a_ix_iy_i ≥ r}, (4–23)

where r > 0, and, for each i ∈ N, a_i > 0. Define, for each i ∈ N,

B^M_i = {(x, y) ∈ B^M | (x_j, y_j) = (0, 0), ∀j ≠ i}.

Let the convex hull of B^M_i be represented by:

conv(B^M_i) = {L(i, x_i, y_i) ∈ R^n_+ × R^n_+ | l^j(x_i, y_i) ≥ 1, ∀j ∈ J_i},

where l^j(x_i, y_i) is a linear function of (x_i, y_i). Then,

conv(B^M) = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^{n} l^{j_i}(x_i, y_i) ≥ 1, ∀(j_i)_{i=1}^{n} ∈ ∏_{i=1}^{n} J_i}. (4–24)
Proof. Because the verification of the convex extension property is the only technical part
of the proof that is significantly different from that of BI , we only discuss the proof of this
property next. Because induction can be used, it suffices to prove the result when n = 2.
Let (x_1, y_1, x_2, y_2) ∈ B^M. We show that there exist (i) subsets I and I′ of {1, 2}, (ii) for each i ∈ I, a point χ_i ∈ B^M_i, and (iii) for each i ∈ I′, a ray ψ_i of B^M_i, such that

(x_1, y_1, x_2, y_2) = ∑_{i∈I} λ_i χ_i + ∑_{i∈I′} µ_i ψ_i, (4–25)

where the multipliers satisfy the following conditions: (a) ∑_{i∈I} λ_i = 1, (b) for all i ∈ I, λ_i ≥ 0, and (c) for all i ∈ I′, µ_i ≥ 0.
We assume without loss of generality that x1y1 ≥ x2y2 since the pair of variables
(x1, y1) and (x2, y2) can be interchanged along with their respective coefficients a1 and
a2. Note that, if x2 = 0, it suffices to choose I = {1}, I ′ = {2}, χ1 = (x1, y1, 0, 0),
and ψ2 = (0, 0, 0, 1) to show that (4–14) holds. Similarly, if y2 = 0, it suffices to choose
I = {1}, I ′ = {2}, χ1 = (x1, y1, 0, 0), and ψ2 = (0, 0, 1, 0) to show that (4–14) holds.
When x1y1 ≥ x2y2 > 0, in addition to the positivity of x2 and y2, we may also assume
that x_1 ≥ 1 and y_1 > 0. Define χ_1 = (x_1, y_1 + a_2x_2y_2/(a_1x_1), 0, 0), χ_2 = (0, 0, x_2, y_2 + a_1x_1y_1/(a_2x_2)), ψ_1 = (x_1, 0, 0, 0), and ψ_2 = (0, 0, x_2, 0). It can be easily verified that

(x_1, y_1, x_2, y_2) = (a_1x_1y_1/(a_1x_1y_1 + a_2x_2y_2)) (χ_1 + ψ_2) + (a_2x_2y_2/(a_1x_1y_1 + a_2x_2y_2)) (χ_2 + ψ_1),
which shows that the convex extension property (4–14) holds.
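This decomposition can also be verified with exact arithmetic; the values below are hypothetical and satisfy a_1x_1y_1 ≥ a_2x_2y_2 > 0 with x_1, x_2 integer:

```python
from fractions import Fraction as F

a1, a2, r = F(2), F(3), F(12)
x1, y1, x2, y2 = F(3), F(4), F(1), F(2)   # a1*x1*y1 = 24 >= a2*x2*y2 = 6, total 30 >= r
A, B = a1*x1*y1, a2*x2*y2
chi1 = [x1, y1 + B/(a1*x1), F(0), F(0)]
chi2 = [F(0), F(0), x2, y2 + A/(a2*x2)]
psi1, psi2 = [x1, F(0), F(0), F(0)], [F(0), F(0), x2, F(0)]
lam, mu = A/(A + B), B/(A + B)
z = [lam*(c1 + p2) + mu*(c2 + p1)
     for c1, p2, c2, p1 in zip(chi1, psi2, chi2, psi1)]
print(z == [x1, y1, x2, y2])  # True
# Each chi point satisfies its one-term covering inequality:
assert a1*chi1[0]*chi1[1] >= r and a2*chi2[2]*chi2[3] >= r
```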
Propositions 4.6 and 4.7 illustrate both the fact that the convex extension property
(4–14) holds in surprising settings and that this property might not always be trivial to
verify. We next present in Theorem 4.2 and Proposition 4.8 conditions under which the
convex extension property over orthogonal disjunctive sets can be shown to hold. These
conditions are satisfied by many polynomial covering inequalities [117] and, in particular,
the bilinear covering sets that are discussed in this section.
Theorem 4.2. Consider a function g(z_1, . . . , z_n) : R^{∑_{i=1}^{n} d_i}_+ → R, where z_i ∈ R^{d_i}_+, and the set G = {z ∈ R^{∑_{i=1}^{n} d_i}_+ | g(z_1, . . . , z_n) ≥ r}, where r > 0. Let G_i = G ∩ {L(i, z_i) | z_i ∈ R^{d_i}_+} and g_i(z_i) = g(L(i, z_i)). If there exist functions h_i : R^{d_i}_+ → R and f : R^n → R such that:

(S1) g(z) ≤ f(h_1(z_1), . . . , h_n(z_n)), where f is a convex function,
(S2) f(y¹) > f(y²) whenever y¹ ≥ y² and at least one component of y¹ is larger than the corresponding component of y²,
(S3) g_i(z_i) = f(L(i, h_i(z_i))),
(S4) For all i, h_i(0) = 0 and, for λ ∈ (0, 1], λh_i(z_i/λ) ≥ h_i(z_i), and
(S5) For all i, h_i(z_i) ≤ 0 implies that L(i, z_i) ∈ 0+(cl conv G_i),

are satisfied over R^{∑_{i=1}^{n} d_i}_+ then the convex extension property, (4–14), holds for the set G. Assume that, for each i ∈ {1, . . . , n}, conv(G_i) is closed. Define

G′_i = conv(G_i) + ∑_{i′ ≠ i} 0+(conv G_{i′}).

If, for all i, G′_i ⊆ conv(G) then conv(G) is closed.
Proof. Let z ∈ G and y(z) = (h_1(z_1), . . . , h_n(z_n)). In the following, we sometimes denote h_i(z_i) as y_i(z) to emphasize that it is the ith component of y(z). Let T = {i | h_i(z_i) ≤ 0}. Then, by Assumption (S5), for each i ∈ T, L(i, z_i) ∈ 0+(cl conv G_i). If z − ∑_{i∈T} L(i, z_i) ∈ cl conv(⋃_{i=1}^{n} G_i), then so does z. We now show that z′ = z − ∑_{i∈T} L(i, z_i) ∈ cl conv(⋃_{i=1}^{n} G_i). Let δ be a subgradient of f at y(z′). Then, Assumption (S2) implies that δ > 0. Otherwise, suppose that δ_i ≤ 0. Let e_i denote the ith unit vector and choose ε > 0. Observe that

f(y(z′) − εe_i) ≥ f(y(z′)) − ε⟨δ, e_i⟩ = f(y(z′)) − εδ_i ≥ f(y(z′)),

a contradiction to Assumption (S2). Clearly, for each i ∉ T, h_i(z′_i) = h_i(z_i). By construction, for each i ∈ T, z′_i = 0 and, therefore, h_i(z′_i) = 0 ≥ h_i(z_i). In other words, y_i(z′) = max{y_i(z), 0}. Observe that Assumptions (S1) and (S2) together imply that f(y(z′)) ≥ f(y(z)) ≥ g(z) ≥ r.
First, consider the case where ⟨δ, y(z′)⟩ = 0. Then, y(z′) = 0 since we have just proven that y(z′) ≥ 0 and δ > 0. Observe now that y(z′) = 0 implies that z′ = 0. This is because if h_i(z′_i) = 0, then h_i(z_i) ≤ 0. Therefore, i ∈ T and so z′_i = 0. In other words, g(z′) = g(0) = f(y(z′)) ≥ r, where the second equality follows from Assumption (S3). We have thus shown that z′ = 0 ∈ G_i for each i. Clearly, z′ ∈ cl conv(⋃_{i=1}^{n} G_i).
Now, consider the case when ⟨δ, y(z′)⟩ > 0. For i = 1, . . . , n, define λ_i = δ_iy_i(z′)/⟨δ, y(z′)⟩. Since δ_i and y_i(z′) are non-negative, it follows that λ_i ≥ 0. Further, ∑_{i=1}^{n} λ_i = 1. Define I = {i | λ_i > 0} and observe that |I| ≥ 1. The following chain of implications holds

i ∉ I ⟹ y_i(z′) = 0 ⟹ i ∈ T ⟹ z′_i = 0,

where the first implication follows since δ_i > 0; the second because, for each i ∉ T, y_i(z′) > 0; and the third by the construction of z′. Therefore, z′ = ∑_{i∈I} z″_i, where z″_i = L(i, z′_i). For each i ∈ I, let χ_i = z″_i/λ_i. Observe that z′ = ∑_{i∈I} λ_iχ_i, i.e., z′ can be expressed as a convex combination of χ_i for i ∈ I. The following shows that, for all i ∈ I, χ_i ∈ G_i:

g(χ_i) = g_i(z′_i/λ_i) = f(y(χ_i)) ≥ f((1/λ_i) y(z″_i))
       ≥ f(y(z′)) + δ_i (⟨δ, y(z′)⟩/(δ_iy_i(z′))) y_i(z″_i) − ∑_{j=1}^{n} δ_jy_j(z′)
       = f(y(z′)) + δ_i (⟨δ, y(z′)⟩/(δ_iy_i(z′))) y_i(z′) − ∑_{j=1}^{n} δ_jy_j(z′)
       = f(y(z′)) ≥ r.
The first equality follows from the definition of g_i, the second equality from Assumption (S3), the first inequality follows since f is non-decreasing by Assumption (S2) and h_i(z′_i/λ_i) ≥ (1/λ_i)h_i(z′_i), the second inequality because δ is a subgradient of f at y(z′), and the third equality because y_i(z″_i) = h_i(z′_i) = y_i(z′). Since z = z′ + ∑_{i∈T} L(i, z_i), where, for each i ∈ T, L(i, z_i) ∈ 0+(cl conv(G_i)), it follows that (4–14) holds for G.
We now prove the last statement of the theorem. Consider an arbitrary i ∈ N. Clearly, G′_i, as defined in the statement of the theorem, is convex. We argue that it is also closed. By Corollary 9.1.1 in [102], G′_i is closed if there do not exist L(i, z_i) ∈ 0+(conv G_i) and, for i′ ∈ N∖{i}, L(i′, z_{i′}) ∈ 0+(conv G_{i′}), not all zero, such that L(i, z_i) + ∑_{i′∈N∖{i}} L(i′, z_{i′}) = 0. But, the vectors L(i, z_i) and L(i′, z_{i′}) for i′ ∈ N∖{i} are orthogonal. Therefore, they sum to zero if and only if each of the vectors is zero. It follows that G′_i is closed. Again by Corollary 9.1.1 in [102], 0+(G′_i) = ∑_{i=1}^{n} 0+(conv G_i). Since the recession directions of G′_i are independent of i, it follows by Corollary 9.8.1 in [102] that conv(⋃_{i=1}^{n} G′_i) is closed. Now,
conv(G) ⊆ cl conv(G) = cl conv(⋃_{i=1}^{n} G_i) ⊆ cl conv(⋃_{i=1}^{n} G′_i) = conv(⋃_{i=1}^{n} G′_i) ⊆ conv(G),

where the first equality follows from the equivalence of (4–14) and (4–15), the second containment follows since G_i ⊆ G′_i, the second equality follows since conv(⋃_{i=1}^{n} G′_i) is closed and the third containment follows since G′_i ⊆ conv(G).
The main challenge in applying Theorem 4.2 in practical situations is verifying
Assumption (S4). However, when hi(zi) is derived from other functions using operations
such as summations, minimizations, or maximizations, then Assumption (S4) can often be
established easily by studying the same properties for the functions used in the derivation
of hi(zi). To see this, first note that the assumption is satisfied trivially by any linear
function or, more generally, by any positively-homogeneous function of rth order, where r ≥ 1. For a more elaborate illustration, consider the case where h(z) = w(p_1(z), . . . , p_K(z)), where, for all k ∈ {1, . . . , K}, p_k(z) satisfies Assumption (S4), w satisfies Assumption (S4), w is isotonic, i.e., w(y¹) ≥ w(y²) if y¹ ≥ y², and w(0, . . . , 0) = 0. We claim that h(z) satisfies
Assumption (S4). Clearly, h(0) = w(p1(0), . . . , pK(0)) = w(0, . . . , 0) = 0 and
λh(z/λ) = λw(p_1(z/λ), . . . , p_K(z/λ)) ≥ λw((1/λ)p_1(z), . . . , (1/λ)p_K(z)) ≥ w(p_1(z), . . . , p_K(z)) = h(z),
where the first inequality follows since w is isotonic and pk(z) obeys Assumption (S4);
and the second inequality because w obeys Assumption (S4). If w satisfies Assumption
(S4) only over the non-negative orthant, then pk(z) must be non-negative as well. In
particular, ∑_{k=1}^{K} p_k(z) satisfies the assumption as long as, for all k, p_k(z) satisfies the
assumption. For another illustration, consider now h(z) = opy p(y, z), where op is an
operator such as min or max that satisfies opy f1(y) ≥ opy f2(y) if, for all y, f1(y) ≥ f2(y)
and λ opy f(y) ≥ opy λf(y) for λ ∈ (0, 1]. In addition, assume that λp(y, z
λ
) ≥ p(y, z) for
λ ∈ (0, 1]. Then,
λh( zλ
)= λ op
yp(y,
z
λ
)≥ λ op
y
1
λp(y, z) ≥ op
yp(y, z) = h(z),
for λ ∈ (0, 1]. In particular, if h(z) = min(p1(z), . . . , pK(z)) and, for all λ ∈ (0, 1] and
k ∈ {1, . . . , K}, pk(z) ≤ λpk(zλ
)then h(z) ≤ λh
(zλ
).
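The closure of this property under sums and minimizations can be checked numerically. The sketch below uses two hypothetical functions p1 and p2 on the non-negative orthant (a bilinear term and a linear term, both of which satisfy the condition λf(z/λ) ≥ f(z) for λ ∈ (0, 1]) and verifies that their sum and pointwise minimum inherit the property, as argued above.

```python
import random

random.seed(4)

# Two hypothetical functions on R^2_+ satisfying the (S4)-type condition
# lambda * f(z/lambda) >= f(z) for lambda in (0, 1]:
p1 = lambda z: 2.0 * z[0] * z[1]      # positively homogeneous of order 2
p2 = lambda z: z[0] + 3.0 * z[1]      # linear (order 1, holds with equality)

h = lambda z: min(p1(z), p2(z))       # pointwise minimum
s = lambda z: p1(z) + p2(z)           # sum

for _ in range(1000):
    z = (random.uniform(0, 5), random.uniform(0, 5))
    lam = random.uniform(1e-3, 1.0)
    zl = (z[0] / lam, z[1] / lam)
    # the property is preserved by both constructions
    for f in (p1, p2, h, s):
        assert lam * f(zl) >= f(z) - 1e-9
```

This is only a random sampling check of the algebraic argument in the text, not a proof.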
The following corollary of Theorem 4.2 discusses the case where f is the summation
operator and hi(zi) = gi(zi). Subsequently, we will use Corollary 4.2 to show that the
convex extensions property holds for bilinear covering sets. In [117], we use this result to
show that the property also holds for more general polynomial covering sets. Moreover,
Corollary 4.2 shows that conv(G) is closed if the function g(·) eventually increases in each
one of the principal directions of the non-negative orthant.
Corollary 4.2. Consider a function g(z1, . . . , zn) : R_+^{∑_{i=1}^n di} → R, where zi ∈ R_+^{di}, and the set G = {z ∈ R_+^{∑_{i=1}^n di} | g(z1, . . . , zn) ≥ r}, where r > 0. Let Gi = G ∩ {L(i, zi) | zi ∈ R_+^{di}} and gi(zi) = g(L(i, zi)). If

(B1) g(z) ≤ ∑_{i=1}^n gi(zi),
(B2) for all i, gi(0) = 0 and, for λ ∈ (0, 1], λgi(zi/λ) ≥ gi(zi), and
(B3) for all i, gi(zi) ≤ 0 implies that L(i, zi) ∈ 0⁺(cl conv Gi),

are satisfied over R_+^{∑_{i=1}^n di}, then the convex extension property, (4–14), holds for the set G. Let e_i^d ∈ R^{∑_{j=1}^n dj} be the vector whose (d + ∑_{j<i} dj)-th component is one and whose remaining components are zero. Assume that, for all i, conv(Gi) is closed. Assume further that there exists γ such that, for all γ′ ≥ γ, i ∈ N, d ∈ {1, . . . , di}, and z ≥ 0, it holds that g(z + γ′e_i^d) ≥ g(z). Then, conv(G) is closed.
Proof. Choose f to be the summation operator and hi(zi) = gi(zi). Then, the first part of the result follows from Theorem 4.2. The rest of the result follows if G′i, as defined in the statement of Theorem 4.2, is contained in conv(G). Consider a z that can be expressed as zi + ∑_{i′≠i} L(i′, z_{i′}), where zi ∈ conv(Gi) and, for all i′ ≠ i, z_{i′} ≥ 0. By Carathéodory's theorem, there exist, for d ∈ {1, . . . , di + 1}, points z^d and multipliers λ_d ≥ 0 such that ∑_{d=1}^{di+1} λ_d z^d = zi, ∑_{d=1}^{di+1} λ_d = 1, and z^d ∈ Gi for all d. Let D = ∑_{i′≠i} d_{i′}. Then, define m = min{z_{i′d}D | i′ ≠ i, d = 1, . . . , d_{i′}, z_{i′d} > 0} and m′ = max{1, γ/m}. For each i′ ≠ i and d′ ∈ {1, . . . , d_{i′}}, define z_{i′}^{dd′} = z^d + Dm′z_{i′d′}e_{i′}^{d′}. On the one hand, for all (i′, d′) with z_{i′d′} > 0, it follows that Dm′z_{i′d′} ≥ γ. Therefore, g(z_{i′}^{dd′}) ≥ g(z^d) ≥ r and, so, z_{i′}^{dd′} ∈ G. On the other hand, if z_{i′d′} = 0, then z_{i′}^{dd′} = z^d ∈ G. It follows that z_{i′}^{dd′} ∈ G for all (i′, d, d′). Now, z can be written as a convex combination of points in G as follows:

∑_{d=1}^{di+1} [ λ_d(1 − 1/m′)z^d + λ_d (1/(Dm′)) ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} z_{i′}^{dd′} ] = ∑_{d=1}^{di+1} λ_d z^d + ∑_{d=1}^{di+1} λ_d ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} z_{i′d′}e_{i′}^{d′} = zi + ∑_{i′≠i} L(i′, z_{i′}).

Observe that the multipliers are non-negative since m′ ≥ 1 and

∑_{d=1}^{di+1} λ_d [ (1 − 1/m′) + (1/(Dm′)) ∑_{i′≠i} ∑_{d′=1}^{d_{i′}} 1 ] = 1.

Therefore, the result follows.
Theorem 4.1 also points to an interesting set of sufficient conditions that can be
used to verify the convex extension property. The primary difference from the conditions
in Theorem 4.2 is that Proposition 4.8 does not impose a structure on the original set
S. Instead, it constructs a set X whose projection in the z-space is contained within
cl conv(⋃_{i=1}^n Si), using a construction similar to Theorem 4.1, and then leaves it to the
user to verify that X outer-approximates S. This technique may be useful when S is
defined by more than one inequality. Also, note that the special case of Theorem 4.2,
discussed in Corollary 4.2, also follows from Proposition 4.8.
Proposition 4.8. For a set S and its subsets Si ⊆ S for i ∈ N = {1, . . . , n}, let zi ∈ R^{di} and z = (z1, . . . , zi, . . . , zn) ∈ S ⊆ R^{∑_i di}. Assume that Assumptions (A1) and (A4) are satisfied as in Theorem 4.1 and the sets Ai and X are as defined in (4–1) and (4–2), respectively. If, in addition, the following assumptions are satisfied:

(N1) Si ⊆ proj_z Ai ⊆ cl conv(Si),
(N2) t_i^{j_i}, v_i^{k_i}, and w_i^{l_i} are such that, for all 0 < λ ≤ 1,

λt_i^{j_i}(zi/λ, ui/λ) ≥ t_i^{j_i}(zi, ui), λv_i^{k_i}(zi/λ, ui/λ) ≥ v_i^{k_i}(zi, ui), λw_i^{l_i}(zi/λ, ui/λ) ≥ w_i^{l_i}(zi, ui),

(N3) S ⊆ cl conv(proj_z X),

then (4–14) holds for S.
Proof. Lemma 4.2 shows that X = proj_{(z,u)} Q. We now show that proj_z X = proj_z Q ⊆ cl conv(⋃_{i=1}^n Si). The proof is again similar to that for Lemma 4.1 except that the positive homogeneity is replaced by the weaker inequalities assumed in Assumption (N2). Even then, if (λ, z, u) ∈ Q and 0 < λi ≤ 1, it follows that (zi/λi, ui/λi) ∈ Ri(1) since the defining inequalities are satisfied as follows:

t_i^{j_i}(zi, ui) ≥ λi and λit_i^{j_i}(zi/λi, ui/λi) ≥ t_i^{j_i}(zi, ui)
⇒ λit_i^{j_i}(zi/λi, ui/λi) ≥ t_i^{j_i}(zi, ui) ≥ λi
⇒ t_i^{j_i}(zi/λi, ui/λi) ≥ 1.

Clearly, cl conv(⋃_{i=1}^n Si) ⊆ cl conv(S) and we have assumed that S ⊆ cl conv(proj_z X). Observe that cl conv(S) ⊆ cl conv(proj_z X) ⊆ cl conv(⋃_{i=1}^n Si) ⊆ cl conv(S) and, therefore, equality holds throughout.
Observe that Assumptions (N1) and (N2) are less restrictive than Assumption (A3) in Theorem 4.1 since proj_z Ai may be a nonconvex subset of conv(Si) and the positive homogeneity is relaxed. Here, it is not necessary to use t_i^{j_i}(zi, ui), v_i^{k_i}(zi, ui), and w_i^{l_i}(zi, ui) as the underestimators in Assumption (N2). Rather, any function of (zi, ui) that underestimates λit_i^{j_i}(zi/λi, ui/λi), λiv_i^{k_i}(zi/λi, ui/λi), and λiw_i^{l_i}(zi/λi, ui/λi) for all λi ∈ (0, 1] suffices. As long as the set Ci defined using these functions inner-approximates the recession cone of cl conv(S), a suitable set X can be derived by projecting out the λ variables, and Assumption (N3) can be posed in terms of this set.
We now discuss the application of Corollary 4.2 to convexifying bilinear covering sets.
The bilinear covering sets that we shall now consider generalize the bilinear set discussed
in Proposition 4.4. In fact, the bilinear covering set reduces to Q, as defined in (4–12)
when restricted to any one of n orthogonal subspaces. As long as the convex extension
property holds, since Proposition 4.4 provides the defining inequality for the convex hull
in each of the orthogonal subspaces, we can use Theorem 4.1 to find the convex hull
description of the bilinear covering set over the non-negative orthant. We formalize this
argument in the following proposition.
Proposition 4.9. Consider the bilinear covering set:

BR = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^n (aixiyi + bixi + ciyi) ≥ r},

where, for each i ∈ {1, . . . , n}, ai, bi, and ci are non-negative and r is strictly positive. Let

ηi(xi, yi) = (1/2)(bixi + ciyi + √((bixi + ciyi)² + 4airxiyi)).

Then,

conv(BR) = X = {(x, y) ∈ R^n_+ × R^n_+ | ∑_{i=1}^n ηi(xi, yi) ≥ r}. (4–26)
Proof. We may assume without loss of generality that, for each i, at least one of ai, bi, or ci is positive. First, we use Corollary 4.2 to show that the convex extension property (4–14) holds for BR. Let zi = (xi, yi) and gi(zi) = aixiyi + bixi + ciyi. Clearly, gi(0) = 0 and, for 0 < λ ≤ 1,

λgi(zi/λ) = aixiyi/λ + bixi + ciyi ≥ gi(zi).

Therefore, Assumption (B2) is satisfied. Let

BRi = {L(i, xi, yi) ∈ R^{2n}_+ | gi(xi, yi) ≥ 0}.

Observe that, if (x′i, y′i) ≥ 0, then gi(xi + x′i, yi + y′i) ≥ gi(xi, yi). Therefore, if z′i = (x′i, y′i) ≥ 0, then (0, z′i, 0) ∈ 0⁺(cl conv BRi) and, consequently, Assumption (B3) is satisfied. It follows that the convex extension property holds for BR. In fact, since g is non-decreasing and cl conv(BRi) = BRi, it follows from the last statement of Corollary 4.2 that conv(BR) is closed as well. By Proposition 4.4, it follows that the convex hull of BRi is defined by ηi(xi, yi) ≥ r. Observe that ηi(xi, yi) is a positively-homogeneous function. Therefore, Assumption (A3) is satisfied. Finally, ηi(xi, yi) is concave by Proposition 4.2 and, since, for sufficiently large zi, gi(xi, yi) ≥ r, it follows that BRi ≠ ∅ and, therefore, by Proposition 4.1, that Assumption (A4) is satisfied. Then, by Theorem 4.1 and the discussion following Definition 4.2, the set X in (4–26) is cl conv(BR). But, as argued earlier, cl conv(BR) = conv(BR), and the result follows.
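As a quick numerical sanity check (not part of the proof), the sketch below samples points of BR for hypothetical data a, b, c, and r, verifies that each feasible point satisfies the convex hull inequality in (4–26), and checks the positive homogeneity of ηi used above.

```python
import math
import random

def eta(a, b, c, r, x, y):
    """The function eta_i of Proposition 4.9."""
    s = b * x + c * y
    return 0.5 * (s + math.sqrt(s * s + 4.0 * a * r * x * y))

random.seed(0)
# hypothetical instance data with n = 2
a, b, c, r = [2.0, 3.0], [1.0, 0.0], [0.0, 4.0], 5.0

# every sampled point of BR satisfies sum_i eta_i(x_i, y_i) >= r
for _ in range(1000):
    x = [random.uniform(0, 5) for _ in range(2)]
    y = [random.uniform(0, 5) for _ in range(2)]
    lhs = sum(a[i]*x[i]*y[i] + b[i]*x[i] + c[i]*y[i] for i in range(2))
    if lhs >= r:  # (x, y) lies in BR
        assert sum(eta(a[i], b[i], c[i], r, x[i], y[i]) for i in range(2)) >= r - 1e-9

# positive homogeneity: eta(lam*x, lam*y) = lam * eta(x, y)
for lam in (0.25, 0.5, 2.0):
    v1 = eta(2.0, 1.0, 3.0, 5.0, lam * 0.7, lam * 1.3)
    v2 = lam * eta(2.0, 1.0, 3.0, 5.0, 0.7, 1.3)
    assert abs(v1 - v2) < 1e-9
```

The sampling only tests validity of the inequality on BR (that is, BR ⊆ X); it does not, of course, certify the convex hull claim itself.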
Consider the special case of Proposition 4.9 where bi = ci = 0. In this case, the convex hull inequality takes the following simple form: ∑_{i=1}^n √(aixiyi) ≥ √r. First, the validity of the inequality can be verified using the following argument:

∑_{i=1}^n √(aixiyi) ≥ √(∑_{i=1}^n aixiyi) ≥ √r,

where the first inequality follows from the subadditivity of the square root over the non-negative real numbers. Second, by Example 4.4, the above inequality defines the closure of the convex hull of the disjunctive union of {(xi, yi) | aixiyi ≥ r} over the non-negative orthant and, therefore, it must also be the closure of the convex hull of ∑_{i=1}^n aixiyi ≥ r over the same set. Note that we did not employ Theorem 4.2 in the argument. Instead, we replaced it with a proof that the convex hull of the disjunctive union of orthogonal restrictions of the set includes the original set. This illustrates a different technique, similar to the proof technique of Proposition 4.8, that may sometimes be useful in establishing the convex extension property.
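The subadditivity step in the validity argument above is easy to confirm numerically. The sketch below checks √u + √v ≥ √(u + v) by sampling and then evaluates the resulting cut at one point of a small hypothetical instance (a = [2, 3], r = 4).

```python
import math
import random

random.seed(5)

# subadditivity of the square root on the non-negative reals
for _ in range(1000):
    u, v = random.uniform(0, 10), random.uniform(0, 10)
    assert math.sqrt(u) + math.sqrt(v) >= math.sqrt(u + v) - 1e-12

# consequence: a point of {sum_i a_i x_i y_i >= r} satisfies the hull inequality
a, r = [2.0, 3.0], 4.0            # hypothetical data
x, y = [1.0, 0.5], [1.5, 2.0]
assert sum(ai*xi*yi for ai, xi, yi in zip(a, x, y)) >= r
assert sum(math.sqrt(ai*xi*yi) for ai, xi, yi in zip(a, x, y)) >= math.sqrt(r)
```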
However, the above technique for establishing validity fails for another special case of Proposition 4.9, where the defining inequality is ax1y1 + bx2 ≥ r with a > 0, b > 0, and r > 0. A simpler variant of this set was mentioned in the introduction of this section. By Proposition 4.9, its convex hull over the non-negative orthant is defined by

√(ax1y1/r) + bx2/r ≥ 1. (4–27)

Note that the right-hand-side r participates differently with different subsets of variables in this convex hull inequality. One could use subadditivity of the square-root function to instead derive the following valid inequality:

√(ax1y1/r) + √(bx2/r) ≥ 1. (4–28)

However, as expected, (4–28) is not as tight as (4–27). This can be seen by considering a point (x1, y1, x2) that is feasible to (4–27). If bx2/r ≥ 1, it follows that √(bx2/r) ≥ 1. Otherwise, bx2/r < 1, in which case

√(ax1y1/r) + √(bx2/r) > √(ax1y1/r) + bx2/r ≥ 1.
Therefore, (x1, y1, x2) is feasible to (4–28) as well. Observe that the subadditivity of
the square-root function is not sufficient to prove the convex extension property for this
bilinear covering set, and, thus, cannot replace Theorem 4.2. Without realizing the convex
extension property a priori, even the form of the inequality (4–27) is not obvious. The key
to deriving this convex hull is thus to realize that the convex hull is formed by restricting
attention to orthogonal subspaces. The first subspace spans the (x1, y1) variables and the
second subspace spans x2. Then, Theorem 4.1 quickly reveals the structure of the convex
hull. Here, √(bx2/r) ≥ 1 as well as bx2/r ≥ 1 define the convex hull of the set restricted to (0, 0, x2). However, as the insight from Theorem 4.1 suggests, it is preferable to choose the latter representation since it uses a positively-homogeneous function.
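The relationship between (4–27) and (4–28) can be illustrated numerically. The sketch below uses the hypothetical data a = b = r = 1; it samples points to confirm that every point feasible to (4–27) is feasible to (4–28), and exhibits a point that separates the two inequalities.

```python
import math
import random

a = b = r = 1.0   # hypothetical data

def ineq_27(x1, y1, x2):
    """Convex hull inequality (4-27)."""
    return math.sqrt(a*x1*y1/r) + b*x2/r >= 1 - 1e-12

def ineq_28(x1, y1, x2):
    """Weaker valid inequality (4-28)."""
    return math.sqrt(a*x1*y1/r) + math.sqrt(b*x2/r) >= 1 - 1e-12

random.seed(6)
# every point feasible to (4-27) is feasible to (4-28) ...
for _ in range(20000):
    x1, y1, x2 = (random.uniform(0, 2) for _ in range(3))
    if ineq_27(x1, y1, x2):
        assert ineq_28(x1, y1, x2)

# ... but not conversely: here sqrt(0.09) + sqrt(0.49) = 1 while 0.3 + 0.49 < 1
assert ineq_28(0.09, 1.0, 0.49) and not ineq_27(0.09, 1.0, 0.49)
```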
The construction of Proposition 4.9 can be carried out as long as it is possible to
invoke Theorem 4.2 to establish the convex extension property and Theorem 4.1 to
convexify the orthogonal disjunctions. This idea can be exploited to develop tighter
relaxations when the variables are restricted to belong to the hypercube by suitably
altering the inequality outside the hypercube so that Theorem 4.2 can still be used.
This technique for deriving relaxations is pursued in greater detail in [117]. Also, since
the relaxations developed arise from orthogonal disjunctions, their geometry is easy
to understand by studying the orthogonal subspaces. This idea is exploited in [117] to
show that the resulting relaxation for bilinear covering sets is tighter than the standard
factorable relaxation used in current nonlinear branch-and-bound solvers. A preliminary
computational study is also reported in [117] that shows that not only is the relaxation
guaranteed to be at least as tight as the factorable relaxation, but that the improvement is
substantial.
4.4 Concluding Remarks
In this chapter, we developed a convexification tool for orthogonal disjunctive sets.
The convexification is obtained in the space of original variables. As an application,
we provided a simple derivation of split cuts for mixed-integer polyhedral sets. We also
showed that the convexification tool is useful in deriving cuts for a variety of nonconvex
constraints; those that satisfy a key convex extension property. We provided a general set
of conditions that are sufficient to establish the convex extension property. We illustrated
the techniques by finding the convex hull representation for bilinear covering sets. The
convex extension property holds for a variety of polynomial covering sets, and our convexification tool thereby provides convex hull representations of many sets; see [117]. If the
variables are restricted to be in a hypercube, the results of this section motivate strategies
for exploiting this information that are often superior to their counterparts currently used
in most nonlinear branch-and-bound solvers; see [117].
CHAPTER 5
LIFTED INEQUALITIES FOR 0-1 MIXED-INTEGER BILINEAR COVERING SETS¹
5.1 Introduction
In Chapter 4, we showed that for certain bilinear covering sets, relaxations stronger
than McCormick’s can be obtained by considering the right-hand-side. In particular, we
have derived closed-form expressions for the convex hull of ∑_{j∈N} ajxjyj ≥ d over the nonnegative orthant. However, these results are obtained under the assumption that the variables in (3–1) do not have upper bounds.
In this chapter, we study the convex hull of these sets when the variables are bounded. In particular, we consider 0−1 mixed-integer bilinear covering sets of the form

B = {(x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^n ajxjyj ≥ d},

where n ∈ Z_+, aj > 0 for all j ∈ N := {1, . . . , n}, and d > 0. In order to guarantee that B is not empty, we impose the following assumption.

Assumption 5.1. ∑_{j=1}^n aj ≥ d.
On the theoretical side, we are interested in studying relaxation techniques for B that
will take both the right-hand-side d and upper bounds on the variables into account. On
the practical side, we are also interested in studying B because of its relations with some
important mixed-integer linear sets. In particular, since the set B is a relaxation of the
single-node flow set without outflows
F = {(x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^n ajyj ≥ d, xj ≥ yj ∀j ∈ N},
valid inequalities for B will also be valid for F . Further, we will show that the derivation
of strong linear inequalities valid for B yields new families of facet-defining inequalities for
the convex hull of F , thereby providing new approaches to study these sets.
1 The material of this chapter is based on [34].
We next show that it will typically be difficult to find globally optimal solutions to
problems containing B as a constraint by showing that it is NP-hard to optimize a linear
function over B. This also suggests that finding a closed-form expression for the convex
hull of B is likely to be difficult. To this end, consider the following optimization problem
(P ) that seeks to minimize a linear objective function over the bilinear set B:
(P)  min ∑_{j=1}^n bjxj + ∑_{j=1}^n cjyj
     s.t. (x, y) ∈ B,
where b ∈ Rn and c ∈ Rn. We show next that (P ) is NP-hard.
Proposition 5.1. Problem (P ) is NP-hard.
Proof. The proof is by reduction from the 0−1 knapsack problem. Consider the problem (K) defined as:

min { ∑_{j=1}^n bjxj | ∑_{j=1}^n ajxj ≥ d, xj ∈ {0, 1} ∀j ∈ N }.

Problem (K) is an instance of the 0−1 knapsack problem, which is proven to be NP-hard in [56]. Now, we show that (K) is polynomially reducible to (P). Consider an instance of (K). We define a corresponding instance of (P) by setting cj = 0 for all j ∈ N. Then, (P) can be rewritten as:

min { ∑_{j=1}^n bjxj | ∑_{j=1}^n ajxjyj ≥ d, xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ N }.

Clearly, there exists an optimal solution (x∗, y∗) of (P) with y∗j = 1 for all j ∈ N since the corresponding objective coefficients are zero. Using the fact that ∑_{j=1}^n ajx∗j ≥ ∑_{j=1}^n ajx∗jy∗j ≥ d, it can easily be verified that x∗ is an optimal solution for (K). Therefore, (K) is polynomially reducible to (P), which proves the result.
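The reduction can be exercised on a tiny hypothetical instance by brute-force enumeration. The sketch below solves a small knapsack instance (K) directly and through the reduced instance of (P) with c = 0, where an optimal solution may take y = 1, and confirms that the optimal values coincide.

```python
from itertools import product

# tiny hypothetical knapsack instance (K): a, b, d chosen for illustration
a, b, d = [3, 4, 5], [2, 3, 4], 6
n = len(a)

# optimal value of (K) by enumeration over x in {0,1}^n
k_opt = min(sum(b[j]*x[j] for j in range(n))
            for x in product((0, 1), repeat=n)
            if sum(a[j]*x[j] for j in range(n)) >= d)

# reduced instance of (P): c = 0, so y = 1 is optimal and (P) collapses
# to the same enumeration over x with the constraint sum a_j x_j y_j >= d
p_opt = min(sum(b[j]*x[j] for j in range(n))
            for x in product((0, 1), repeat=n)
            if sum(a[j]*x[j]*1.0 for j in range(n)) >= d)

assert k_opt == p_opt == 5   # attained by x = (1, 1, 0)
```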
In this chapter, we focus on constructing strong cutting planes for optimization
problems containing the constraints of B by studying the convex hull of B. Throughout
the chapter, we will denote conv(B) by PB. Because B can be expressed as a finite union
of polytopes, PB is polyhedral.
Proposition 5.2. PB is a polytope.
Therefore, when studying PB, it is sufficient to consider linear inequalities. To
construct these inequalities, we will use lifting. Lifting is a well-known integer programming
technique that generates strong inequalities by transforming an inequality valid for
a restricted subset of the feasible region into a globally valid constraint. Early work
on lifting in integer programming can be found in Wolsey [131, 132]. A generalization
to nonlinear programming can be found in Richard and Tawarmalani [100]. Using
sequence-independent lifting techniques, we derive large families of facet-defining
inequalities, which can be used as strong cutting planes in a branch-and-cut framework.
This illustrates a new way of using the bounds on variables in the generation of cuts
in MINLP. Further, the results have implications for flow models in mixed-integer
programming, a family of problems that are important both theoretically and practically.
This chapter is structured as follows. In Section 5.2, we derive basic polyhedral results
about PB. We provide necessary and sufficient conditions for some trivial inequalities
to be facet-defining. Then, we derive a linear description of PB for the special case
where n = 2. This result is used to identify the seed inequalities that will be used in
lifting procedures. In Section 5.3, we derive three families of closed-form facet-defining
inequalities for PB using sequence-independent lifting techniques. One requires the use
of a subadditive approximation of the lifting function. In Section 5.4, we show that the
lifted inequalities developed for PB generalize known families of cuts and yield new
facet-defining inequalities for the single-node flow set F without outflows. We summarize
the contributions of our work and conclude with remarks on future research directions in
Section 5.5.
5.2 Basic Polyhedral Results
In this section, we develop basic results about the polyhedral structure of PB. First,
we provide necessary and sufficient conditions for PB to be full-dimensional.
Proposition 5.3. PB is a full-dimensional polytope if and only if ∑_{j=1}^n aj ≥ d + ai for all i ∈ N.

Proof. First, we show that if ∑_{j=1}^n aj ≥ d + ai for all i ∈ N, then PB is a full-dimensional polytope. For all i ∈ N, construct p^i = (1 − ei, 1) and q^i = (1, 1 − ei). Further, select r = (1, 1). It follows that p^i, q^i, and r belong to B. Further, these points are affinely independent because r − p^i and r − q^i for all i ∈ N are linearly independent. Since we have described 2n + 1 affinely independent points in PB, we have shown that PB is full-dimensional. Next, we prove that if PB is a full-dimensional polytope, then ∑_{j=1}^n aj ≥ d + ai for all i ∈ N. Assume by contradiction that ∑_{j=1}^n aj < d + ai for some i ∈ N. Since ∑_{j=1}^n aj ≥ d from Assumption 5.1, we must have that xi = 1 in every feasible solution of B, showing that PB is not full-dimensional. This is the desired contradiction.
In the remainder of this chapter, we will assume that PB is full-dimensional.

Assumption 5.2. ∑_{j=1}^n aj ≥ d + ai for all i ∈ N.
Observe that Assumption 5.2 strictly dominates Assumption 5.1. We next identify
some basic properties that all facets of PB must satisfy.
Proposition 5.4. Let

∑_{j=1}^n αjxj + ∑_{j=1}^n βjyj ≥ δ (5–1)

be a facet-defining inequality for PB that is not a scalar multiple of xj ≤ 1 for j ∈ N or yj ≤ 1 for j ∈ N. Then, αj ≥ 0 ∀j ∈ N, βj ≥ 0 ∀j ∈ N, and δ ≥ 0.

Proof. Select an arbitrary element i ∈ N. Since (5–1) is facet-defining for PB, there exists (x∗, y∗) ∈ B such that

∑_{j=1}^n αjx∗j + ∑_{j=1}^n βjy∗j = δ. (5–2)

Since (5–1) is not a scalar multiple of xi ≤ 1, it is clear that x∗i < 1. Consider now (x̄, ȳ) = (x∗, y∗) + (1 − x∗i)ei. This point belongs to B and, therefore, satisfies (5–1), i.e.,

∑_{j=1}^n αjx̄j + ∑_{j=1}^n βjȳj ≥ δ. (5–3)

Subtracting (5–2) from (5–3), we obtain that αi ≥ 0. The proof that βi ≥ 0 for all i ∈ N is similar. The fact that δ ≥ 0 follows from (5–2) after noting that all terms in the left-hand-side are nonnegative.
The following proposition further studies facet-defining inequalities whose right-hand-sides are zero.

Proposition 5.5. Let

∑_{j=1}^n αjxj + ∑_{j=1}^n βjyj ≥ 0 (5–4)

be a facet-defining inequality for PB. Then, (5–4) is a scalar multiple of xj ≥ 0 for some j ∈ N or of yj ≥ 0 for some j ∈ N.

Proof. Assume for a contradiction that (5–4) is not a scalar multiple of xi ≥ 0 for i ∈ N or of yk ≥ 0 for k ∈ N. Select i ∈ N. Then, there exists (x^i, y^i) ∈ B such that x^i_i > 0 and for which

∑_{j=1}^n αjx^i_j + ∑_{j=1}^n βjy^i_j = 0. (5–5)

Because its right-hand-side is equal to 0, (5–4) is not a scalar multiple of xj ≤ 1 or of yj ≤ 1 for j ∈ N and, therefore, it follows from Proposition 5.4 that αj ≥ 0 and βj ≥ 0 for all j ∈ N. We obtain that

0 = ∑_{j=1}^n αjx^i_j + ∑_{j=1}^n βjy^i_j ≥ αix^i_i ≥ 0. (5–6)

We conclude that αi = 0 since x^i_i > 0. Similarly, we can establish that βk = 0 ∀k ∈ N. This is the desired contradiction to the fact that (5–4) is facet-defining for PB.
Now, we characterize some simple facets of PB that play an important role in
Propositions 5.4 and 5.5.
Proposition 5.6. The upper bound inequalities xi ≤ 1 and yi ≤ 1 are facet-defining for PB for all i ∈ N. Further, for i ∈ N, the lower bound inequalities xi ≥ 0 and yi ≥ 0 are facet-defining for PB if and only if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d, where l(i) ∈ argmax{aj | j ∈ N \ {i}}.

Proof. The validity of all these inequalities is trivial since they belong to the description of B. To prove that xi ≤ 1 is facet-defining, we exhibit 2n affinely independent points in B satisfying xi = 1. First, we construct the n points p^k = (1, 1 − ek) for k ∈ N. Next, we build the n − 1 points q^k = (1 − ek, 1) for k ∈ N \ {i}. Finally, we select r = (1, 1). These points are affinely independent since r − p^k and r − q^k are linearly independent. Therefore, xi ≤ 1 is facet-defining for PB. The proof that yi ≤ 1 is facet-defining for PB is similar.

Now, we show that xi ≥ 0 is facet-defining if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d by constructing 2n affinely independent points in B satisfying xi = 0. For k ∈ N \ {i}, we construct the 2(n − 1) points p^k = (1 − ei − ek, 1 − ei − ek) and q^k = (1 − ei − ek, 1 − ei). Finally, we add the two points r^1 = (1 − ei, 1 − ei) and r^2 = (1 − ei, 1). Clearly, for any k ∈ N \ {i}, the points p^k, q^k, r^1, and r^2 are feasible since ∑_{j=1}^n aj − ai − ak ≥ ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d for k ∈ N \ {i}. These points are affinely independent since q^k − p^k, r^1 − q^k, and r^2 − r^1 are linearly independent. To prove the reverse direction, assume now that xi ≥ 0 is facet-defining for PB. We claim that ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d. Assume for a contradiction that ∑_{j=1}^n aj − ai − a_{l(i)} < d. It follows that xi + x_{l(i)} ≥ 1 is valid for PB. Since −x_{l(i)} ≥ −1 is also valid, we obtain that xi ≥ 0 is a nonnegative combination of two other valid inequalities, showing that it is not facet-defining. This is a contradiction to the fact that xi ≥ 0 is facet-defining. Similarly, it can be proven that yi ≥ 0 is facet-defining for PB if ∑_{j=1}^n aj − ai − a_{l(i)} ≥ d.
Observe that the above proofs remain valid when yj ∈ {0, 1} instead of yj ∈ [0, 1] for j in some subset J ⊆ N. We next study another simple facet-defining inequality for PB.
Proposition 5.7. The inequality ∑_{j=1}^n ajyj ≥ d is facet-defining for PB.

Proof. Validity is easily verified since ∑_{j=1}^n ajyj ≥ ∑_{j=1}^n ajxjyj ≥ d. To prove that ∑_{j=1}^n ajyj ≥ d is facet-defining, we present 2n points (x^i, y^i) in B that satisfy ∑_{j=1}^n ajyj ≥ d at equality and such that the system αx^i + βy^i = δ for i = 1, . . . , 2n only has solutions (α, β, δ) that are scalar multiples of (0, a, d). Consider the 2n points p^k = (1, Δk(1 − ek)) and q^k = (1 − ek, Δk(1 − ek)), where Δk = d/(∑_{j=1}^n aj − ak) for k ∈ N. Note that, because of Assumption 5.2, 0 < Δk ≤ 1 for all k ∈ N. Clearly, p^k and q^k belong to B and satisfy ∑_{j=1}^n ajyj ≥ d at equality. These 2n points yield the system:

∑_{j=1}^n αj + Δk(∑_{j=1}^n βj − βk) = δ  ∀k ∈ N, (5–7)
∑_{j=1}^n αj − αk + Δk(∑_{j=1}^n βj − βk) = δ  ∀k ∈ N. (5–8)

By subtracting (5–7) from (5–8), we obtain that αk = 0 for k ∈ N. From (5–7), we then conclude that, for all k, l ∈ N,

∑_{j=1}^n βj − βk = (δ/d)(∑_{j=1}^n aj − ak)  and  ∑_{j=1}^n βj − βl = (δ/d)(∑_{j=1}^n aj − al).

This implies that βk − (δ/d)ak = βl − (δ/d)al. After letting βk − (δ/d)ak = θ and plugging these values into (5–7), we obtain that θ = 0, which implies that βk = (δ/d)ak for k ∈ N. Therefore, we conclude that all solutions (α, β, δ) to (5–7) and (5–8) are scalar multiples of (0, a, d).
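The tight points p^k used in this proof are easy to reproduce numerically. The sketch below, for a small hypothetical instance (a = [19, 17, 15, 10], d = 20, which satisfies Assumption 5.2), computes Δk and checks that each p^k = (1, Δk(1 − ek)) lies in B and satisfies ∑ ajyj ≥ d at equality.

```python
a, d = [19, 17, 15, 10], 20   # hypothetical data satisfying Assumption 5.2
n = len(a)

for k in range(n):
    Dk = d / (sum(a) - a[k])
    assert 0 < Dk <= 1                      # guaranteed by Assumption 5.2
    # p^k = (1, Delta_k (1 - e_k)): x = 1, y_j = Delta_k for j != k, y_k = 0
    y = [Dk if j != k else 0.0 for j in range(n)]
    lhs = sum(a[j] * 1 * y[j] for j in range(n))   # sum a_j x_j y_j
    assert lhs >= d - 1e-9                  # p^k lies in B
    assert abs(sum(a[j] * y[j] for j in range(n)) - d) < 1e-9   # tightness
```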
In the remainder of the text, we will often use the term facet to refer to a facet-defining
inequality. We will also refer to inequalities xi ≤ 1, yi ≤ 1, and∑n
j=1 ajyj ≥ d as trivial
facets of PB. To illustrate the richness of the polyhedral structure of PB, we present a
simple example next. The linear inequalities describing the convex hull of this set were
obtained using PORTA; see Christof and Lobel [32].
Example 5.1. Consider the 0−1 mixed-integer bilinear covering set
B = {(x, y) ∈ {0, 1}⁴ × [0, 1]⁴ | 19x1y1 + 17x2y2 + 15x3y3 + 10x4y4 ≥ 20}.
The linear description of PB has 58 inequalities that are presented in the Appendix. A
subset of the inequalities are:
50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135 (5–9)
70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117 (5–10)
19x1 +17x2 +15y3 +10y4 ≥ 20 (5–11)
17x2 +15x3 +19y1 +10y4 ≥ 20 (5–12)
19y1 +17y2 +15y3 +10y4 ≥ 20 (5–13)
14x1 +10x3 +5x4 +17y2 ≥ 15 (5–14)
12x2 +10x3 +5x4 +19y1 ≥ 15 (5–15)
10x3 +5x4 +19y1 +17y2 ≥ 15 (5–16)
x1 +x2 +x3 +10y4 ≥ 2 (5–17)
x1 +x2 +x3 +x4 ≥ 2 (5–18)
x1 ≥ 0 (5–19)
y1 ≥ 0 (5–20)
x1 ≤ 1 (5–21)
y1 ≤ 1 (5–22)
Among the inequalities in Example 5.1, we observe the upper bound inequalities
(5–21) and (5–22) that we know are facet-defining for PB because of Proposition 5.6. In
this example, the lower bound inequalities (5–19) and (5–20) are also facet-defining, as can
be established from Proposition 5.6. Finally, (5–13) is the trivial facet-defining inequality,
studied in Proposition 5.7. Our goal is now to discover families of valid inequalities for PB
that would explain (5–9)-(5–12) and (5–14)-(5–18).
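The validity of a few of the listed inequalities can be spot-checked by sampling, as in the sketch below. This is only a statistical sanity check of validity over B; it says nothing about facetness, which was established by PORTA.

```python
import random

random.seed(1)
a, d = [19, 17, 15, 10], 20

# (x-coefficients, y-coefficients, right-hand side) for (5-11), (5-14), (5-17)
cuts = [
    ([19, 17, 0, 0], [0, 0, 15, 10], 20),
    ([14, 0, 10, 5], [0, 17, 0, 0], 15),
    ([1, 1, 1, 0],   [0, 0, 0, 10], 2),
]

for _ in range(20000):
    x = [random.randint(0, 1) for _ in range(4)]
    y = [random.random() for _ in range(4)]
    if sum(a[j]*x[j]*y[j] for j in range(4)) >= d:   # (x, y) lies in B
        for ax, ay, rhs in cuts:
            lhs = sum(ax[j]*x[j] + ay[j]*y[j] for j in range(4))
            assert lhs >= rhs - 1e-9
```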
To derive these nontrivial facet-defining inequalities, we first study the convex hull
of B when n = 2 with the goal of identifying seed inequalities for the subsequent lifting procedures. We next show in Proposition 5.8 that PB has three nontrivial facets when
n = 2. We assume in this study that a1 ≥ d and a2 ≥ d since otherwise PB is not
full-dimensional and its polyhedral structure is trivial.
Proposition 5.8. Let

B2 = {(x, y) ∈ {0, 1}² × [0, 1]² | a1x1y1 + a2x2y2 ≥ d},

where a1 ≥ d, a2 ≥ d, and d > 0. Then,

conv(B2) = X := {(x, y) ∈ [0, 1]² × [0, 1]² | x1 + x2 ≥ 1, dx1 + a2y2 ≥ d, a1y1 + dx2 ≥ d, a1y1 + a2y2 ≥ d}.
Proof. We prove the result using disjunctive programming techniques; see [17]. We define

X10 := B2 ∩ {x1 = 1, x2 = 0} = {(1, y1, 0, y2) | d/a1 ≤ y1 ≤ 1, 0 ≤ y2 ≤ 1},
X01 := B2 ∩ {x1 = 0, x2 = 1} = {(0, y1, 1, y2) | 0 ≤ y1 ≤ 1, d/a2 ≤ y2 ≤ 1},
X11 := B2 ∩ {x1 = 1, x2 = 1} = {(1, y1, 1, y2) | a1y1 + a2y2 ≥ d, 0 ≤ y1 ≤ 1, 0 ≤ y2 ≤ 1}.

It is easily verified that conv(B2) = conv(X10 ∪ X01 ∪ X11) = conv(X2 ∪ X11), where X2 := conv(X10 ∪ X01). We first use disjunctive programming techniques to obtain a linear description of X2 and then compute conv(B2) as conv(X2 ∪ X11). Using Theorem 2.1 in Balas [17], we write

X2 = proj_{(x,y)} { (x1, y1, x2, y2, z1, z2, z̄1, z̄2, λ) | (x1, y1, x2, y2) = (λ, z1 + z̄1, 1 − λ, z2 + z̄2), (d/a1)λ ≤ z1 ≤ λ, 0 ≤ z2 ≤ λ, 0 ≤ z̄1 ≤ 1 − λ, (d/a2)(1 − λ) ≤ z̄2 ≤ 1 − λ, 0 ≤ λ ≤ 1 }.

We then use Fourier–Motzkin elimination [141] to compute the projection. We first eliminate the variables λ, z̄1, and z̄2 using the equations λ = x1, z̄1 = y1 − z1, and z̄2 = y2 − z2. We then project the variables z1 and z2 from the system

x1 + x2 = 1, x1 ≥ 0,
(d/a1)x1 ≤ z1 ≤ x1,
x1 + y1 − 1 ≤ z1 ≤ y1,
0 ≤ z2 ≤ 1 − x2,
y2 − x2 ≤ z2 ≤ y2 − (d/a2)x2,

to obtain

X2 = conv(X10 ∪ X01) = { (x1, y1, x2, y2) | x1 + x2 = 1, x1 ≥ 0, x2 ≥ 0, (d/a1)x1 ≤ y1 ≤ 1, (d/a2)x2 ≤ y2 ≤ 1 }.

Now, compute conv(X2 ∪ X11) as

proj_{(x,y)} { (x1, y1, x2, y2, u1, u2, v1, v2, v̄1, v̄2, λ) | (x1, y1, x2, y2) = (u1 + (1 − λ), v1 + v̄1, u2 + (1 − λ), v2 + v̄2), u1 + u2 = λ, u1 ≥ 0, u2 ≥ 0, (d/a1)u1 ≤ v1 ≤ λ, (d/a2)u2 ≤ v2 ≤ λ, 0 ≤ v̄1 ≤ 1 − λ, 0 ≤ v̄2 ≤ 1 − λ, a1v̄1 + a2v̄2 ≥ d(1 − λ), 0 ≤ λ ≤ 1 }.

We again obtain the projection using Fourier–Motzkin elimination. Using the equations x1 = u1 + 1 − λ, x2 = u2 + 1 − λ, and u1 + u2 = λ, we obtain that λ = 2 − (x1 + x2), u1 = 1 − x2, and u2 = 1 − x1. Using these relations together with v1 = y1 − v̄1 and v2 = y2 − v̄2 to eliminate the corresponding variables, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2 ≤* 2,

and

x1 + x2 − 2 ≤* v̄1 ≤ y1 − (d/a1)(1 − x2),
0 ≤ v̄1 ≤ x1 + x2 − 1,
−(a2/a1)v̄2 + (d/a1)(x1 + x2 − 1) ≤ v̄1,
x1 + x2 − 2 ≤* v̄2 ≤ y2 − (d/a2)(1 − x1),
0 ≤ v̄2 ≤ x1 + x2 − 1,

where the inequalities marked ≤* can be verified to be redundant. Projecting v̄1, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2, (d/a1)(1 − x2) ≤ y1, x1 + x2 ≥* 1,

and

(d/a2)x1 − (a1/a2)y1 ≤ v̄2,
((d − a1)/a2)(x1 + x2 − 1) ≤* v̄2,
v̄2 ≤ y2 − (d/a2)(1 − x1),
0 ≤ v̄2 ≤ x1 + x2 − 1,

where, again, the inequalities marked ≥* and ≤* are clearly redundant since x1 + x2 ≥ 1 and a1 ≥ d. Projecting v̄2, we obtain that

x1 ≤ 1, x2 ≤ 1, 1 ≤ x1 + x2,
a1y1 + dx2 ≥ d,
dx1 + a2y2 ≥ d,
a1y1 + a2y2 ≥ d,
x1 + x2 ≥* 1,
(a2 − d)x1 + a2x2 + a1y1 ≥ a2. (R)

Clearly, the inequality marked ≥* is repeated. (R) is also redundant since

(a2 − d)x1 + a2x2 + a1y1 − a2 = a2(x1 + x2 − 1) − dx1 + a1y1
≥ a2(x1 + x2 − 1) − dx1 + d(1 − x2)
= (a2 − d)(x1 + x2 − 1) ≥ 0,

where the first inequality holds since a1y1 + dx2 ≥ d. Therefore, conv(X2 ∪ X11) is described by the four inequalities of X, concluding the proof.
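The easy direction of this description, B2 ⊆ X, can be checked numerically, as can the fact that X, being convex, contains all convex combinations of points of B2. The sketch below uses hypothetical data a1, a2, and d with a1 ≥ d and a2 ≥ d.

```python
import random

a1, a2, d = 3.0, 2.5, 2.0   # hypothetical data with a1 >= d and a2 >= d

def in_B2(x1, y1, x2, y2):
    return a1*x1*y1 + a2*x2*y2 >= d - 1e-12

def in_X(x1, y1, x2, y2):
    """The four nontrivial inequalities of Proposition 5.8."""
    return (x1 + x2 >= 1 - 1e-9 and
            d*x1 + a2*y2 >= d - 1e-9 and
            a1*y1 + d*x2 >= d - 1e-9 and
            a1*y1 + a2*y2 >= d - 1e-9)

random.seed(2)
# B2 is contained in X: every feasible point satisfies the four inequalities
for _ in range(20000):
    x1, x2 = random.randint(0, 1), random.randint(0, 1)
    y1, y2 = random.random(), random.random()
    if in_B2(x1, y1, x2, y2):
        assert in_X(x1, y1, x2, y2)

# convex combinations of points of B2 stay in X, since X is convex
pts = [(1, d/a1, 0, 0.0), (0, 0.0, 1, d/a2), (1, 1.0, 1, 1.0)]
lam = [0.2, 0.5, 0.3]
comb = [sum(l*p[i] for l, p in zip(lam, pts)) for i in range(4)]
assert in_X(*comb)
```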
Next, we give generalizations of the nontrivial facets of conv(B2) that we prove are
facet-defining for more general instances of conv(B). In particular, we generalize inequality
dx1 + a2y2 ≥ d in Proposition 5.9 and inequality x1 + x2 ≥ 1 in Proposition 5.11. We will
use these generalizations as seed inequalities for lifting procedures in Section 5.3.
Proposition 5.9. Let L ⊆ {j ∈ N | aj > d}. Assume that ∑_{j∈N\L} aj > d. Then,

∑_{j∈L} dxj + ∑_{j∈N\L} ajyj ≥ d (5–23)

is facet-defining for PB.
Proof. If L = ∅, the result follows from Proposition 5.7. Hence, we assume that L 6= ∅.First, we show that (5–23) is valid for B. Assume for a contradiction that there exists
(x′, y′) ∈ B such that∑
j∈L dx′j +
∑j∈N\L ajy
′j < d. Then, x′
j = 0 for all j ∈ L. It follows
that d >∑
j∈N\L ajy′j ≥
∑j∈N\L ajx
′jy
′j =
∑j∈N ajx
′jy
′j, which is a contradiction to the fact
that (x′, y′) ∈ B.
Next, we prove that (5–23) is facet-defining for PB by providing 2n points $(x^i, y^i)$ in B that satisfy (5–23) at equality such that all solutions (α, β, δ) to $\alpha x^i + \beta y^i = \delta$ for i = 1, …, 2n yield inequalities $\alpha x + \beta y \ge \delta$ that are scalar multiples of (5–23). Consider the 2|L| points $p_l = (e_l, e_l)$ and $\bar p_l = (e_l, (1-\varepsilon)e_l)$ for l ∈ L, where ε > 0 is sufficiently small. It is clear that $p_l$ and $\bar p_l$ belong to B for all l ∈ L, and that they satisfy (5–23) at equality. From $p_l$ and $\bar p_l$, we obtain that $\alpha_l + \beta_l = \delta$ and $\alpha_l + (1-\varepsilon)\beta_l = \delta$, which implies that $\alpha_l = \delta$ and $\beta_l = 0$ for all l ∈ L. Next, we select an arbitrary element l ∈ L ≠ ∅. For $k \in N\setminus L = \{k_1, \ldots, k_{n-|L|}\}$, construct the n − |L| points $q_k = (e_l + e_k, e_l)$. Finally, define $\bar d = d/\sum_{j\in N\setminus L} a_j$ and construct the n − |L| points
$$\bar q_{k_1} = \Big(\sum_{j\in N\setminus L} e_j,\ \bar d \sum_{j\in N\setminus L} e_j\Big) \quad\text{and}\quad \bar q_{k_i} = \Big(\sum_{j\in N\setminus L} e_j,\ \bar d \sum_{j\in N\setminus L} e_j + \varepsilon\Big(\tfrac{1}{a_{k_{i-1}}} e_{k_{i-1}} - \tfrac{1}{a_{k_i}} e_{k_i}\Big)\Big)$$
for i = 2, …, n − |L|, where ε is sufficiently small. It can be verified that $q_k$ and $\bar q_k$ belong to B for k ∈ N \ L since $0 < \bar d < 1$ and ε is small. Further, these points satisfy (5–23) at equality. From $q_k$ for k ∈ N \ L, we obtain that $\alpha_l + \alpha_k + \beta_l = \delta$, which implies that $\alpha_k = 0$ for all k ∈ N \ L since $\alpha_l = \delta$ and $\beta_l = 0$. Further, using the points $\bar q_k$ for k ∈ N \ L, we obtain the system of equations:
$$\sum_{k\in N\setminus L} \alpha_k + \bar d \sum_{k\in N\setminus L} \beta_k = \delta, \tag{5–24}$$
$$\sum_{k\in N\setminus L} \alpha_k + \bar d \sum_{k\in N\setminus L} \beta_k + \varepsilon\Big(\frac{\beta_{k_{i-1}}}{a_{k_{i-1}}} - \frac{\beta_{k_i}}{a_{k_i}}\Big) = \delta, \quad i = 2, \ldots, n-|L|, \tag{5–25}$$
which implies that there exists θ such that $\frac{\beta_{k_{i-1}}}{a_{k_{i-1}}} = \frac{\beta_{k_i}}{a_{k_i}} = \theta$ for i = 2, …, n − |L|. Plugging these expressions into (5–24), we obtain that $\beta_k = \frac{\delta}{d}\, a_k$ for all k ∈ N \ L. Therefore, we conclude that $\alpha_l = \frac{\delta}{d}\, d$ for l ∈ L and $\beta_k = \frac{\delta}{d}\, a_k$ for k ∈ N \ L, which proves that (5–23) is facet-defining.
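The validity argument above can also be checked numerically on a toy instance. The sketch below is hypothetical (the data a, d, and L are invented, not from the chapter): for a fixed 0−1 vector x, each unit of $y_j$ with j ∈ N \ L costs $a_j$ in the left-hand side of a (5–23)-type inequality and contributes $a_j$ to the bilinear covering constraint, so the inner minimization over y is solvable in closed form.

```python
from itertools import product

# Hypothetical instance (invented data): inequality of the form (5-23),
#   sum_{j in L} d*x_j + sum_{j in N\L} a_j*y_j >= d.
a = {1: 6, 2: 5, 3: 4, 4: 3}
d = 7.0
L = {1}
N = set(a)

def min_lhs(x):
    """Minimum of the (5-23)-type left-hand side over feasible y for a fixed
    0/1 vector x; returns None when no feasible y exists."""
    # y_j for j in L is free in the LHS, so set y_j = x_j to help cover d.
    resid = d - sum(a[j] for j in L if x[j] == 1)
    cap = sum(a[j] for j in N - L if x[j] == 1)
    if cap < resid:
        return None                      # no feasible y for this x
    # Each unit of y_j (j in N\L with x_j = 1) costs a_j and covers a_j,
    # so covering the residual demand costs exactly max(resid, 0).
    return d * sum(x[j] for j in L) + max(resid, 0.0)

violated = [x for x in (dict(zip(sorted(N), bits))
                        for bits in product((0, 1), repeat=len(N)))
            if (m := min_lhs(x)) is not None and m < d - 1e-9]
print("violated x-patterns:", violated)   # expect none: the inequality is valid
```

On this instance the enumeration confirms validity: whenever some x_j with j ∈ L is set to 1, the term d x_j alone covers the right-hand side, and otherwise the residual demand forces at least d units of cost.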
Note that in Example 5.1, the inequality (5–13) can be obtained using Proposition 5.9 with L = ∅. In the remainder of the chapter, we use the following notation extensively. For $N_0, N_1 \subseteq N$ such that $N_0 \cap N_1 = \emptyset$ and $\bar N_0, \bar N_1 \subseteq N$ such that $\bar N_0 \cap \bar N_1 = \emptyset$, we let
$$B(N_0, N_1, \bar N_0, \bar N_1) := \left\{ (x, y) \in B \;\middle|\; \begin{array}{l} x_j = 0 \text{ for } j \in N_0,\ x_j = 1 \text{ for } j \in N_1, \\ y_j = 0 \text{ for } j \in \bar N_0,\ y_j = 1 \text{ for } j \in \bar N_1 \end{array} \right\}.$$
We also define $P_B(N_0, N_1, \bar N_0, \bar N_1) := \mathrm{conv}(B(N_0, N_1, \bar N_0, \bar N_1))$. In particular, B(∅, ∅, ∅, N) is the 0−1 knapsack set
$$B(\emptyset, \emptyset, \emptyset, N) = \Big\{ x \in \{0,1\}^n \;\Big|\; \sum_{j=1}^n a_j x_j \ge d \Big\},$$
whose polyhedral structure was first studied by Balas [14], Wolsey [130], and Hammer et al. [63]. We next show some relations between the bilinear set B and the 0−1 knapsack set B(∅, ∅, ∅, N).
Proposition 5.10. Let
$$\sum_{j\in N} \alpha_j x_j + \sum_{j\in I} \beta_j y_j \ge \delta \tag{5–26}$$
be an inequality that is not a bound. Then, (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I) if and only if (5–26) is facet-defining for PB.
Proof. We first prove that if (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I), then (5–26) is facet-defining for PB. To show that (5–26) is valid for B, we assume for a contradiction that there exists a point (x′, y′) ∈ B with $\sum_{j\in N} \alpha_j x'_j + \sum_{j\in I} \beta_j y'_j < \delta$. Since (x′, y′) ∈ B, we have that $\sum_{j\in N} a_j x'_j y'_j \ge d$. Next, we define $(\bar x, \bar y)$ as $\bar x = x'$, $\bar y_j = y'_j$ for j ∈ I, and $\bar y_j = 1$ for j ∈ N \ I. Observe that $(\bar x, \bar y) \in B(\emptyset, \emptyset, \emptyset, N \setminus I)$ as $\sum_{j\in I} a_j \bar x_j \bar y_j + \sum_{j\in N\setminus I} a_j \bar x_j \ge \sum_{j\in N} a_j x'_j y'_j \ge d$. Since (5–26) is valid for B(∅, ∅, ∅, N \ I), $(\bar x, \bar y)$ satisfies $\sum_{j\in N} \alpha_j x'_j + \sum_{j\in I} \beta_j y'_j = \sum_{j\in N} \alpha_j \bar x_j + \sum_{j\in I} \beta_j \bar y_j \ge \delta$. This is the desired contradiction.

Next, we show that (5–26) is facet-defining for PB. Since (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I) and δ ≠ 0 as (5–26) is not a bound, there exist n + |I| linearly independent points in PB(∅, ∅, ∅, N \ I) that satisfy (5–26) at equality. Let $(x^k, y^k)$ be these points. Clearly, $(x^k, y^k)$ for k = 1, …, n + |I| belong to B and satisfy (5–26) at equality. Now, for each j ∈ N \ I, we construct one new point in B \ B(∅, ∅, ∅, N \ I) that satisfies (5–26) at equality. Since (5–26) is not a bound, there exists $k_j$ for all j ∈ N \ I such that $x^{k_j}_j = 0$, but $x^k_j = 1$ for some $k \ne k_j$. For each j ∈ N \ I, pick $(x^{k_j}, y^{k_j})$ and define a new point $(\bar x^{k_j}, \bar y^{k_j})$ such that $\bar x^{k_j}_i = x^{k_j}_i$ for all i ∈ N, $\bar y^{k_j}_i = y^{k_j}_i$ for all i ∈ N \ {j}, and $\bar y^{k_j}_j = 0$. Clearly, $(\bar x^{k_j}, \bar y^{k_j})$ belongs to B and satisfies (5–26) at equality. Further, it is easily seen that, together with $(x^k, y^k)$, all $(\bar x^{k_j}, \bar y^{k_j})$ are linearly independent and therefore show that (5–26) is facet-defining for PB.
To prove the reverse implication, we assume that (5–26) is a nontrivial facet-defining inequality for PB. Validity is trivial since B(∅, ∅, ∅, N \ I) ⊆ B. Now, we show that (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I). Since δ ≠ 0 as (5–26) is not a bound, the set of 2n affinely independent points $(x^k, y^k)$ for k = 1, …, 2n in B that satisfy (5–26) at equality are also linearly independent. Therefore,
$$\begin{vmatrix} x^1_1 & \cdots & x^1_n & y^1_1 & \cdots & y^1_n \\ x^2_1 & \cdots & x^2_n & y^2_1 & \cdots & y^2_n \\ \vdots & & \vdots & \vdots & & \vdots \\ x^{2n}_1 & \cdots & x^{2n}_n & y^{2n}_1 & \cdots & y^{2n}_n \end{vmatrix} \ne 0.$$
It can be verified that there exist n + |I| rows $i_1, \ldots, i_{n+|I|}$, where $I = \{j_1, \ldots, j_{|I|}\}$, such that
$$\begin{vmatrix} x^{i_1}_1 & \cdots & x^{i_1}_n & y^{i_1}_{j_1} & \cdots & y^{i_1}_{j_{|I|}} \\ \vdots & & \vdots & \vdots & & \vdots \\ x^{i_{n+|I|}}_1 & \cdots & x^{i_{n+|I|}}_n & y^{i_{n+|I|}}_{j_1} & \cdots & y^{i_{n+|I|}}_{j_{|I|}} \end{vmatrix} \ne 0.$$
Hence, the n + |I| points $(x^{i_k}_1, \ldots, x^{i_k}_n;\ y^{i_k}_{j_1}, \ldots, y^{i_k}_{j_{|I|}})$ for k = 1, …, n + |I| are linearly independent. Now, define the points $(\bar x^{i_k}, \bar y^{i_k})$ for k = 1, …, n + |I| such that $\bar x^{i_k} = x^{i_k}$, $\bar y^{i_k}_j = y^{i_k}_j$ for j ∈ I, and $\bar y^{i_k}_j = 1$ for j ∈ N \ I. The points $(\bar x^{i_k}, \bar y^{i_k})$ are feasible for B(∅, ∅, ∅, N \ I) and satisfy (5–26) at equality. Therefore, we conclude that (5–26) is facet-defining for PB(∅, ∅, ∅, N \ I).
Observe that Proposition 5.10 implies that all nontrivial facets of the 0−1 knapsack
polytope can be found in B and that it is sufficient to study the facets of B to know
the facets of the 0−1 knapsack polytope. Next, we use Proposition 5.10 to generalize
inequality x1 + x2 ≥ 1 in Proposition 5.8 into an inequality that we will use as a seed for
lifting procedures in Section 5.3.3.
Proposition 5.11. Assume that $\sum_{j\in N} a_j - a_k - a_m < d$ for all k, m ∈ N with k ≠ m. The cover inequality
$$\sum_{j\in N} x_j \ge |N| - 1 \tag{5–27}$$
is facet-defining for PB.
Proof. Because of Proposition 5.10, it is sufficient to prove that (5–27) is facet-defining for PB(∅, ∅, ∅, N). To prove validity, assume for a contradiction that there exists x′ ∈ B(∅, ∅, ∅, N) such that $\sum_{j\in N} a_j x'_j \ge d$ and $\sum_{j\in N} x'_j \le |N| - 2$. Since $\sum_{j\in N} x'_j \le |N| - 2$, there exist k, m ∈ N with k ≠ m such that $x'_k = 0$ and $x'_m = 0$. Therefore, $\sum_{j\in N} a_j - a_k - a_m \ge \sum_{j\in N} a_j x'_j \ge d$. This contradicts the assumption that $\sum_{j\in N} a_j - a_k - a_m < d$ for all k, m ∈ N with k ≠ m. We next show that (5–27) is facet-defining for PB(∅, ∅, ∅, N). It can be easily verified using Assumption 5.2 that the points $p^k = (\mathbf{1} - e_k, \mathbf{1})$ for k ∈ N belong to B(∅, ∅, ∅, N). Since these points are linearly independent and satisfy (5–27) at equality, we conclude that (5–27) is facet-defining for PB(∅, ∅, ∅, N).
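Proposition 5.11 is easy to sanity-check by enumeration. The sketch below uses invented data (a and d are hypothetical) chosen so that removing any two items drops the total weight below d, as the hypothesis requires:

```python
from itertools import combinations, product

a = [6, 5, 5, 4]     # hypothetical knapsack coefficients
d = 12
n = len(a)

# Hypothesis of Proposition 5.11: sum(a) - a_k - a_m < d for all k != m.
assert all(sum(a) - a[k] - a[m] < d for k, m in combinations(range(n), 2))

# Conclusion (validity part): every feasible point of the 0-1 knapsack set
# {x in {0,1}^n : sum_j a_j x_j >= d} satisfies the cover inequality.
for x in product((0, 1), repeat=n):
    if sum(aj * xj for aj, xj in zip(a, x)) >= d:
        assert sum(x) >= n - 1, f"cover inequality violated at {x}"
print("cover inequality sum_j x_j >= n - 1 is valid on this instance")
```

Intuitively, the hypothesis forces every feasible solution to select all items except possibly one, which is exactly what the cover inequality expresses.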
5.3 Lifted Inequalities
In this section, we derive three families of strong valid inequalities for PB via lifting.
The first two families are obtained using sequence-independent lifting from (5–23) and are
facet-defining for PB. In this case, lifting is simple since the lifting function is subadditive.
The third inequality is obtained by lifting (5–27). Although the lifting function associated
with this seed inequality is not subadditive, we obtain strong lifted inequalities using
approximate lifting. We also identify conditions under which these lifted inequalities are
facet-defining for PB.
5.3.1 Sequence-Independent Lifting for Bilinear Covering Sets
Sequence-independent lifting is a well-known technique to construct strong valid
inequalities for mixed-integer linear programs; see Wolsey [132] and Gu et al. [62]. We
next give a brief description of how the technique can be used to derive strong valid
inequalities for PB. A more general treatment of lifting in nonlinear programming is given
in Richard and Tawarmalani [100].
Given $\emptyset \ne S \subsetneq N$, consider B(S, ∅, S, ∅), which is the restriction of B obtained when all variables (x_j, y_j) for j ∈ S are fixed to (0, 0). Let S = {s, …, n} for some s ≥ 2 and define $S_i = \{i + 1, \ldots, n\}$ for i ∈ S. Assume that the inequality
$$\sum_{j=1}^{s-1} \alpha_j x_j + \sum_{j=1}^{s-1} \beta_j y_j \ge \delta \tag{5–28}$$
is facet-defining for PB(S, ∅, S, ∅). In sequential lifting, we reintroduce the variables (x_j, y_j) for j ∈ S one at a time in (5–28). Assuming that variables (x_j, y_j) have already been lifted in the order j = s, …, i − 1, we next review how to lift variables (x_i, y_i) in the inequality
$$\sum_{j=1}^{i-1} \alpha_j x_j + \sum_{j=1}^{i-1} \beta_j y_j \ge \delta, \tag{5–29}$$
which is assumed to be facet-defining for $P_B(S_{i-1}, \emptyset, S_{i-1}, \emptyset)$. To perform this lifting, we first compute the lifting function
$$\begin{aligned} P^i(w) = \max\ & \delta - \Big\{\sum_{j=1}^{i-1} \alpha_j x_j + \sum_{j=1}^{i-1} \beta_j y_j\Big\} \\ \text{s.t.}\ & \sum_{j=1}^{i-1} a_j x_j y_j \ge d - w, \\ & x_j \in \{0,1\},\ y_j \in [0,1], \quad j = 1, \ldots, i-1. \end{aligned}$$
Once the lifting function $P^i(w)$ is computed, the lifting coefficients (α_i, β_i) can then be obtained from $P^i(w)$ as follows.
Proposition 5.12 (Richard and Tawarmalani [100]). Let (5–29) be a valid inequality for the set $B(S_{i-1}, \emptyset, S_{i-1}, \emptyset)$. Assume that there exist (α_i, β_i) such that
$$\alpha_i x_i + \beta_i y_i \ge P^i(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–30}$$
Then, the inequality
$$\sum_{j=1}^{i} \alpha_j x_j + \sum_{j=1}^{i} \beta_j y_j \ge \delta \tag{5–31}$$
is valid for $B(S_i, \emptyset, S_i, \emptyset)$.

The result of Proposition 5.12 can be applied recursively to construct a valid
inequality for PB from (5–28). Note that, at each step, the lifting function $P^i(w)$ must be recomputed to account for the changes in the lifted inequality. Further, if B(S, ∅, S, ∅) is full-dimensional, the seed inequality (5–28) is facet-defining for B(S, ∅, S, ∅), and, for each i ∈ S, the lifting coefficients (α_i, β_i) of the variables (x_i, y_i) are chosen so that (5–30) is satisfied at equality by two points (x_i, y_i), then the lifted inequality will be facet-defining for PB. Computing the lifting functions $P^i(w)$ for each i ∈ S might be computationally undesirable. However, such computation is unnecessary when the lifting function $P^s(w)$ is subadditive, as described in Proposition 5.13. This observation, first made by Wolsey [132], leads to the following result.
Proposition 5.13 (Richard and Tawarmalani [100]). Assume that (5–28) is valid for B(S, ∅, S, ∅). Assume also that (i) $P^s(w)$ is subadditive, i.e., $P^s(w_1) + P^s(w_2) \ge P^s(w_1 + w_2)$ for all $w_1, w_2 \in \mathbb{R}_+$, and (ii) there exist (α_i, β_i) for all i ∈ S such that
$$\alpha_i x_i + \beta_i y_i \ge P^s(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–32}$$
Then, the inequality
$$\sum_{j=1}^{n} \alpha_j x_j + \sum_{j=1}^{n} \beta_j y_j \ge \delta \tag{5–33}$$
is valid for PB. Further, if (5–28) is facet-defining for B(S, ∅, S, ∅) and (α_i, β_i) are chosen in a way that two points satisfy (5–32) at equality, then (5–33) is facet-defining for PB.
The main difference between Proposition 5.12 and Proposition 5.13 is that, in the latter, the lifting coefficients of all variables (x_i, y_i) can be obtained from the same lifting function $P^s(w)$ and not from $P^i(w)$ for i ∈ S. Note that in Proposition 5.13, it is sufficient to require the subadditivity of $P^s(w)$ over $w \in \mathbb{R}_+$ since all coefficients $a_i$ in PB are assumed to be nonnegative.
Proposition 5.12 and Proposition 5.13 consider the case where all variables (x_j, y_j) for j ∈ S are fixed at (0, 0). When variables (x_j, y_j) are fixed at (1, 1), similar results can be obtained. In this case, condition (5–30) must be changed to
$$\alpha_i(1 - x_i) + \beta_i(1 - y_i) \le -P^i(a_i x_i y_i - a_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(1,1)\}. \tag{5–34}$$
Further, we can perform sequence-independent lifting for variables (x_j, y_j) fixed at (1, 1) if the lifting function $P^i(w)$ is subadditive over $w \in \mathbb{R}_-$.
5.3.2 Lifted Inequalities by Sequence-Independent Lifting
To derive a strong inequality through lifting, we first must obtain a seed inequality. In
this section, we will use (5–23) as the seed inequality. To identify this form of inequality,
we introduce the notion of a cover, which is adapted from the definition of a cover for the
0−1 knapsack polytope; see Balas [14], Wolsey [130], and Hammer et al. [63].
Definition 5.1. Let C ⊆ N . We say that C is a cover for B if∑
j∈C aj > d. Further, we
define the excess of the cover as µ =∑
j∈C aj − d > 0.
We will create lifted inequalities by first partitioning the set of variables N into (C, M, T) in such a way that:

(A1) C is a cover for B with excess µ,

(A2) $a_l > \mu$ where $l \in \mathrm{argmax}\{a_j \mid j \in C\}$,

(A3) $\sum_{j\in C\cup T} a_j > d + a_l$, i.e., $\sum_{j\in T} a_j > a_l - \mu$.

Note that (A1) and (A2) might be reminiscent of conditions that make a cover minimal for the 0−1 knapsack polytope. We note however that minimal covers require $a_j > \mu$ for all j ∈ C and not simply $a_l > \mu$. Note also that (A3) implies that T ≠ ∅. To obtain lifted inequalities from (C, M, T), we first fix the variables (x_j, y_j) for j ∈ M to (0, 0) and the variables (x_j, y_j) for j ∈ C \ {l} to (1, 1). The resulting set B(M, C \ {l}, M, C \ {l}) is then defined by the inequality
$$a_l x_l y_l + \sum_{j\in T} a_j x_j y_j \ge d - \sum_{j\in C\setminus\{l\}} a_j = a_l - \mu.$$
From Assumption (A3), we observe that $\sum_{j\in T} a_j > a_l + d - \sum_{j\in C} a_j = a_l - \mu$. Since $a_l > \mu > 0$, we conclude from Proposition 5.9 that
$$(a_l - \mu)x_l + \sum_{j\in T} a_j y_j \ge a_l - \mu \tag{5–35}$$
is facet-defining for PB(M, C \ {l}, M, C \ {l}). We will create two different families of lifted inequalities for PB by reintroducing the variables (x_j, y_j) for j ∈ M ∪ C \ {l} in different orders. To derive both facets, we first must compute the lifting function
$$\begin{aligned} P(w) := \max\ & (a_l - \mu) - \Big\{(a_l - \mu)x_l + \sum_{j\in T} a_j y_j\Big\} \\ \text{s.t.}\ & a_l x_l y_l + \sum_{j\in T} a_j x_j y_j \ge a_l - \mu - w \qquad (5\text{–}36) \\ & x_j \in \{0,1\},\ y_j \in [0,1] \quad \forall j \in \{l\} \cup T. \end{aligned}$$
Function P(w) can be expressed in closed form as follows.

Proposition 5.14.
$$P(w) = \begin{cases} -\infty & \text{if } w < -\sum_{j\in T} a_j - \mu, \\ w + \mu & \text{if } -\sum_{j\in T} a_j - \mu \le w < -\mu, \\ 0 & \text{if } -\mu \le w < 0, \\ w & \text{if } 0 \le w < a_l - \mu, \\ a_l - \mu & \text{if } a_l - \mu \le w. \end{cases}$$
Further, P(w) is subadditive over $\mathbb{R}_-$ and over $\mathbb{R}_+$.
Proof. We first compute P(w). Observe that there exists an optimal solution (x∗, y∗) to (5–36) for which $x^*_j = 1$ for all j ∈ T and $y^*_l = 1$, since the coefficients of $x_j$ for j ∈ T and of $y_l$ in the objective are equal to 0. Further, define $\bar a = \sum_{j\in T} a_j$ and $\bar y = \sum_{j\in T} a_j y_j / \bar a$. Using these definitions, we can simplify the computation of P(w) in (5–36) as:
$$\begin{aligned} P(w) = \max\ & (a_l - \mu) - \big\{(a_l - \mu)x_l + \bar a \bar y\big\} \\ \text{s.t.}\ & a_l x_l + \bar a \bar y \ge a_l - \mu - w \qquad (5\text{–}37) \\ & x_l \in \{0,1\},\ \bar y \in [0,1]. \end{aligned}$$
Now, we solve (5–37). When $w < -\bar a - \mu$, (5–37) is infeasible and so P(w) = −∞. When $w \ge a_l - \mu$, the optimal solution is $x^*_l = 0$ and $\bar y^* = 0$. When $-\bar a - \mu \le w < a_l - \mu$, it is simple to verify that optimal solutions to (5–37) are given by:
$$(x^*_l, \bar y^*) = \begin{cases} \big(1, \frac{-\mu - w}{\bar a}\big) & \text{if } -\bar a - \mu \le w < -\mu, \\ (1, 0) & \text{if } -\mu \le w < 0, \\ \big(0, \frac{a_l - \mu - w}{\bar a}\big) & \text{if } 0 \le w < a_l - \mu. \end{cases}$$
Using $(x^*_l, \bar y^*)$ in (5–37) and substituting back $\sum_{j\in T} a_j$ for $\bar a$, we obtain the desired expression for P(w).

Next, we prove that P(w) is subadditive over (−∞, 0] by showing that, for $w_1, w_2 \in \mathbb{R}_-$, $P(w_1) + P(w_2) \ge P(w_1 + w_2)$. We consider the following three cases:

1. Assume $-\bar a - \mu \le w_1 < -\mu$ and $-\bar a - \mu \le w_2 < -\mu$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = w_1 + w_2 + 2\mu > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = w_1 + w_2 + 2\mu \ge w_1 + w_2 + \mu = P(w_1 + w_2)$.

2. Assume $-\bar a - \mu \le w_1 < -\mu$ and $-\mu \le w_2 \le 0$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = w_1 + \mu + 0 > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = w_1 + \mu + 0 \ge w_1 + w_2 + \mu = P(w_1 + w_2)$.

3. Assume $-\mu \le w_1 \le 0$ and $-\mu \le w_2 \le 0$. If $w_1 + w_2 < -\bar a - \mu$, then $P(w_1) + P(w_2) = 0 + 0 > -\infty = P(w_1 + w_2)$. If $-\bar a - \mu \le w_1 + w_2 < -\mu$, then $P(w_1) + P(w_2) = 0 + 0 \ge w_1 + w_2 + \mu = P(w_1 + w_2)$. If $-\mu \le w_1 + w_2 < 0$, then $P(w_1) + P(w_2) = 0 + 0 = 0 = P(w_1 + w_2)$.
Finally, we show that P (w) is subadditive over R+. We consider the following three cases:
1. Assume 0 ≤ w1 < al − µ and 0 ≤ w2 < al − µ. If w1 + w2 < al − µ, then
P (w1) + P (w2) = w1 + w2 = P (w1 + w2). If w1 + w2 ≥ al − µ, then P (w1) + P (w2) =
w1 + w2 ≥ al − µ = P (w1 + w2).
2. Assume 0 ≤ w1 < al − µ and w2 ≥ al − µ. Since w1 + w2 ≥ al − µ as w1 ≥ 0,
P (w1) + P (w2) = w1 + al − µ ≥ al − µ = P (w1 + w2).
3. Assume w1 ≥ al − µ and w2 ≥ al − µ. Since w1 + w2 ≥ al − µ, P (w1) + P (w2) =
al − µ+ al − µ > al − µ = P (w1 + w2).
It is interesting to note that P (w) is not subadditive over R as P (2al − µ) + P (−al) =
(al − µ) + (−al + µ) = 0 < al − µ = P (al − µ).
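Since Proposition 5.14 reduces the lifting problem to the two-variable program (5–37), its closed form can be cross-checked by solving (5–37) directly. The sketch below uses a_l = 15, µ = 5, and $\bar a$ = 17 (the values that arise from Example 5.2 later in this section); for each choice of $x_l$, the cheapest feasible $\bar y$ is taken:

```python
def P_closed(w, al, mu, abar):
    """Closed form of the lifting function P(w) from Proposition 5.14."""
    if w < -abar - mu:
        return float("-inf")
    if w < -mu:
        return w + mu
    if w < 0:
        return 0.0
    if w < al - mu:
        return w
    return al - mu

def P_brute(w, al, mu, abar):
    """Solve the simplified problem (5-37) directly: for each x_l in {0, 1},
    take the smallest ybar meeting the covering constraint, if any."""
    best = float("-inf")
    for xl in (0, 1):
        need = (al - mu - w - al * xl) / abar
        ybar = min(max(need, 0.0), 1.0)
        if al * xl + abar * ybar >= al - mu - w - 1e-9:   # feasible choice?
            best = max(best, (al - mu) - ((al - mu) * xl + abar * ybar))
    return best

# Data matching Example 5.2: a_l = 15, mu = 5, abar = sum_{j in T} a_j = 17.
al, mu, abar = 15.0, 5.0, 17.0
for k in range(-250, 200):
    w = k / 10.0
    pc, pb = P_closed(w, al, mu, abar), P_brute(w, al, mu, abar)
    assert pc == pb or abs(pc - pb) < 1e-6, w
print("closed form of P(w) agrees with direct solution of (5-37)")
```

The grid covers all five pieces of the closed form, including the infeasible region where both functions return −∞.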
5.3.2.1 Lifted bilinear cover inequalities
To obtain lifted bilinear cover inequalities, we will lift first the variables (xi, yi) for
i ∈ C \ {l} from (1, 1) and then lift the variables (xi, yi) for i ∈ M from (0, 0). Since P (w)
is subadditive over (−∞, 0], we can apply sequence-independent lifting for the variables
(xi, yi) for i ∈ C \ {l} using the result of Proposition 5.13.
Proposition 5.15. Under Assumptions (A1), (A2), and (A3),
$$\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j \ge \sum_{j\in C} (a_j - \mu)^+ \tag{5–38}$$
is facet-defining for PB(M, ∅, M, ∅).
Proof. The seed inequality (5–35) is facet-defining for PB(M, C \ {l}, M, C \ {l}). Since P(w) is subadditive over (−∞, 0], it follows from Proposition 5.13 that the lifting coefficients (α_i, β_i) for (x_i, y_i), i ∈ C \ {l}, are valid if they satisfy
$$\alpha_i(x_i - 1) + \beta_i(y_i - 1) \ge P(a_i x_i y_i - a_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(1,1)\}. \tag{5–39}$$
This condition can also be written as:
$$\beta_i \le \inf_{0 \le \varphi < 1} \frac{-P(a_i \varphi - a_i)}{1 - \varphi}, \tag{5–40}$$
$$\alpha_i + \sup_{0 \le \varphi \le 1} \beta_i(1 - \varphi) \le -P(-a_i). \tag{5–41}$$
In (5–40), it is easily verified using Assumption (A3) that $a_i \varphi - a_i \in (-\sum_{j\in T} a_j - \mu,\ 0)$ for $0 \le \varphi < 1$. Since $P(w) \le 0$ for $w \le 0$, we conclude that
$$\frac{-P(a_i \varphi - a_i)}{1 - \varphi} \ge 0 \quad \text{for all } 0 \le \varphi < 1,$$
and therefore choosing $\beta_i = 0$ for i ∈ C \ {l} satisfies (5–40). Further, as $\beta_i = 0$, it is simple to verify that choosing $\alpha_i = -P(-a_i) = (a_i - \mu)^+$ satisfies (5–41). Finally, note that (5–39) is satisfied at equality by the two points (0, 0) and $\big(1, \frac{(a_i - \mu)^+}{a_i}\big)$, which are affinely independent of (1, 1). Therefore, we conclude that (5–38) is facet-defining for PB(M, ∅, M, ∅).
Now, we lift the variables (x_j, y_j) for j ∈ M in (5–38). The corresponding lifting function $P_C(w)$ is defined as
$$\begin{aligned} P_C(w) := \max\ & \sum_{j\in C} (a_j - \mu)^+ - \Big\{\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j\Big\} \\ \text{s.t.}\ & \sum_{j\in C\cup T} a_j x_j y_j \ge \sum_{i\in C} a_i - \mu - w \qquad (5\text{–}42) \\ & x_j \in \{0,1\},\ y_j \in [0,1] \quad \forall j \in C \cup T. \end{aligned}$$
We now derive a closed-form expression for $P_C(w)$. To this end, we assume without loss of generality that C = {1, …, p} and that $a_1 \ge a_2 \ge \ldots \ge a_p$. We also let q ∈ C be such that $a_q > \mu \ge a_{q+1}$. We define $A_0 = 0$ and $A_i = \sum_{j=1}^i a_j$ for all i ∈ C. We observe that $A_p = \sum_{j=1}^p a_j = d + \mu$.

Proposition 5.16. For w ≥ 0,
$$P_C(w) = \begin{cases} w - i\mu & \text{if } A_i \le w < A_{i+1} - \mu,\ i = 0, \ldots, q-1, \\ A_i - i\mu & \text{if } A_i - \mu \le w < A_i,\ i = 1, \ldots, q-1, \\ A_q - q\mu & \text{if } A_q - \mu \le w. \end{cases}$$
Proof. First, observe that there exists an optimal solution (x∗, y∗) of (5–42) in which $x^*_j = 1$ for j ∈ T and $y^*_j = 1$ for j ∈ C, since the corresponding objective coefficients are zero. Since $a_q > \mu \ge a_{q+1}$, we have $(a_j - \mu)^+ = 0$ for j = q + 1, …, p, which similarly implies that we can assume $x^*_j = 1$ for j = q + 1, …, p. Further, using the same notation $\bar a$ and $\bar y$ as in the proof of Proposition 5.14, we can simplify the expression of $P_C(w)$ as
$$\begin{aligned} P_C(w) = \max\ & \sum_{j=1}^{q} (a_j - \mu) - \Big\{\sum_{j=1}^{q} (a_j - \mu)x_j + \bar a \bar y\Big\} \\ \text{s.t.}\ & \sum_{j=1}^{q} a_j x_j + \bar a \bar y \ge \sum_{j=1}^{q} a_j - \mu - w \qquad (5\text{–}43) \\ & x_j \in \{0,1\},\ j = 1, \ldots, q,\ \bar y \in [0,1]. \end{aligned}$$
Next, we compute $P_C(w)$ by solving (5–43). Let $\bar w = A_q - \mu - w$. We claim that there exists an optimal solution in which $x^*_1 \le x^*_2 \le \ldots \le x^*_q$. Consider the following three cases.
1. Assume that $w \ge A_q - \mu$. Since $\bar w \le 0$, $x^*_j = 0$ for j = 1, …, q and $\bar y^* = 0$ is a feasible solution that is easily verified to be optimal. Therefore, $P_C(w) = A_q - q\mu$.

2. Assume that $A_i - \mu \le w < A_{i+1} - \mu$ (i.e., $A_q - A_{i+1} < \bar w \le A_q - A_i$) for $i \in \{1, \ldots, q-1\}$. Let $\theta = (A_{i+1} - \mu) - w$. Clearly, $0 < \theta \le a_{i+1}$. Define first the solution $s^1_i = (x^*, \bar y^*)$ where $x^*_j = 0$ for j = 1, …, i + 1, $x^*_j = 1$ for j = i + 2, …, q, and $\bar y^* = \theta/\bar a$. When $\theta \le \bar a$, $s^1_i$ is a feasible solution whose objective value we denote as $\Gamma^*(s^1_i) = A_{i+1} - (i+1)\mu - \theta = w - i\mu$. Define now another solution $s^2_i = (x^*, \bar y^*)$ where $x^*_j = 0$ for j = 1, …, i, $x^*_j = 1$ for j = i + 1, …, q, and $\bar y^* = 0$. Then $s^2_i$ is a feasible solution with objective value $\Gamma^*(s^2_i) = A_i - i\mu$. Since $a_{i+1} - \mu \le a_1 - \mu \le \bar a$, it is clear that $\Gamma^*(s^1_i) \ge \Gamma^*(s^2_i)$ when $\theta \le a_{i+1} - \mu$. Further, $\Gamma^*(s^2_i) \ge \Gamma^*(s^1_i)$ when $a_{i+1} - \mu \le \theta \le a_{i+1}$. Therefore, we conclude that $P_C(w) \ge \Gamma^*(s^1_i)$ if $A_i \le w < A_{i+1} - \mu$ and $P_C(w) \ge \Gamma^*(s^2_i)$ if $A_i - \mu \le w \le A_i$.

We now prove that these solutions are optimal. Pick any other feasible solution $s_i = (x, \bar y)$ that does not have the form $x_1 \le x_2 \le \ldots \le x_q$. Define $N_1 = \{j \in \{1, \ldots, q\} \mid x_j = 1\}$ and $N_0 = \{j \in \{1, \ldots, q\} \mid x_j = 0\}$. Consider the case where $|N_1| = q - i + k$ for k ≥ 0. Since $\sum_{j=1}^q a_j x_j + \bar a \bar y \ge \sum_{j=1}^q a_j x_j \ge A_q - A_{i-k}$, the corresponding objective value is $\Gamma^*(s_i) = \sum_{j=1}^q (a_j - \mu)(1 - x_j) - \bar a \bar y \le A_{i-k} - (i-k)\mu = A_i - i\mu - \sum_{j=i-k+1}^{i} (a_j - \mu) \le \Gamma^*(s^2_i)$. Next, consider the case where $|N_1| = q - i - k$ for k ≥ 1. Since $\sum_{j=1}^q a_j x_j + \bar a \bar y \ge \bar w = A_q - A_{i+1} + \theta$ from feasibility, the corresponding objective value is $\Gamma^*(s_i) = \sum_{j=1}^q (a_j - \mu)(1 - x_j) - \bar a \bar y \le A_{i+1} - \theta - \mu(i + k) \le \Gamma^*(s^1_i)$. Note that if $s^1_i$ is infeasible, $s^2_i$ is always feasible and dominates it.

3. Assume that $0 \le w < A_1 - \mu$. Since $A_1 - \mu \le \bar a$, the feasible solution $x^*_1 = 0$, $x^*_j = 1$ for j = 2, …, q, and $\bar y^* = \frac{A_1 - \mu - w}{\bar a}$ is optimal, which implies that $P_C(w) = w$.
We will now perform sequence-independent lifting for the remaining variables in M using Proposition 5.13. In order to apply this result, we first establish that $P_C(w)$ is subadditive. To this end, we use the following proposition.
Proposition 5.17. Let ν and $D_i$ for i = 0, 1, …, r be nonnegative. Assume that ν > 0, $D_0 = 0$, and $D_i \ge D_{i-1} + \nu$ for i = 1, …, r. Then the function
$$g(w) := \begin{cases} w - i\nu & \text{if } D_i \le w < D_{i+1} - \nu,\ i = 0, \ldots, r-1, \\ D_i - i\nu & \text{if } D_i - \nu \le w < D_i,\ i = 1, \ldots, r-1, \\ D_r - r\nu & \text{if } D_r - \nu \le w \end{cases}$$
is subadditive over $\mathbb{R}_+$ if and only if $D_i + D_j \ge D_{i+j}$ for $0 \le i, j \le r$ with $i + j \le r$.
Proof. Assume that $D_i + D_j \ge D_{i+j}$ for $0 \le i \le j \le r$ with $i + j \le r$. We want to prove that $g(x) + g(y) \ge g(x + y)$ for $x, y \in \mathbb{R}_+$. Assume for a contradiction that there exist $x, y \in \mathbb{R}_+$ such that $g(x) + g(y) < g(x + y)$. We claim first that there exist $x', y' \in \mathbb{R}_+$ with $x' = D_i$ for some $i \in \{0, \ldots, r\}$ such that $g(x') + g(y') < g(x' + y')$. We consider three cases:

1. If $D_i \le x \le D_{i+1} - \nu$ for $i \in \{0, \ldots, r-1\}$, then let $x' = D_i$ and $y' = y$. Clearly, $g(x') = g(x) + D_i - x$ and $g(y') = g(y)$. Further, $g(x' + y') = g(x + y + D_i - x) \ge g(x + y) + D_i - x$ since $D_i \le x$ and the function g has slope 0 or 1. Therefore, we have that $g(x') + g(y') = g(x) + g(y) + D_i - x < g(x + y) + D_i - x \le g(x' + y')$.

2. If $D_i - \nu \le x \le D_i$ for $i \in \{1, \ldots, r-1\}$, then let $x' = D_i$ and $y' = y + x - D_i$. Clearly, $g(x') = g(x)$ and $g(y') \le g(y)$ since $x \le D_i$ and g is nondecreasing. Therefore, we have that $g(x') + g(y') \le g(x) + g(y) < g(x + y) = g(x' + y')$.

3. If $D_r - \nu \le x$, then let $x' = D_r - \nu$ and $y' = y$. Clearly, $g(x') = g(x)$ and $g(y') = g(y)$. Further, $g(x' + y') = g(D_r - \nu + y) = g(x + y)$ since $y \ge 0$ and $x + y \ge D_r - \nu$. Therefore, we have that $g(x') + g(y') = g(x) + g(y) < g(x + y) = g(x' + y')$.
We claim next that there exist $\bar x, \bar y \in \mathbb{R}_+$ with $\bar x = D_i$ and $\bar y = D_j$ for some $i, j \in \{0, \ldots, r\}$ such that $g(\bar x) + g(\bar y) < g(\bar x + \bar y)$. We consider three cases:

1. If $D_j \le y' \le D_{j+1} - \nu$ for $j \in \{0, \ldots, r-1\}$, then let $\bar x = x'$ and $\bar y = D_j$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y') + D_j - y'$. Further, $g(\bar x + \bar y) = g(x' + y' + D_j - y') \ge g(x' + y') + D_j - y'$ since $D_j \le y'$ and the function g has slope 0 or 1. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') + D_j - y' < g(x' + y') + D_j - y' \le g(\bar x + \bar y)$.

2. If $D_j - \nu \le y' \le D_j$ for $j \in \{1, \ldots, r-1\}$, then let $\bar x = x'$ and $\bar y = D_j$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y')$. Further, $g(\bar x + \bar y) \ge g(x' + y')$ since $y' \le D_j$ and g is nondecreasing. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') < g(x' + y') \le g(\bar x + \bar y)$.

3. If $D_r - \nu \le y'$, then let $\bar x = x'$ and $\bar y = D_r - \nu$. Clearly, $g(\bar x) = g(x')$ and $g(\bar y) = g(y')$. Further, $g(\bar x + \bar y) = g(x' + D_r - \nu) = g(x' + y')$ since $x' \ge 0$ and $x' + y' \ge D_r - \nu$. Therefore, we have that $g(\bar x) + g(\bar y) = g(x') + g(y') < g(x' + y') = g(\bar x + \bar y)$.
We conclude that there exist $i, j \in \{0, \ldots, r\}$ such that $g(D_i) + g(D_j) < g(D_i + D_j)$. Since $g(D_i) = D_i - i\nu$ and $g(D_j) = D_j - j\nu$, we have that $D_i + D_j - (i+j)\nu < g(D_i + D_j)$. We consider the following three cases:

1. If $D_q \le D_i + D_j < D_{q+1} - \nu$ for $q \in \{0, \ldots, r-1\}$, then $g(D_i + D_j) = D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_i + D_j - q\nu$, which implies that $q < i + j$. Since q is integer, we obtain that $q + 1 \le i + j$ and $D_{q+1} \le D_{i+j}$. Further, since $D_i + D_j < D_{q+1} - \nu < D_{q+1} \le D_{i+j}$, we conclude that $D_i + D_j < D_{i+j}$, which is a contradiction to the hypothesis $D_i + D_j \ge D_{i+j}$.

2. If $D_q - \nu \le D_i + D_j < D_q$ for $q \in \{1, \ldots, r-1\}$, then $g(D_i + D_j) = D_q - q\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_q - q\nu$. Using $D_q - \nu \le D_i + D_j$, we have that $D_q - \nu - (i+j)\nu \le D_i + D_j - (i+j)\nu < D_q - q\nu$, which implies that $i + j + 1 > q$. Since q is integer, it is clear that $q \le i + j$. Further, since $D_i + D_j < D_q \le D_{i+j}$, we conclude that $D_i + D_j < D_{i+j}$, which is a contradiction.

3. If $D_i + D_j \ge D_r - \nu$, then $g(D_i + D_j) = D_r - r\nu$. Since $D_i + D_j - (i+j)\nu < g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu < D_r - r\nu$. Using $D_r - \nu \le D_i + D_j$, we obtain that $D_r - \nu - (i+j)\nu \le D_i + D_j - (i+j)\nu < D_r - r\nu$, which implies that $i + j + 1 > r$. Since i, j, and r are integers, $i + j \ge r$. Further, since $D_l \ge D_{l-1} + \nu$ for $l = 1, \ldots, r$, we have that $D_j \ge D_{r-i} + (j + i - r)\nu$. Combining $D_j \ge D_{r-i} + (j + i - r)\nu$ and $D_i + D_j - (i+j)\nu < D_r - r\nu$, we obtain that $D_i + D_{r-i} - r\nu < D_r - r\nu$. Therefore, we conclude that $D_i + D_{r-i} < D_r$, which is a contradiction. This completes the first part of the proof.
To prove the reverse implication, assume now that g is subadditive. We want to prove that $D_i + D_j \ge D_{i+j}$ for $0 \le i \le j \le r$ with $i + j \le r$. As shown before, we can take i and j such that $g(D_i) = D_i - i\nu$ and $g(D_j) = D_j - j\nu$. Since g is subadditive, i.e., $g(D_i) + g(D_j) \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$. We consider the following three cases:

1. If $D_q \le D_i + D_j < D_{q+1} - \nu$ for $q \in \{0, \ldots, r-1\}$, then $g(D_i + D_j) = D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge D_i + D_j - q\nu$, which implies $q \ge i + j$ and $D_q \ge D_{i+j}$. Therefore, we obtain that $D_i + D_j \ge D_{i+j}$.

2. If $D_q - \nu \le D_i + D_j < D_q$ for $q \in \{1, \ldots, r-1\}$, then $g(D_i + D_j) = D_q - q\nu > D_i + D_j - q\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu > D_i + D_j - q\nu$, which implies that $i + j < q$. Using $D_q \ge D_{i+j} + (q - (i+j))\nu$, we obtain that $D_i + D_j - (i+j)\nu \ge D_q - q\nu \ge D_{i+j} - (i+j)\nu$. Therefore, we conclude that $D_i + D_j \ge D_{i+j}$.

3. If $D_i + D_j \ge D_r - \nu$, then $g(D_i + D_j) = D_r - r\nu$. Since $D_i + D_j - (i+j)\nu \ge g(D_i + D_j)$, it follows that $D_i + D_j - (i+j)\nu \ge D_r - r\nu$. Since $i + j \le r$ from the assumption, $D_r \ge D_{i+j} + (r - (i+j))\nu$. Therefore, we obtain that $D_i + D_j - (i+j)\nu \ge D_{i+j} - (i+j)\nu$, which implies that $D_i + D_j \ge D_{i+j}$.
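The characterization in Proposition 5.17 lends itself to a numerical spot check. The sketch below (with invented data D and ν) builds g(w) and tests subadditivity on a grid; the first instance satisfies $D_i + D_j \ge D_{i+j}$ while the second violates it:

```python
def make_g(D, nu):
    """Piecewise-linear function g from Proposition 5.17 (requires D[0] = 0
    and D[i] >= D[i-1] + nu)."""
    r = len(D) - 1
    def g(w):
        if w >= D[r] - nu:
            return D[r] - r * nu
        for i in range(r):
            if D[i] <= w < D[i + 1] - nu:
                return w - i * nu           # slope-1 piece
            if i >= 1 and D[i] - nu <= w < D[i]:
                return D[i] - i * nu        # flat piece
        raise ValueError("w must be nonnegative")
    return g

def subadditive_on_grid(g, hi, step=0.25):
    """Check g(x) + g(y) >= g(x + y) on a grid of points in [0, hi]."""
    pts = [k * step for k in range(int(hi / step) + 1)]
    return all(g(x) + g(y) >= g(x + y) - 1e-9 for x in pts for y in pts)

nu = 2.0
D_good = [0.0, 3.0, 6.0, 9.0]   # D_i + D_j >= D_{i+j} holds: subadditive
D_bad = [0.0, 2.0, 7.0, 9.0]    # D_1 + D_1 = 4 < D_2 = 7: not subadditive

for D, expect in ((D_good, True), (D_bad, False)):
    g = make_g(D, nu)
    cond = all(D[i] + D[j] >= D[i + j]
               for i in range(len(D)) for j in range(len(D))
               if i + j < len(D))
    assert subadditive_on_grid(g, 2 * D[-1]) == expect == cond
print("Proposition 5.17 characterization confirmed on both instances")
```

A grid check is of course only necessary evidence for subadditivity, but it suffices to expose the violation in the second instance (at x = y = D_1).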
We now prove that the lifting function PC(w) is subadditive.
Corollary 5.1. The lifting function PC(w) is subadditive over R+.
Proof. In Proposition 5.17, define ν = µ, r = q, and Di = Ai. Since ai ≥ µ for i = 1, . . . , q,
it is clear that Ai ≥ Ai−1 + µ. Further, since Ai is defined as the sum of the largest i
coefficients in C, it is clear that Ai + Aj ≥ Ai+j for 0 ≤ i, j ≤ p with i + j ≤ p. Therefore,
Proposition 5.17 shows that PC(w) is subadditive over R+.
We next illustrate the results of Propositions 5.15 and 5.16 as well as Corollary 5.1 on an example.

Example 5.2. Consider the 0−1 mixed-integer bilinear covering set
$$B = \big\{(x, y) \in \{0,1\}^5 \times [0,1]^5 \;\big|\; 21x_1y_1 + 19x_2y_2 + 17x_3y_3 + 15x_4y_4 + 10x_5y_5 \ge 20\big\}.$$
Let (C, M, T) = ({4, 5}, {1, 2}, {3}). Clearly, (C, M, T) satisfies Assumptions (A1)−(A3) since C is a cover with µ = 5, $a_l = a_4 > \mu$, and $\sum_{j\in C\cup T} a_j = 17 + 15 + 10 > 20 + 15 = d + a_l$. We obtain from Proposition 5.15 that the inequality
$$17y_3 + 10x_4 + 5x_5 \ge 15 \tag{5–44}$$
is facet-defining for PB(M, ∅, M, ∅). Using the result of Proposition 5.16, we obtain that the lifting function $P_C(w)$ is given by
$$P_C(w) = \begin{cases} w & \text{if } 0 \le w < 15 - 5 = 10, \\ 10 & \text{if } 10 \le w < 15, \\ w - 5 & \text{if } 15 \le w < 15 + 10 - 5 = 20, \\ 20 - 5 = 15 & \text{if } 20 \le w. \end{cases}$$
Corollary 5.1 establishes that this function is subadditive over R+. Function PC(w) is
represented in Figure 5-1.
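The closed form of $P_C(w)$ for Example 5.2 can be confirmed by solving (5–43) directly: for each 0−1 pattern of the cover variables, the cheapest feasible $\bar y$ is determined in closed form. A minimal sketch:

```python
from itertools import product

# Example 5.2: cover C = {4, 5} with sorted coefficients 15 and 10, mu = 5,
# and abar = sum_{j in T} a_j = 17.
aC, mu, abar = [15.0, 10.0], 5.0, 17.0

def PC_closed(w):
    """Closed form of P_C(w) for Example 5.2 (Proposition 5.16)."""
    if w < 10:
        return w          # 0 <= w < A_1 - mu
    if w < 15:
        return 10.0       # A_1 - mu <= w < A_1
    if w < 20:
        return w - 5      # A_1 <= w < A_2 - mu
    return 15.0           # A_2 - mu <= w

def PC_brute(w):
    """Solve (5-43) by enumerating x and taking the cheapest feasible ybar."""
    rhs = sum(aC) - mu - w
    best = float("-inf")
    for x in product((0, 1), repeat=len(aC)):
        covered = sum(a * xj for a, xj in zip(aC, x))
        ybar = min(max((rhs - covered) / abar, 0.0), 1.0)
        if covered + abar * ybar >= rhs - 1e-9:
            val = sum((a - mu) * (1 - xj) for a, xj in zip(aC, x)) - abar * ybar
            best = max(best, val)
    return best

for k in range(0, 300):
    w = k / 10.0
    assert abs(PC_closed(w) - PC_brute(w)) < 1e-6, w
print("closed form of P_C(w) matches enumeration on [0, 30)")
```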
We now compute the lifting coefficients of variables (x_i, y_i) for i ∈ M from $P_C(w)$. It follows from Proposition 5.13 that lifting coefficients (α_i, β_i) for i ∈ M must be chosen in such a way that
$$\alpha_i x_i + \beta_i y_i \ge P_C(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–45}$$
For the problem described in Example 5.2, $P_C(a_i x_i y_i)$ is represented in Figure 5-2(a). In this figure, we observe that $P_C(a_i x_i y_i)$ is constant when $x_i = 0$ and is equal to $P_C(a_i y_i)$ when $x_i = 1$. Condition (5–45) requires that the lifting coefficients (α_i, β_i) must be
[Figure 5-1. Lifting function P_C(w) of (5–44): plot of P_C over w ≥ 0, with breakpoints marked at A_1 − µ, A_1, and A_2 − µ.]
[Figure 5-2. Deriving lifting coefficients for Example 5.3: panel (a) shows P_C(a_i x_i y_i) over {0, 1} × [0, 1]; panel (b) shows possible overestimating planes α_i x_i + β_i y_i.]
chosen in such a way that the plane $\alpha_i x_i + \beta_i y_i$ overestimates the function $P_C(a_i x_i y_i)$ over {0, 1} × [0, 1]. Possible overestimating planes are represented in Figure 5-2(b). A similar geometric interpretation was already used in Richard and Tawarmalani [100] to obtain lifted inequalities for mixed-integer bilinear knapsack sets. It follows that an overestimating plane $\alpha_i x_i + \beta_i y_i$ can be obtained by first deriving a concave envelope $p(w)$ of $P_C(w)$ over $[0, a_i]$. This observation motivates the following result.
Lemma 5.1. Assume that $a_i > 0$. Define
$$q_i := \begin{cases} 0 & \text{if } a_i < A_1 - \mu, \\ j & \text{if } A_j - \mu \le a_i \le A_{j+1} - \mu,\ j = 1, \ldots, q-1, \\ q & \text{if } A_q - \mu < a_i. \end{cases}$$
Let $Q^i_0 = 0$, $Q^i_j = A_j - \mu$ for $j = 1, \ldots, q_i$, and $Q^i_{q_i+1} = a_i$. Further, redefine $a_{q_i+1} = a_i - Q^i_{q_i}$. Define $p^i_0(w) = w$ and $p^i_j(w) = P_C(Q^i_j) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}(w - Q^i_j)$ for $j = 1, \ldots, q_i$. Then, the function
$$p(w) := \min\big\{p^i_j(w) \;\big|\; j \in \{0, \ldots, q_i\}\big\} \tag{5–46}$$
is a concave overestimator of $P_C(w)$ over $[0, a_i]$.
Proof. It is easily seen that p(w) is concave since p(w) is defined as the minimum of a finite number of linear functions. If $p(w) = p^i_0(w) = w$, then it is clear that w overestimates $P_C(w)$ since $P_C(w)$ is a continuous piecewise linear function with slopes 0 and 1. Note that the slope of $p^i_j(w)$ is no less than that of $p^i_{j'}(w)$ for $j < j'$ since $a_{j+1} \ge a_{j'+1}$ implies $\frac{a_{j+1} - \mu}{a_{j+1}} \ge \frac{a_{j'+1} - \mu}{a_{j'+1}}$. Therefore, the minimum in (5–46) is attained at $j = l$ if $w \in [Q^i_l, Q^i_{l+1}]$ for $l \in \{1, \ldots, q_i\}$. Further, since $p^i_l(Q^i_l) = P_C(Q^i_l)$, $p^i_l(Q^i_{l+1}) = P_C(Q^i_{l+1})$, and $P_C(w)$ is convex for $w \in [Q^i_l, Q^i_{l+1}]$, we conclude that $p(w) = p^i_l(w) \ge P_C(w)$.
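Lemma 5.1 can be illustrated on the data of Example 5.2. For the variable with $a_i = 21$ one gets $q_i = 2$, breakpoints $Q^i = (0, 10, 20, 21)$, and segment denominators 10 and 1, so that p(w) = min{w, 5 + w/2, 15}; the sketch below checks that p overestimates $P_C$ on [0, 21]:

```python
def PC(w):
    """P_C(w) for Example 5.2 (Proposition 5.16)."""
    if w < 10:
        return w
    if w < 15:
        return 10.0
    if w < 20:
        return w - 5
    return 15.0

# Lemma 5.1 for a_i = 21 (variable i = 1 of M in Example 5.2): q_i = 2,
# breakpoints Q^i = (0, 10, 20, 21), segment denominators a_2 = 10, a_3 = 1.
Q = [0.0, 10.0, 20.0, 21.0]
den = [10.0, 1.0]

pieces = [lambda w: float(w)]            # p_0(w) = w
for j in (1, 2):
    slope = (PC(Q[j + 1]) - PC(Q[j])) / den[j - 1]
    pieces.append(lambda w, c=PC(Q[j]), s=slope, q=Q[j]: c + s * (w - q))

def p(w):
    """Concave overestimator (5-46): pointwise minimum of the linear pieces."""
    return min(f(w) for f in pieces)

for k in range(0, 211):
    w = k / 10.0
    assert p(w) >= PC(w) - 1e-9, w
print("p(w) is an overestimator of P_C(w) on [0, 21]")
```

Note that p agrees with $P_C$ exactly on the slope-1 segments and dominates it on the flat segments, which is what makes the plane choices of Theorem 5.1 tight at two points each.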
Observe that in Lemma 5.1, qi represents the index of the interval to which ai
belongs. Next, we compute the lifting coefficients for the variables (xi, yi) for i ∈ M using
the subadditivity of PC(w) and the result of Lemma 5.1.
Theorem 5.1. Under Assumptions (A1), (A2), and (A3), the lifted bilinear cover inequality
$$\sum_{j\in C} (a_j - \mu)^+ x_j + \sum_{j\in T} a_j y_j + \sum_{i\in M} \alpha_i x_i + \sum_{i\in M} \beta_i y_i \ge \sum_{j\in C} (a_j - \mu)^+ \tag{5–47}$$
is facet-defining for PB if
$$(\alpha_i, \beta_i) \in \big\{(0, a_i)\big\} \cup \big\{(P_C(a_i), 0)\big\} \cup \bigcup_{j=1}^{q_i} \left\{\left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j,\ \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i\right)\right\}$$
for i ∈ M in (5–47), where $Q^i_j$ and $q_i$ are as defined in Lemma 5.1.
Proof. Because $P_C(w)$ is subadditive over $\mathbb{R}_+$, we know that (5–47) is valid for PB if the lifting coefficients (α_i, β_i) of (x_i, y_i) for i ∈ M are chosen to satisfy the condition
$$\alpha_i x_i + \beta_i y_i \ge P_C(a_i x_i y_i) \quad \text{for } (x_i, y_i) \in \{0,1\}\times[0,1] \setminus \{(0,0)\}. \tag{5–48}$$
Condition (5–48) can be rewritten as:
$$\beta_i \varphi \ge P_C(0) \quad \text{for } 0 < \varphi \le 1, \tag{5–49}$$
$$\alpha_i + \beta_i \varphi \ge P_C(a_i \varphi) \quad \text{for } 0 \le \varphi \le 1. \tag{5–50}$$
To prove that (5–47) is facet-defining for PB, we also need to exhibit two points (x_i, y_i) for which (5–48) is satisfied at equality. First, consider the case where $(\alpha_i, \beta_i) = (0, a_i)$. Since $\alpha_i + \beta_i \varphi = \beta_i \varphi = a_i \varphi \ge P_C(a_i \varphi) \ge P_C(0)$, (5–49) and (5–50) are satisfied. Further, we see that (5–48) is satisfied at equality at the two points (1, 0) and $\big(1, \min\{1, \frac{A_1 - \mu}{a_i}\}\big)$ since $P_C(0) = 0$ and $P_C(w) = w$ for $0 \le w \le A_1 - \mu$. Next, consider the case where $(\alpha_i, \beta_i) = (P_C(a_i), 0)$. Condition (5–49) is satisfied since $\beta_i = 0$ and $P_C(0) = 0$. Condition (5–50) also holds because $\alpha_i = P_C(a_i)$ and $P_C(w)$ is nondecreasing for $w \in \mathbb{R}_+$. Further, (5–48) is satisfied at equality at the two points $(0, \varphi)$ for some $0 < \varphi < 1$ and (1, 1).
Finally, consider
$$(\alpha_i, \beta_i) = \left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j,\ \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i\right)$$
for i ∈ M. Clearly, (α_i, β_i) satisfies (5–49) since $\beta_i \ge 0$ and $P_C(0) = 0$. From Lemma 5.1, we have that
$$P_C(a_i \varphi) \le P_C(Q^i_j) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\big(a_i \varphi - Q^i_j\big) = \left(P_C(Q^i_j) - \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, Q^i_j\right) + \frac{P_C(Q^i_{j+1}) - P_C(Q^i_j)}{a_{j+1}}\, a_i \varphi = \alpha_i + \beta_i \varphi,$$
showing that (α_i, β_i) satisfies (5–50). Further, (5–48) is satisfied at equality at the two points $\big(1, \frac{Q^i_j}{a_i}\big)$ and $\big(1, \frac{Q^i_{j+1}}{a_i}\big)$. Therefore, we conclude that (5–47) is facet-defining for PB.
Note that since we typically have several choices for the values of the lifting coefficients (α_i, β_i), the family of inequalities (5–47) contains an exponential number of members. We illustrate this characteristic of lifted bilinear cover inequalities in Example 5.3.
Example 5.3. In Example 5.2, we established that (5–44) is facet-defining for PB(M, ∅, M, ∅) using the result of Proposition 5.15. The lifting function $P_C(w)$ was also obtained in closed form. Applying Theorem 5.1, we obtain the nine inequalities
$$\begin{Bmatrix} 21y_1 \\ 5x_1 + \tfrac{21}{2}y_1 \\ 15x_1 \end{Bmatrix} + \begin{Bmatrix} 19y_2 \\ \tfrac{50}{9}x_2 + \tfrac{76}{9}y_2 \\ 14x_2 \end{Bmatrix} + 17y_3 + 10x_4 + 5x_5 \ge 15,$$
which are all facet-defining for PB. The reason that there are three choices for the lifting coefficients of $(x_1, y_1)$ is illustrated in Figure 5-2(b). The fact that there are three choices for $(x_2, y_2)$ follows similarly since the coefficient $a_2$ falls in the second interval.
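The nine inequalities of Example 5.3 can be checked for validity by enumeration: for each fixed x, minimizing the left-hand side over feasible y ∈ [0, 1]^5 is a continuous covering problem that a greedy rule (fill variables in increasing order of β_j/a_j) solves exactly. A minimal sketch with the coefficients transcribed from Example 5.3:

```python
from itertools import product

a = [21.0, 19.0, 17.0, 15.0, 10.0]   # Example 5.2 data
d, rhs = 20.0, 15.0

# Theorem 5.1 coefficient choices for (alpha_i, beta_i), i in M = {1, 2}:
choices_1 = [(0.0, 21.0), (5.0, 10.5), (15.0, 0.0)]
choices_2 = [(0.0, 19.0), (50.0 / 9, 76.0 / 9), (14.0, 0.0)]
base_alpha = [0.0, 0.0, 0.0, 10.0, 5.0]   # (a_j - mu)^+ for j in C = {4, 5}
base_beta = [0.0, 0.0, 17.0, 0.0, 0.0]    # a_j for j in T = {3}

def min_lhs(alpha, beta, x):
    """Minimize alpha.x + beta.y over y feasible for fixed x (greedy LP);
    returns None when no feasible y exists."""
    val = sum(aj * xj for aj, xj in zip(alpha, x))
    need = d
    for j in sorted((j for j in range(5) if x[j]), key=lambda j: beta[j] / a[j]):
        if need <= 1e-9:
            break
        y = min(1.0, need / a[j])
        val += beta[j] * y
        need -= a[j] * y
    return val if need <= 1e-9 else None

for (a1, b1), (a2, b2) in product(choices_1, choices_2):
    alpha = [a1, a2] + base_alpha[2:]
    beta = [b1, b2] + base_beta[2:]
    for x in product((0, 1), repeat=5):
        m = min_lhs(alpha, beta, x)
        assert m is None or m >= rhs - 1e-6, (alpha, beta, x)
print("all nine lifted bilinear cover inequalities are valid over B")
```

Validity over all vertices of B follows because, for each fixed x, the inner minimization over y is a linear program whose greedy solution the code computes exactly.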
5.3.2.2 Lifted reverse bilinear cover inequalities
In Theorem 5.1, we derived lifted bilinear cover inequalities by first lifting the
variables (xj, yj) for j ∈ C \ {l} and then lifting the remaining variables (xj, yj) for j ∈ M .
Here, we derive another family of lifted inequalities that we call lifted reverse bilinear cover
inequalities by changing the lifting order: we start the lifting procedure with the same
seed inequality (5–35), but we now lift the variables (xj, yj) for j ∈ M before the variables
(xj, yj) for j ∈ C \ {l}. In this case, we can relax some assumptions on the partition
(C,M, T ). In particular, we replace (A2) with
(A2’) al > µ for some l ∈ C.
Proposition 5.18. Suppose that Assumptions (A1), (A2’), and (A3) hold. Then,
(al − µ)xl + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj ≥ al − µ    (5–51)

is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}).
Proof. It follows from Proposition 5.9 that

(al − µ)xl + ∑_{j∈T} ajyj ≥ al − µ
is facet-defining for PB(M,C \ {l},M,C \ {l}). Its lifting function P (w) is derived in
Proposition 5.14 where it is also proven to be subadditive over R+. Therefore, lifting
coefficients (αi, βi) for (xi, yi) for i ∈ M are valid if they satisfy the condition:
αixi + βiyi ≥ P (aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {0, 0}. (5–52)
Condition (5–52) can be rewritten as:
βiφ ≥ P (0) for 0 < φ ≤ 1, (5–53)
αi + βiφ ≥ P (aiφ) for 0 ≤ φ ≤ 1. (5–54)
We now show that (αi, βi) = (min{ai, al − µ}, 0) are valid lifting coefficients. Clearly, βi = 0 satisfies (5–53) since P(0) = 0. Further, since P(aiφ) = min{aiφ, al − µ}, it is also clear that αi = min{ai, al − µ} ≥ min{aiφ, al − µ} = P(aiφ). To show that (5–51) is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}), it suffices to verify that the two points (1, 0) and (1, 1) satisfy (5–52) at equality.
We emphasize that the above proof requires that Assumption (A2') holds for l ∈ argmin{aj | j ∈ C}, and not for l ∈ argmax{aj | j ∈ C} as Assumption (A2) does in Proposition 5.15. We also mention that the lifting coefficients (αi, βi) = (0, ai) for i ∈ M are valid for (5–52). These coefficients yield facet-defining inequalities for PB(∅, C \ {l}, ∅, C \ {l}) because (5–52) is satisfied at equality for (1, 0) and (1, min{1, (al − µ)/ai}). However, these variables could have been treated directly as elements of T in (5–35) since adding more elements to T does not violate Assumption (A3).
To obtain facet-defining inequalities for PB, we lift the remaining variables (xj, yj) for
j ∈ C \ {l} in (5–51). To this end, we first compute the function
PM(w) := min { (al − µ)xl + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj } − (al − µ)
        s.t. alxlyl + ∑_{j∈M∪T} ajxjyj ≥ al − µ + w    (5–55)
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ {l} ∪ M ∪ T.
It is easily verified that the lifting function PM(w) corresponding to (5–51) satisfies PM(w) = −PM(−w). Let M = M1 ∪ M2 where M1 = {i ∈ M | ai > al − µ} and M2 = M \ M1. Assume without loss of generality that {l} ∪ M1 = {1, . . . , q} and a1 ≥ a2 ≥ . . . ≥ aq where q = |M1| + 1. Further, define A0 = 0 and Ai = ∑_{j=1}^{i} aj for all i = 1, . . . , q. Observe that al + ∑_{j∈M∪T} aj = Aq + ∑_{j∈M2} aj + ∑_{j∈T} aj. We derive a closed-form expression for PM(w) in the following proposition.
closed-form expression for PM(w) in the following proposition.
Proposition 5.19. For all w ∈ R,

PM(w) =
  −al + µ               if w < −al + µ,
  w − Ai + i(al − µ)    if Ai − al + µ ≤ w < Ai, i = 0, . . . , q − 1,
  i(al − µ)             if Ai ≤ w < Ai+1 − al + µ, i = 0, . . . , q − 1,
  w − Aq + q(al − µ)    if Aq − al + µ ≤ w ≤ ∑_{j∈N} aj − d,
  ∞                     if ∑_{j∈N} aj − d < w.
Proof. First, we observe that there exists an optimal solution (x∗, y∗) to (5–55) in which
x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ M ∪ {l} since the objective coefficients corresponding
to these variables are zero. Using the same notation ā = ∑_{j∈T} aj and ȳ = (∑_{j∈T} ajyj)/ā as in the proof of Proposition 5.14, we simplify the expression of PM(w) as:
PM(w) = min { ∑_{j∈{l}∪M1} (al − µ)xj + ∑_{j∈M2} ajxj + āȳ } − (al − µ)
        s.t. ∑_{j∈{l}∪M1} ajxj + ∑_{j∈M2} ajxj + āȳ ≥ al − µ + w    (5–56)
             xj ∈ {0, 1} ∀j ∈ {l} ∪ M1 ∪ M2, ȳ ∈ [0, 1].
Further, after introducing â = ∑_{j∈M2} aj + ā and ŷ = (∑_{j∈M2} ajxj + āȳ)/â, we claim that PM(w) can be written as:

PM(w) = min { ∑_{j∈{l}∪M1} (al − µ)xj + âŷ } − (al − µ)
        s.t. ∑_{j∈{l}∪M1} ajxj + âŷ ≥ al − µ + w    (5–57)
             xj ∈ {0, 1} ∀j ∈ {l} ∪ M1, ŷ ∈ [0, 1].
We next show that (5–56) and (5–57) are equivalent. To do so, we show that (5–56) has a feasible solution (x∗_{M1}, x∗_{M2}, ȳ∗) with objective value ζ∗ if and only if (5–57) has a feasible solution (x∗_{M1}, ŷ∗) with objective value ζ∗. On the one hand, given (x∗_{M1}, x∗_{M2}, ȳ∗), we can obtain (x∗_{M1}, ŷ∗) directly from the definition of ŷ. The objective values of these two points are identical. On the other hand, observe that ā = ∑_{j∈T} aj > al − µ ≥ ai for all i ∈ M2 because of Assumption (A3) and the definition of M2. Let M2 = {ã1, . . . , ãr} and assume without loss of generality that ã1 ≥ . . . ≥ ãr. Further, define Ã0 = 0 and Ãi = ∑_{j=1}^{i} ãj for i = 1, . . . , r. Then, for a given (x∗_{M1}, ŷ∗), we obtain (x∗_{M1}, x∗_{M2}, ȳ∗) as follows. For m = max{i ∈ {0, . . . , r} | Ãi ≤ âŷ∗}, set x∗_j = 1 for j ≤ m, x∗_j = 0 otherwise, and ȳ∗ = (âŷ∗ − Ãm)/ā. This solution is feasible since 0 ≤ (âŷ∗ − Ãm)/ā ≤ 1 and

∑_{j∈{l}∪M1} ajx∗_j + ∑_{j∈M2} ãjx∗_j + āȳ∗ = ∑_{j∈{l}∪M1} ajx∗_j + Ãm + âŷ∗ − Ãm = ∑_{j∈{l}∪M1} ajx∗_j + âŷ∗.

It also has the same objective value as (x∗_{M1}, ŷ∗).
Next, we compute an optimal solution for (5–57). We claim that there exists an optimal solution (x∗, ŷ∗) to (5–57) in which x∗_1 ≥ x∗_2 ≥ . . . ≥ x∗_q. For x∗ ≠ 0, let t = max{j ∈ {1, . . . , q} | x∗_j = 1}. Assume that we are given an optimal solution (x∗, ŷ∗) for which x∗_i < x∗_t and i < t for some i, t ∈ {1, . . . , q}. In this case, we can find another solution (x̂, ŷ∗) with objective value at least as good as that of (x∗, ŷ∗) as follows. We define x̂ as x̂_k = x∗_k if k ≠ i and k ≠ t, x̂_i = x∗_t, and x̂_t = x∗_i. The solution (x̂, ŷ∗) is feasible and has the same objective value since ai ≥ at.
We now obtain a closed-form for PM(w) by solving (5–57). Consider the case where w < −al + µ. In this case, the optimal solution is x∗_j = 0 for j ∈ {l} ∪ M1 and ŷ∗ = 0, which implies that PM(w) = −al + µ. When −al + µ ≤ w < 0, the optimal solution is x∗_j = 0 for j ∈ {l} ∪ M1 and ŷ∗ = (al − µ + w)/â, which implies that PM(w) = w. When 0 ≤ w < A1 − al + µ, the optimal solution is x∗_1 = 1, x∗_j = 0 for j ∈ {l} ∪ M1, j ≠ 1, and ŷ∗ = 0, which implies that PM(w) = 0. If w > ∑_{j∈N\C} aj + µ = ∑_{j∈N} aj − d, then (5–57) is infeasible, which implies that PM(w) = ∞. If 0 ≤ w < Aq − al + µ, the following feasible solution:

x∗_j = 1 for j = 1, . . . , m,   x∗_{m+1} = 0 if w < Am and x∗_{m+1} = 1 if w ≥ Am,   x∗_j = 0 for j = m + 2, . . . , q,

and ŷ∗ = (w − Am + al − µ)/â if w < Am and ŷ∗ = 0 if w ≥ Am,

is optimal, where m = max{i ∈ {0, . . . , q − 1} | Ai − al + µ ≤ w}. Note that â + Aq − al + µ = ∑_{j∈N\C} aj + µ. If Aq − al + µ ≤ w ≤ ∑_{j∈N} aj − d, then the optimal solution is x∗_j = 1 for all j ∈ {l} ∪ M1 and ŷ∗ = (w − Aq + al − µ)/â, which implies that PM(w) = w − Aq + q(al − µ). Using these optimal solutions, we obtain the desired form for PM(w).
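The closed form of Proposition 5.19 can be sanity-checked numerically. The sketch below evaluates it on a small hypothetical instance (the coefficients of {l} ∪ M1, al, µ, and the aggregated capacity â are invented for illustration) and compares it against a brute-force solution of (5–57) obtained by enumerating x and setting ŷ greedily:

```python
# Hypothetical instance (illustration only): coefficients of {l} ∪ M1
# sorted decreasingly, the cover excess mu, and the aggregate â over M2 ∪ T.
coeffs = [15.0, 10.0]   # a_1 >= ... >= a_q, so q = 2
al, mu = 15.0, 12.0     # requires a_l > mu (Assumption (A2'))
a_hat = 40.0            # â = sum of coefficients over M2 ∪ T
lam = al - mu           # cost a_l − µ of each binary variable

q = len(coeffs)
A = [sum(coeffs[:i]) for i in range(q + 1)]   # A_0 = 0, A_i = a_1 + ... + a_i

def pm_closed(w):
    """PM(w) as given by the closed form of Proposition 5.19 (for w >= 0)."""
    if w > A[q] + a_hat - lam:                # beyond ∑_{j∈N} a_j − d: infeasible
        return float("inf")
    for i in range(q):
        if A[i] - lam <= w < A[i]:
            return w - A[i] + i * lam
        if A[i] <= w < A[i + 1] - lam:
            return i * lam
    return w - A[q] + q * lam                 # A_q − a_l + µ <= w

def pm_bruteforce(w):
    """Solve (5-57) by enumerating x in {0,1}^q; ŷ is then set greedily."""
    best = float("inf")
    for mask in range(2 ** q):
        x = [(mask >> j) & 1 for j in range(q)]
        slack = lam + w - sum(c * xi for c, xi in zip(coeffs, x))
        y = max(0.0, slack) / a_hat           # smallest feasible ŷ
        if y <= 1.0:
            best = min(best, lam * sum(x) + a_hat * y - lam)
    return best

grid = [k / 4.0 for k in range(0, 4 * int(A[q] + a_hat - lam) + 1)]
assert all(abs(pm_closed(w) - pm_bruteforce(w)) < 1e-9 for w in grid)
print("closed form matches brute force")
```

The greedy choice of ŷ is justified because, once x is fixed, (5–57) is a one-dimensional linear program in ŷ.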
To apply Proposition 5.13, we must verify that the function PM(w) is subadditive
over R−. Since PM(w) = −PM(−w), this corresponds to verifying that PM(w) is
superadditive over R+.
Proposition 5.20. The lifting function PM(w) is superadditive over [0, ∑_{j∈N} aj − d].

Proof. We show that PM(w) is superadditive using the result of Proposition 27 in Richard and Tawarmalani [100]. Let λ = al − µ, r = q, Ci = Ai for i = 1, . . . , q, and P(w) = PM(w)/(al − µ). It is clear that Ai + Aj ≥ Ai+j for 0 ≤ i ≤ j ≤ q with i + j ≤ q since Ai is the sum of the i largest coefficients in M1 ∪ {l}. Therefore, we conclude that PM(w) is superadditive over [0, ∑_{j∈N} aj − d].
We next illustrate the results of Propositions 5.18, 5.19, and 5.20 on an example.
Example 5.4. For the set B of Example 5.2, consider the partition (C, M, T) = ({3, 4}, {5}, {1, 2}). This partition satisfies Assumptions (A1), (A2') and (A3) for l = 4 ∈ C since C is a cover with µ = 12 such that a4 > µ and ∑_{j∈C∪T} aj = 21 + 19 + 17 + 15 > 20 + 15 = d + al. We obtain from Proposition 5.18 that
3x4 + 3x5 + 21y1 + 19y2 ≥ 3 (5–58)
is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}). Further, the lifting function PM(w) is given by

PM(w) =
  0        if 0 ≤ w < 12,
  w − 12   if 12 ≤ w < 15,
  3        if 15 ≤ w < 22,
  w − 19   if 22 ≤ w ≤ 62,

as described in Proposition 5.19.
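As a quick numerical check of Proposition 5.20 on this example, the sketch below encodes the closed form above and verifies superadditivity on a grid of [0, 62] (the data are those of Example 5.4 as reconstructed here):

```python
def pm(w):
    """Closed-form lifting function PM(w) of Example 5.4, for 0 <= w <= 62."""
    if w < 12:
        return 0.0
    if w < 15:
        return w - 12.0
    if w < 22:
        return 3.0
    return w - 19.0

# Superadditivity (Proposition 5.20): PM(u) + PM(v) <= PM(u + v)
# whenever u, v and u + v all lie in [0, 62].
grid = [k / 2.0 for k in range(0, 125)]  # 0, 0.5, ..., 62
ok = all(pm(u) + pm(v) <= pm(u + v) + 1e-9
         for u in grid for v in grid if u + v <= 62)
print(ok)  # -> True
```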
Similar to Theorem 5.1, we compute the lifting coefficients for the variables (xi, yi) for
i ∈ C \ {l} using Proposition 5.13.
Theorem 5.2. Suppose that Assumptions (A1), (A2’), and (A3) hold. Then, the lifted
reverse bilinear cover inequality
(al − µ)xl + ∑_{i∈C\{l}} PM(ai)xi + ∑_{j∈M} min{aj, al − µ}xj + ∑_{j∈T} ajyj ≥ (al − µ) + ∑_{i∈C\{l}} PM(ai)    (5–59)
is facet-defining for PB.
Proof. Since PM(w) is superadditive over [0, ∑_{j∈N} aj − d], the lifting coefficients
(αi, βi) of the variables (xi, yi) for i ∈ C \ {l} are valid if they satisfy the condition:
αi(1− xi) + βi(1− yi) ≤ PM(ai − aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {1, 1}. (5–60)
The above condition can be rewritten as:
βi ≤ inf_{0≤φ<1} PM(ai − aiφ)/(1 − φ),    (5–61)

αi + sup_{0≤φ≤1} βi(1 − φ) ≤ PM(ai).    (5–62)
Because of Assumption 5.2 and C ⊆ N, we know that ai ≤ ∑_{j∈N} aj − d for all i ∈ C. Observe that

inf_{0≤φ<1} PM(ai − aiφ)/(1 − φ) = 0
since PM(ε) = 0 for sufficiently small ε > 0. Therefore, choosing βi = 0 satisfies (5–61).
Moreover, as βi = 0, it is easily verified that choosing αi = PM(ai) satisfies (5–62). Finally, note that (5–60) is tight for the two points (0, 0) and (1, (ai − A1 + al − µ)+/ai), which proves that (5–59) is facet-defining for PB.
Note that a lifted reverse bilinear cover inequality (5–59) does not yield an exponential number of facet-defining inequalities; we illustrate the reason in Example 5.5. This is a significant difference from the lifted bilinear cover inequalities (5–47). Moreover, we observe that some of the inequalities (5–59) can also be obtained as lifted bilinear cover inequalities (5–47). However, there exist inequalities (5–59) that cannot be obtained as lifted bilinear cover inequalities (5–47). We illustrate this difference in the following example.
Example 5.5. For the partition (C,M, T ) = ({3, 4}, {5}, {1, 2}), we established in
Example 5.4 that (5–58) is facet-defining for PB(∅, C \ {l}, ∅, C \ {l}). Further, the lifting function PM(w) was obtained in closed-form and is depicted in Figure 5-3(a). Applying
Theorem 5.2, we obtain the following lifted reverse bilinear cover inequality
3x3 + 3x4 + 3x5 + 21y1 + 19y2 ≥ 6
which is facet-defining for PB. We observe in Figure 5-3(b) that this is the only choice of coefficients that yields the plane underestimating PM(ai − aixiyi) over (xi, yi) ∈ {0, 1} × [0, 1] \ {1, 1}. Further, this inequality cannot be obtained as a lifted bilinear cover inequality (5–47). This is because the coefficients of the binary variables xi are equal, while this could only happen in lifted bilinear cover inequalities (5–47) when (aj − µ)+ = 0 for j ∈ C.
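A direct numerical check of the inequality above is straightforward. The sketch below assumes the data of Example 5.2, a = (21, 19, 17, 15, 10) and d = 20, samples feasible points of B, and asserts that every sample satisfies the lifted reverse bilinear cover inequality:

```python
import random

a = [21, 19, 17, 15, 10]   # assumed data of Example 5.2
d = 20

def lifted_reverse_lhs(x, y):
    """LHS of the lifted reverse bilinear cover inequality of Example 5.5."""
    return 3 * x[2] + 3 * x[3] + 3 * x[4] + 21 * y[0] + 19 * y[1]

random.seed(1)
checked = 0
while checked < 5000:
    x = [random.randint(0, 1) for _ in range(5)]
    y = [random.random() for _ in range(5)]
    if sum(a[j] * x[j] * y[j] for j in range(5)) >= d:   # (x, y) is a point of B
        assert lifted_reverse_lhs(x, y) >= 6 - 1e-9
        checked += 1
print("3x3 + 3x4 + 3x5 + 21y1 + 19y2 >= 6 held on", checked, "sampled points of B")
```

The bound is tight, e.g. at x = (1, 0, 1, 0, 0) with y1 = 1/7, y3 = 1, where the cover constraint holds at equality.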
Figure 5-3. Deriving lifting coefficients for Example 5.5
5.3.3 Inequalities through Approximate Lifting
We now derive another family of lifted inequalities from the seed inequality developed
in Proposition 5.11. To this end, we first identify a cover C ⊆ N that satisfies the
condition of Proposition 5.11. In particular, we assume that
(C1) ∑_{j∈C} aj − ak ≥ d for all k ∈ C, i.e., al = max_{j∈C} aj ≤ µ,

(C2) ∑_{j∈C} aj − ak − am < d for any k, m ∈ C, i.e., ak + am > µ for all k, m ∈ C.

In the following discussions, we consider covers C with |C| ≥ 2 since we require Assumptions (C1) and (C2) to be satisfied. When fixing the variables (xi, yi) for i ∈ N \ C to (0, 0), it follows from Proposition 5.11 that

∑_{j∈C} xj ≥ |C| − 1    (5–63)

is facet-defining for PB(N \ C, ∅, N \ C, ∅). We now lift the remaining variables (xi, yi) for i ∈ N \ C. The lifting function
corresponding to (5–63) is defined as
Φ(w) := max (|C| − 1) − ∑_{j∈C} xj
        s.t. ∑_{j∈C} ajxjyj ≥ d − w    (5–64)
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ C.
We assume without loss of generality that C = {1, . . . , r} and that a1 ≤ a2 ≤ . . . ≤ ar. We define µ̄ = a1 + a2 − µ, where µ is the excess of the cover C, B0 = 0, and Bi = ∑_{j=1}^{i} aj+2 for i = 1, . . . , r − 2. Observe that µ̄ > 0 and a1 + a2 + Br−2 = d + µ.
Proposition 5.21. For w ≥ 0,

Φ(w) =
  0       if 0 ≤ w < µ̄,
  i + 1   if Bi + µ̄ ≤ w < Bi+1 + µ̄, i = 0, . . . , r − 3,
  r − 1   if Br−2 + µ̄ ≤ w.
Proof. There exists an optimal solution (x∗, y∗) to (5–64) in which y∗j = 1 for j ∈ C since
all the objective coefficients corresponding to these variables are zero. Hence, Φ(w) can be
rewritten as
Φ(w) = max (|C| − 1) − ∑_{j∈C} xj
        s.t. ∑_{j∈C} ajxj ≥ d − w    (5–65)
             xj ∈ {0, 1} ∀j ∈ C.
Further, we claim that there exists an optimal solution x∗ to (5–65) in which

x∗_1 ≤ x∗_2 ≤ . . . ≤ x∗_r.    (5–66)

This is because, given any optimal solution x∗ to (5–65) with x∗_i > x∗_j for i < j, the solution x̂ defined as x̂_k = x∗_k if k ≠ i and k ≠ j, x̂_i = x∗_j, and x̂_j = x∗_i, is feasible and has the same objective value. Now, we compute Φ(w) by solving (5–65). If 0 ≤ w < µ̄, it is clear that x∗_1 = 0 and x∗_j = 1 for j = 2, . . . , r is optimal. For the remaining cases, it follows from (5–66) that

x∗_j = 0 if j = 1, . . . , t + 2,  and  x∗_j = 1 if j = t + 3, . . . , r,

where t := max{i ∈ {0, . . . , r − 2} | Bi + µ̄ ≤ w}, is an optimal solution to (5–65), which shows the result.
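The closed form of Proposition 5.21 is easy to confirm by enumeration. The sketch below uses the cover of Example 5.6 as assumed data (C with coefficients {10, 15, 17} and d = 20, so that µ̄ = 3 and B1 = 17) and compares the closed form against a brute-force solution of (5–65):

```python
from itertools import product

# Assumed cover data of Example 5.6, sorted increasingly.
a = [10, 15, 17]             # a1 <= a2 <= a3, so r = 3
d = 20
mu = sum(a) - d              # excess of the cover: 22
mu_bar = a[0] + a[1] - mu    # 3
B = [0, a[2]]                # B0 = 0 and B1 = a3 = 17 (= B_{r-2})

def phi_closed(w):
    """Φ(w) from the closed form of Proposition 5.21."""
    if w < mu_bar:
        return 0
    for i in range(len(B) - 1):
        if B[i] + mu_bar <= w < B[i + 1] + mu_bar:
            return i + 1
    return len(a) - 1        # r − 1 once w >= B_{r-2} + mu_bar

def phi_bruteforce(w):
    """Φ(w) by enumerating x in {0,1}^r in (5-65)."""
    best = -1
    for x in product([0, 1], repeat=len(a)):
        if sum(ai * xi for ai, xi in zip(a, x)) >= d - w:
            best = max(best, (len(a) - 1) - sum(x))
    return best

assert all(phi_closed(w / 2.0) == phi_bruteforce(w / 2.0) for w in range(0, 61))
print("closed form of Proposition 5.21 verified on the Example 5.6 cover")
```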
We now perform sequential lifting of the pair of variables (xi, yi) for i ∈ N \ C. We
assume that all variables (xj, yj) for j ∈ N \C where j < i have already been lifted. Lifting
coefficients will be derived from the lifting functions
Φi(w) = max (|C| − 1) − ∑_{j∈C} xj − ∑_{1≤j<i, j∈N\C} (αjxj + βjyj)
        s.t. ∑_{j∈C} ajxjyj + ∑_{1≤j<i, j∈N\C} ajxjyj ≥ d − w
             xj ∈ {0, 1}, yj ∈ [0, 1] ∀j ∈ N.
In Section 5.3.2.1, we were able to show that lifting functions were subadditive, a
property that leads to sequence-independent lifting. Unfortunately, the lifting function
Φ(w) is not subadditive. To handle such situations, Gu et al. [62] proposed to use
approximate lifting. Following their approach, we say that Ψ(w) is a valid subadditive approximation of Φ(w) if Ψ(w) ≥ Φ(w) for all w ∈ R and Ψ(w) is subadditive.
We say that a valid subadditive approximation Ψ(w) is nondominated if there is no
other valid subadditive approximation Ψ′(w) with Ψ′(w) ≤ Ψ(w) for all w ∈ R
and Ψ′(w) < Ψ(w) for some w ∈ R. We also define the notion of maximal set
E = {w ∈ R+ | Φi(w) = Φ(w) ∀i ∈ N \ C, and for all lifting orders}. A valid
subadditive approximation Ψ(w) of Φ(w) is called maximal if Ψ(w) = Φ(w) for all w ∈ E.
It is clear that a maximal nondominated approximation of Φ leads to strong inequalities
that can be obtained efficiently for PB. To construct such an approximation of Φ(w), we
will use the following proposition.
Proposition 5.22. Let λ and Ci for i = 0, 1, . . . , s be nonnegative real numbers. Assume that λ > 0, C0 = 0 and Ci−1 + λ ≤ Ci for i = 1, . . . , s. Then the function

h(w) =
  i + (w − Ci)/λ   if Ci ≤ w < Ci + λ, i = 0, . . . , s,
  i                if Ci−1 + λ ≤ w < Ci, i = 1, . . . , s,
  s + 1            if Cs + λ ≤ w
is subadditive over R+ if and only if Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i+ j ≤ s.
Proof. Assume that Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i + j ≤ s. We want to prove
that h(x) + h(y) ≥ h(x + y) for x, y ∈ R+. Assume for a contradiction that there exists
x, y ∈ R+ such that h(x) + h(y) < h(x + y). We claim first that there exists x′, y′ ∈ R+
with x′ = Ci for some i ∈ {0, . . . , s} such that h(x′) + h(y′) < h(x′ + y′). Consider the
following three cases:
1. If Ci ≤ x ≤ Ci + λ for i ∈ {0, . . . , s}, then let x′ = Ci and y′ = y. Clearly, h(x′) = h(x) + (Ci − x)/λ and h(y′) = h(y). Further, h(x′ + y′) = h(x + y + Ci − x) ≥ h(x + y) + (Ci − x)/λ since Ci ≤ x and the function h has slope 0 or 1/λ. Therefore, we have that h(x′) + h(y′) = h(x) + h(y) + (Ci − x)/λ < h(x + y) + (Ci − x)/λ ≤ h(x′ + y′).

2. If Ci−1 + λ ≤ x ≤ Ci for i ∈ {1, . . . , s}, then let x′ = Ci and y′ = y + x − Ci. Clearly, h(x′) = h(x) and h(y′) ≤ h(y) since x ≤ Ci and h is nondecreasing. Therefore, we have that h(x′) + h(y′) ≤ h(x) + h(y) < h(x + y) = h(x′ + y′).

3. If Cs + λ ≤ x, then let x′ = Cs + λ and y′ = y. Clearly, h(x′) = h(x) and h(y′) = h(y). Further, h(x′ + y′) = h(Cs + λ + y) = h(x + y) since y ≥ 0 and x + y ≥ Cs + λ. Therefore, we have that h(x′) + h(y′) = h(x) + h(y) < h(x + y) = h(x′ + y′).
We claim next that there exist x, y ∈ R+ with x = Ci and y = Cj for some i, j ∈ {0, . . . , s} such that h(x) + h(y) < h(x + y). Consider the following three cases:

1. If Cj ≤ y′ ≤ Cj + λ for j ∈ {0, . . . , s}, then let x = x′ and y = Cj. Clearly, h(x) = h(x′) and h(y) = h(y′) + (Cj − y′)/λ. Further, h(x + y) = h(x′ + y′ + Cj − y′) ≥ h(x′ + y′) + (Cj − y′)/λ since Cj ≤ y′ and the function h has slope 0 or 1/λ. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) + (Cj − y′)/λ < h(x′ + y′) + (Cj − y′)/λ ≤ h(x + y).

2. If Cj−1 + λ ≤ y′ ≤ Cj for j ∈ {1, . . . , s}, then let x = x′ and y = Cj. Clearly, h(x) = h(x′) and h(y) = h(y′). Further, h(x + y) ≥ h(x′ + y′) since y′ ≤ Cj and h is nondecreasing. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) < h(x′ + y′) ≤ h(x + y).

3. If Cs + λ ≤ y′, then let x = x′ and y = Cs + λ. Clearly, h(x) = h(x′) and h(y) = h(y′). Further, h(x + y) = h(x′ + Cs + λ) = h(x′ + y′) since x′ ≥ 0 and x′ + y′ ≥ Cs + λ. Therefore, we have that h(x) + h(y) = h(x′) + h(y′) < h(x′ + y′) = h(x + y).
We conclude that there exists i, j ∈ {0, . . . , s} such that h(Ci) + h(Cj) < h(Ci + Cj). Since
h(Ci) = i and h(Cj) = j, we have that i+ j < h(Ci+Cj). Since i+ j < h(Ci+Cj) and h is
nondecreasing, we conclude that Ci+j < Ci + Cj, which is a contradiction to the hypothesis
Ci + Cj ≤ Ci+j.
To prove the reverse implication, assume now that h is subadditive. We want to
prove that Ci + Cj ≤ Ci+j for 0 ≤ i ≤ j ≤ s with i + j ≤ s. As shown before, we can
take i and j such that h(Ci) = i and h(Cj) = j. Since h is subadditive, it follows that
i + j = h(Ci) + h(Cj) ≥ h(Ci + Cj). Since i + j ≤ s, we conclude that Ci+j ≥ Ci + Cj,
which proves the result.
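The characterization of Proposition 5.22 can be illustrated numerically. The sketch below implements h and tests both directions of the equivalence on two invented breakpoint sequences, one satisfying Ci + Cj ≤ Ci+j and one violating it:

```python
def make_h(C, lam):
    """Piecewise-linear function h of Proposition 5.22 (C = [C0, ..., Cs], C0 = 0)."""
    s = len(C) - 1
    def h(w):
        if w >= C[s] + lam:
            return s + 1.0
        for i in range(s + 1):
            if C[i] <= w < C[i] + lam:
                return i + (w - C[i]) / lam
        for i in range(1, s + 1):
            if C[i - 1] + lam <= w < C[i]:
                return float(i)
        raise ValueError(w)
    return h

def subadditive_on_grid(h, hi, step=0.25):
    """Check h(u) + h(v) >= h(u + v) on a finite grid of [0, hi]."""
    pts = [k * step for k in range(int(hi / step) + 1)]
    return all(h(u) + h(v) >= h(u + v) - 1e-9
               for u in pts for v in pts if u + v <= hi)

# C1 + C1 = 6 <= C2 = 7: h should be subadditive.
good = make_h([0.0, 3.0, 7.0], lam=2.0)
# C1 + C1 = 8 > C2 = 7: the condition fails, so h should NOT be subadditive.
bad = make_h([0.0, 4.0, 7.0], lam=2.0)

assert subadditive_on_grid(good, hi=12.0)
assert not subadditive_on_grid(bad, hi=12.0)   # violated at u = v = C1 = 4
print("subadditivity characterization of Proposition 5.22 confirmed numerically")
```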
We next describe a strong valid subadditive approximation of Φ(w). The proof
technique is similar to that used in Gu et al. [62].
Theorem 5.3. The function

Ψ(w) :=
  i + (w − Bi)/µ̄   if Bi ≤ w < Bi + µ̄, i = 0, . . . , r − 2,
  i                 if Bi−1 + µ̄ ≤ w < Bi, i = 1, . . . , r − 2,
  r − 1             if Br−2 + µ̄ ≤ w,

is a valid subadditive approximation of Φ(w) that is nondominated and maximal over R+.
Proof. Note that Ψ(w) = Φ(w) when w ∈ [Bi−1 + µ̄, Bi] for some i ∈ {1, . . . , r − 2} and when w ≥ Br−2 + µ̄. Further,

Ψ(w) = Φ(w) + (w − Bi)/µ̄ ≥ Φ(w)

when w ∈ (Bi, Bi + µ̄). Next, we show that Ψ(w) is subadditive over R+. In Proposition 5.22, let s = r − 2, Ci = Bi and λ = µ̄. Since Bi is the sum of the i smallest coefficients of the cover C other than a1 and a2, it is clear that Bi + Bj ≤ Bi+j for 0 ≤ i ≤ j ≤ r − 2 with i + j ≤ r − 2. Therefore, Ψ(w) is subadditive over R+. Second, we show that Ψ(w) is nondominated. Assume for a contradiction that there is another valid subadditive approximation Ψ′(w) such that Φ(w) ≤ Ψ′(w) ≤ Ψ(w) for all w ≥ 0 and for which there exists w′ ≥ 0 with Ψ′(w′) < Ψ(w′). Then, it must be that w′ ∈ (Bi, Bi + µ̄) for some i ∈ {0, . . . , r − 2}. Let w′′ = Bi + µ̄ − w′. Since 0 < w′′ < µ̄, we have that 0 < Ψ(w′′) < 1. Further,

Ψ(w′) + Ψ(w′′) = i + (w′ − Bi)/µ̄ + w′′/µ̄ = i + 1 = Φ(Bi + µ̄) = Ψ′(Bi + µ̄) = Ψ′(w′ + w′′).    (5–67)

Hence, we obtain that

Ψ(w′′) = Ψ′(w′ + w′′) − Ψ(w′) ≤ Ψ′(w′) + Ψ′(w′′) − Ψ(w′) < Ψ′(w′′),
where the first equality holds because of (5–67), the first inequality holds because Ψ′
is subadditive and the second inequality holds because Ψ′(w′) < Ψ(w′). This is a
contradiction to the assumption that Ψ′(w) ≤ Ψ(w) for w ∈ R+. Finally, we prove that Ψ(w) is maximal. Assume without loss of generality that N \ C = {r + 1, . . . , n} and that (xr+1, yr+1) is lifted first. Clearly, Φr+1(w) = Φ(w) for w ∈ R+. Assume that Ψ(w) > Φ(w) for some w ≥ 0. Then, it suffices to show that there exists a coefficient ar+1 for which Φr+2(w) > Φ(w). We first claim that if Ψ(w) > Φ(w) for some w ≥ 0, then there exists w′ ≥ 0 such that Φ(w) + Φ(w′) < Φ(w + w′). Since Ψ(w) > Φ(w) for some w ≥ 0, it is clear that w ∈ (Bi, Bi + µ̄) for some i and Φ(w) = i. Let w′ = Bi + µ̄ − w. Since 0 < w′ < µ̄, Φ(w′) = 0 and Φ(w + w′) = i + 1. Hence, Φ(w) + Φ(w′) = i < i + 1 = Φ(w + w′). Next, assume that (xr+1, yr+1) is lifted first and that its coefficient is ar+1 = w′. Then, Φr+2(w) = max{Φ(w + w′) − Φ(w′), Φ(w)} = Φ(w + w′) − Φ(w′) > Φ(w), which shows the result.
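Both properties established in the proof, domination of Φ and subadditivity of Ψ, can be checked numerically. The sketch below assumes the cover data of Example 5.6 ({10, 15, 17} with d = 20, so µ̄ = 3 and B1 = 17):

```python
from itertools import product

a, d = [10, 15, 17], 20                 # assumed cover of Example 5.6, sorted increasingly
r = len(a)
mu_bar = a[0] + a[1] - (sum(a) - d)     # = 3
B = [0, 17]                             # B0 and B1 (= B_{r-2})

def psi(w):
    """Valid subadditive approximation of Theorem 5.3 for this cover."""
    for i in range(r - 1):
        if B[i] <= w < B[i] + mu_bar:
            return i + (w - B[i]) / mu_bar
        if i >= 1 and B[i - 1] + mu_bar <= w < B[i]:
            return float(i)
    return float(r - 1)

def phi(w):
    """Exact lifting function Φ(w) via enumeration of (5-65)."""
    return max((r - 1) - sum(x) for x in product([0, 1], repeat=r)
               if sum(ai * xi for ai, xi in zip(a, x)) >= d - w)

grid = [k / 4.0 for k in range(0, 161)]                      # 0, 0.25, ..., 40
assert all(psi(w) >= phi(w) - 1e-9 for w in grid)            # Ψ over-approximates Φ
assert all(psi(u) + psi(v) >= psi(u + v) - 1e-9
           for u in grid for v in grid)                      # Ψ is subadditive
print("Ψ dominates Φ and is subadditive on the test grid")
```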
In Figure 5-4, we present the valid subadditive approximation Ψ(w) of Φ(w) obtained using Theorem 5.3 for inequality (5–74) discussed in Example 5.6. Observe that, for 0 < w ≤ µ̄, the approximation is exact only when w = µ̄, i.e., Ψ(µ̄) = Φ(µ̄). For w ≥ µ̄, the approximation is exact when µ̄ ≤ w ≤ B1 or w ≥ B1 + µ̄. Next, we obtain a concave overestimator of Ψ(w) in Lemma 5.2 that will be used to compute the lifting coefficients of the remaining variables in Theorem 5.4.
Lemma 5.2. Assume that ai > 0. Define

wi :=
  0        if ai < µ̄,
  j + 1    if Bj + µ̄ ≤ ai < Bj+1 + µ̄, j = 0, . . . , r − 3,
  r − 1    if Br−2 + µ̄ ≤ ai.

Let W^i_0 = 0, W^i_j = Bj−1 + µ̄ for j = 1, . . . , wi, and W^i_{wi+1} = ai. Further, redefine a_{wi+2} = ai − W^i_{wi}. Define ψ^i_0(w) = w/µ̄ and

ψ^i_j(w) = Ψ(W^i_j) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] (w − W^i_j)

for
Figure 5-4. A valid subadditive approximation Ψ(w) of Φ(w) for Example 5.6.
j = 1, . . . , wi. Then, the function

ψ(w) := min { ψ^i_j(w) | j ∈ {0, . . . , wi} }    (5–68)

is a concave overestimator of Ψ(w).
Proof. First, ψ(w) is concave since it is obtained as the minimum of a finite number of affine functions. If ψ(w) = ψ^i_0(w) = w/µ̄, then ψ^i_0(w) clearly overestimates Ψ(w). Note that the slope of ψ^i_j(w) is no less than that of ψ^i_{j′}(w) for j < j′ since a_{j+2} ≤ a_{j′+2} implies 1/a_{j+2} ≥ 1/a_{j′+2}. Therefore, the minimum in (5–68) is attained at j = l if w ∈ [W^i_l, W^i_{l+1}] for l ∈ {1, . . . , wi}. Further, since ψ^i_l(W^i_l) = Ψ(W^i_l), ψ^i_l(W^i_{l+1}) = Ψ(W^i_{l+1}), and Ψ(w) is convex for w ∈ [W^i_l, W^i_{l+1}], we conclude that ψ(w) = ψ^i_l(w) ≥ Ψ(w).
The concave overestimator ψ(w) of Lemma 5.2 can be used to obtain lifting
coefficients. In order to determine whether the resulting inequality is facet-defining,
we introduce the following notation.
For i ∈ N \ C, we define I(ai) to be the function that returns 0 if Φ(ai) = Ψ(ai) and returns 1 otherwise, i.e.,

I(ai) :=
  0   if B_{wi−1} + µ̄ ≤ ai < B_{wi} or ai ≥ B_{r−2} + µ̄,
  1   if B_{wi} ≤ ai < B_{wi} + µ̄ or ai < µ̄.
Theorem 5.4. Under Assumptions (C1) and (C2),

∑_{j∈C} xj + ∑_{i∈N\C} αixi + ∑_{i∈N\C} βiyi ≥ |C| − 1    (5–69)

defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) when

(αi, βi) ∈ {(0, ai/µ̄)} ∪ {(Ψ(ai), 0)} ∪ ⋃_{j=1}^{wi} { ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai ) }    (5–70)

where µ̄, W^i_j and wi are as defined in Lemma 5.2. In particular, if, for all i ∈ N \ C,
(i) I(ai) = 0, or
(ii) I(ai) = 1, ai ≥ µ̄, and

(αi, βi) ∈ {(0, ai/µ̄)} ∪ ⋃_{j=1}^{wi−1} { ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai ) },
then (5–69) is facet-defining for PB.
Proof. It follows from Theorem 5.3 that Ψ(w) is a valid subadditive approximation of
Φ(w) for w ≥ 0. Hence, lifted inequalities will be valid whenever the lifting coefficients
(αi, βi) of (xi, yi) for i ∈ N \ C satisfy the condition
αixi + βiyi ≥ Ψ(aixiyi) ≥ Φ(aixiyi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {0, 0}. (5–71)
Condition (5–71) can be restated as
βiφ ≥ Ψ(0) ≥ Φ(0) for 0 < φ ≤ 1, (5–72)
αi + βiφ ≥ Ψ(aiφ) ≥ Φ(aiφ) for 0 ≤ φ ≤ 1. (5–73)
To prove that a lifted inequality defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) under condition (5–70), we show that at least one point (xi, yi) is always tight in (5–71) and that two points (xi, yi) are tight in (5–71) if Assumption (i) or (ii) holds. First, consider the case where (αi, βi) = (0, ai/µ̄). Since µ̄ > 0 and αi + βiφ = βiφ = (ai/µ̄)φ ≥ Ψ(aiφ) ≥ Ψ(0) ≥ Φ(0), conditions (5–72) and (5–73) are clearly satisfied. Condition (5–71) is always satisfied at equality for the point (1, 0). Further, if Assumption (i) or (ii) holds, (5–69) is facet-defining since another tight point is added in the lifting procedure. More precisely, since Ψ(w) = w/µ̄ for 0 ≤ w ≤ µ̄, we see that aiφ/µ̄ = Ψ(aiφ) = Φ(aiφ) if φ = min{1, µ̄/ai}, which shows that (5–71) is satisfied at equality for the point (1, min{1, µ̄/ai}). Next, consider the case where (αi, βi) = (Ψ(ai), 0). It is easily verified that (5–71) is satisfied at equality for the point (0, 1) since βi = 0. Finally, consider
(αi, βi) = ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j , [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] ai )
for i ∈ N \ C. Clearly, (αi, βi) satisfies (5–72) since βi ≥ 0. From Lemma 5.2, we have that

Φ(aiφ) ≤ Ψ(aiφ) ≤ Ψ(W^i_j) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] (aiφ − W^i_j)
        = ( Ψ(W^i_j) − [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] W^i_j ) + [(Ψ(W^i_{j+1}) − Ψ(W^i_j))/a_{j+2}] aiφ
        = αi + βiφ.

We also verify that (5–71) is satisfied at equality for the two points (1, W^i_j/ai) and (1, W^i_{j+1}/ai), except for the case where j = wi if I(ai) = 0. Therefore, we conclude that (5–69) defines a face of PB of dimension at least (2n − 1) − ∑_{i∈N\C} I(ai) and is facet-defining for PB if Assumption (i) or (ii) is satisfied for all i ∈ N \ C.
Inequalities (5–69) can be facet-defining depending on the value of the coefficients ai
and the choice of lifting coefficients (αi, βi) for i ∈ N \ C. We mention that inequality
(5–69) can be facet-defining even if the assumptions (i) and (ii) of Theorem 5.4 are not
satisfied. We present an example with this property next. A similar example is given for
the 0−1 knapsack polytope by Gu et al. [62]; see the example following Theorem 6.
Example 5.6. For the bilinear set B in Example 5.2, consider the cover C = {3, 4, 5}. We can easily verify that C satisfies Assumptions (C1) and (C2). It follows from Proposition 5.11 that

x3 + x4 + x5 ≥ 2    (5–74)
is facet-defining for PB({1, 2}, ∅, {1, 2}, ∅). Using the result of Theorem 5.3, the valid subadditive approximation Ψ(w) of the lifting function Φ(w) is given by:

Ψ(w) =
  w/3              if 0 ≤ w < 3,
  1                if 3 ≤ w < 17,
  2 − (20 − w)/3   if 17 ≤ w < 20,
  2                if 20 ≤ w.
Applying Theorem 5.4, we obtain the following nine inequalities

{ (21/3)y1 ; (14/17)x1 + (21/17)y1 ; 2x1 } + { (19/3)y2 ; (21/24)x2 + (19/24)y2 ; (5/3)x2 } + x3 + x4 + x5 ≥ 2

(one inequality for each way of selecting a term from each brace),
which define faces of PB of dimension at least 8. Since 2 ∈ N \ C and a2 ≥ µ̄ with I(a2) = 1, it follows from Theorem 5.4 that the three inequalities
{ (21/3)y1 ; (14/17)x1 + (21/17)y1 ; 2x1 } + (19/3)y2 + x3 + x4 + x5 ≥ 2
are facet-defining for PB. The two inequalities
(21/3)y1 + (21/24)x2 + (19/24)y2 + x3 + x4 + x5 ≥ 2  and  (21/3)y1 + (5/3)x2 + x3 + x4 + x5 ≥ 2
are also facet-defining for PB although they do not satisfy the assumptions in Theorem 5.4.
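As with Example 5.3, the validity of all nine inequalities can be checked by sampling. The sketch below assumes the data of Example 5.2, a = (21, 19, 17, 15, 10) and d = 20, and tests every combination of coefficient choices on random feasible points of B:

```python
import itertools
import random

a, d = [21, 19, 17, 15, 10], 20      # assumed data of Example 5.2

# Coefficient choices for (x1, y1) and (x2, y2) read off Example 5.6.
choices = {
    0: [(0.0, 21.0 / 3.0), (14.0 / 17.0, 21.0 / 17.0), (2.0, 0.0)],
    1: [(0.0, 19.0 / 3.0), (21.0 / 24.0, 19.0 / 24.0), (5.0 / 3.0, 0.0)],
}

random.seed(2)
points = []
while len(points) < 2000:
    x = [random.randint(0, 1) for _ in range(5)]
    y = [random.random() for _ in range(5)]
    if sum(a[j] * x[j] * y[j] for j in range(5)) >= d:   # (x, y) is a point of B
        points.append((x, y))

ok = all(
    c1[0] * x[0] + c1[1] * y[0] + c2[0] * x[1] + c2[1] * y[1]
    + x[2] + x[3] + x[4] >= 2 - 1e-9
    for c1, c2 in itertools.product(choices[0], choices[1])
    for (x, y) in points
)
print(ok)  # -> True
```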
5.4 New Facet-Defining Inequalities for a Single-node Flow Model
In Section 5.3, we derived strong valid inequalities for the bilinear set B using lifting.
In this section, we show that many of these lifted inequalities are also facet-defining for the
convex hull of the single-node flow model without inflows

F = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j=1}^{n} ajyj ≥ d, xj ≥ yj ∀j ∈ N }.
To eliminate uninteresting cases, we assume that ∑_{j=1}^{n} aj > d + ai for all i ∈ N. In the following lemma, we first show that F ⊆ B.
Lemma 5.3. The bilinear covering set B is a relaxation of the single-node flow set F .
Proof. We prove that F ⊆ B. Let (x, y) be an arbitrary point of F. Clearly, x ∈ {0, 1}^n and y ∈ [0, 1]^n. It therefore remains to show that ∑_{j=1}^{n} ajxjyj ≥ d. Let N0 = {j ∈ N | xj = 0} and N1 = {j ∈ N | xj = 1}. Since (x, y) ∈ F, yj = 0 for all j ∈ N0. Then,

∑_{j=1}^{n} ajxjyj = ∑_{j∈N1} ajyj = ∑_{j=1}^{n} ajyj ≥ d,

where the last inequality holds because (x, y) ∈ F.
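Lemma 5.3 is easy to exercise numerically: any sampled point of F must satisfy the bilinear cover constraint of B. The sketch below uses hypothetical data (chosen to match Example 5.2 for concreteness):

```python
import random

random.seed(3)
a, d, n = [21, 19, 17, 15, 10], 20, 5   # hypothetical data, as in Example 5.2

checked = 0
for _ in range(20000):
    x = [random.randint(0, 1) for _ in range(n)]
    # Enforce x_j >= y_j by drawing y_j = 0 whenever x_j = 0.
    y = [random.random() * x[j] for j in range(n)]
    if sum(a[j] * y[j] for j in range(n)) >= d:           # (x, y) ∈ F
        # Lemma 5.3: the point must also satisfy the constraint defining B.
        assert sum(a[j] * x[j] * y[j] for j in range(n)) >= d - 1e-9
        checked += 1
print("verified F ⊆ B on", checked, "sampled points")
```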
The general single-node flow set is important in mixed-integer programming since it
can be used as a relaxation of any inequality in a 0−1 mixed-integer program. Further,
these sets naturally arise in fixed-charge network problems; see [6, 61, 81, 83, 97]. The
single-node flow set F without inflows was first studied by Padberg et al. [97] under the assumptions that (i) ai ≤ d and (ii) ∑_{j=1}^{n} aj > d + ai for all i ∈ N. In particular, the authors show that the inequalities ∑_{j=1}^{n} ajyj ≥ d, xi ≥ yi, xi ≤ 1, and yi ≥ 0 for all i ∈ N are facets of PF := conv(F). In the remainder of this section, we refer to these
inequalities as trivial facets of PF . Padberg et al. [97] also prove the following result.
Lemma 5.4 (Proposition 8 in Padberg et al. [97]). If αx + βy ≥ δ is a nontrivial facet of
PF , then α ≥ 0, β ≥ 0, and δ > 0.
Proposition 5.23. Let αx + βy ≥ δ be a nontrivial facet of PF . Then, αx + βy ≥ δ is
also facet-defining for PB.
Proof. To prove αx + βy ≥ δ defines a facet of PB, it suffices to show that αx + βy ≥ δ is
valid for B since F ⊆ B by Lemma 5.3. If (x, y) ∈ F , then it is clear that αx + βy ≥ δ.
Therefore, consider the case where (x, y) ∈ B \ F. Let N0 = {j ∈ N | xj = 0} and N1 = {j ∈ N | xj = 1}. We have that ∑_{j=1}^{n} ajxjyj = ∑_{j∈N1} ajyj ≥ d and yj ≤ 1 for all j ∈ N1. Further, yi > 0 for some i ∈ N0 since otherwise (x, y) ∈ F. Define (x̄, ȳ) such that x̄ = x, ȳj = yj for all j ∈ N1, and ȳj = 0 for all j ∈ N0. Since (x̄, ȳ) ∈ F, we obtain that αx̄ + βȳ ≥ δ. Further, since β ≥ 0 from Lemma 5.4, we obtain that αx + βy ≥ αx̄ + βȳ ≥ δ. We conclude that αx + βy ≥ δ is valid for PB.
We note that Proposition 5.23 is slightly surprising in light of Lemma 5.3 because, on one hand, F ⊊ B and, on the other hand, the nontrivial facets of PF are facets of PB. In other words, the structure of PF can be determined from PB by including the trivial facets of PF. As a consequence of Proposition 5.23, we can exploit studies of PF to obtain facets of PB.
Example 5.7. Consider the single-node flow set

F = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19y1 + 17y2 + 15y3 + 10y4 ≥ 20, xj ≥ yj, j = 1, . . . , 4 },

corresponding to the bilinear covering set B discussed in Example 5.1. We obtained the linear description of PF using PORTA. This linear description is given in the Appendix. We observe that inequalities (5–9), (5–10), (5–16), and (5–17) are facets for both PB and PF. However, it can be verified that inequalities (5–11), (5–12), (5–14), and (5–15) are facet-defining for PB but not for PF.
We mention that the inequalities of PF described in the Appendix have been
numbered according to their counterparts in PB. For the set F , Padberg et al. [97]
derived a family of facet-defining inequalities that we describe in the following proposition.
Proposition 5.24 (Adapted from Proposition 12 in Padberg et al. [97]). Assume that (i) C is a cover with excess µ = ∑_{j∈C} aj − d such that a = max_{j∈C} aj > µ, and (ii) L ⊆ N \ C is such that 0 < a − µ < ak ≤ a for all k ∈ L and ∑_{j∈N\L} aj > d + a. Then

∑_{j∈C} (aj − µ)+xj + ∑_{j∈L} (a − µ)xj + ∑_{j∈N\(C∪L)} ajyj ≥ ∑_{j∈C} (aj − µ)+    (5–75)
is facet-defining for PF .
We next show that inequalities (5–75) can be obtained as lifted bilinear cover
inequalities (5–47).
Proposition 5.25. Inequality (5–75) can be obtained as a lifted bilinear cover inequality
(5–47) of PB.
Proof. Let C and L ⊆ N \ C be given that satisfy conditions (i) and (ii) of Proposition 5.24. Observe that ∑_{j∈N\(C∪L)} aj > a − µ since ∑_{j∈N\L} aj > d + a. Define C = C, M = L, and T = N \ (C ∪ L). Clearly, µ = µ and (C, M, T) is a partition of N that satisfies Assumptions (A1), (A2), and (A3) in Theorem 5.1. We obtain from Assumption (ii) that A1 − µ < ai ≤ A1 < A2 − µ for i ∈ M, which implies that qi = 1 for all i ∈ M in Lemma 5.1. Further, since Q^i_1 = A1 − µ and Q^i_2 = ai for i ∈ M, we can select (αi, βi) as (A1 − µ, 0). Since A1 − µ = a − µ, we obtain that (5–75) is a lifted bilinear cover inequality (5–47).
We next characterize the lifted bilinear cover inequalities of PB that also yield facets
for PF .
Theorem 5.5. A lifted bilinear cover inequality (5–47) is facet-defining for PF if and only if

(αi, βi) ∈ {(0, ai)} ∪ ⋃_{j=1}^{qi} { ( PC(Q^i_j) − [(PC(Q^i_{j+1}) − PC(Q^i_j))/a_{j+1}] Q^i_j , [(PC(Q^i_{j+1}) − PC(Q^i_j))/a_{j+1}] ai ) }    (5–76)

for all i ∈ M.
Proof. It is obvious that (5–47) is valid for F since F ⊆ B as described in Lemma 5.3.
We show that (5–47) is facet-defining for PF if (αi, βi) for i ∈ M are selected as in
condition (5–76). Recall that (5–47) is obtained in Section 5.3 by lifting the seed
inequality (5–35) which is facet-defining for PB(M,C \ {l},M,C \ {l}). We prove
first that (5–35) is facet-defining for PF(M, C \ {l}, M, C \ {l}). Consider the points p0 = (el, el), pj = (el + ej, el) for j ∈ T, q0 = (el, ((al − µ)/al) el), q1 = (∑_{j∈T} ej, d̄ ∑_{j∈T} ej), and qk = (∑_{j∈T} ej, d̄ ∑_{j∈T} ej + ε((1/a_{k−1}) e_{k−1} − (1/a_k) e_k)) for k = 2, . . . , |T|, where d̄ = (al − µ)/∑_{j∈T} aj and ε > 0. It can be easily verified that these points belong to PB(M, C \ {l}, M, C \ {l}) and are affinely independent. Further, they also belong to F since yj ≤ xj for all j ∈ T and yl ≤ xl. Therefore, (5–35) is facet-defining for PF(M, C \ {l}, M, C \ {l}). Now, it suffices to show that sufficiently many of the tight points added when lifting the variables (xi, yi) for i ∈ M ∪ C \ {l} belong to F. When we lift the variables (xi, yi) fixed at (1, 1) for i ∈ C \ {l} in the proof
of Proposition 5.15, we add the two affinely independent points (0, 0) and(1, (ai−µ)+
ai
)
that both belong to F . When lifting the variables (xi, yi) fixed at (0, 0) for i ∈ M in
Theorem 5.1, we add two points that differ depending on the choice of coefficients (αi, βi).
In the case where (αi, βi) = (0, ai), we add the two points (1, 0) and(1,min{1, A1−µ
ai}),
which belong to F . For the remaining cases in (5–76), we add the two points(1,
Qij
ai
)and
(1,
Qij+1
ai
)that both belong to F .
Next, we show that if (5–47) is facet-defining for PF , then (αi, βi) must be chosen
as in (5–76). It suffices to show that if (αi, βi) = (PC(ai), 0) for some i ∈ M such that
PC(ai) ≠ PC(Q^i_{qi}), then (5–47) is not facet-defining for PF . We will do so by showing that
in this case (5–47) can be obtained by combining another inequality of the form (5–47) for
PF and trivial facets yi ≤ xi of PF . Assume for simplicity that only one pair (αm, βm) of
lifting coefficients is chosen to be (PC(am), 0). If qm = 0 (i.e., 0 < am < A1 − µ), then
inequality (5–47) defined with (αm, βm) = (0, am) is

am ym + ∑_{i∈C} (ai − µ)+ xi + ∑_{j∈T} aj yj + ∑_{i∈M\{m}} αi xi + ∑_{i∈M\{m}} βi yi ≥ ∑_{i∈C} (ai − µ)+    (5–77)

and is facet-defining for PF . Since PC(am) = am for 0 < am < A1 − µ, the inequality
(5–47) with (αm, βm) = (PC(am), 0) can be obtained by combining the inequality (5–77)
and am(xm − ym) ≥ 0. If qm > 0, then we have that Q^m_{qm} = A_{qm} − µ, Q^m_{qm+1} = am, and
a^m_{qm+1} = am − Q^m_{qm}. When A_{qm} < am < A_{qm+1} − µ, inequality (5–47) reduces to:
( PC(Q^m_{qm}) − [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] Q^m_{qm} ) xm
+ ( [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] am ) ym + ∑_{i∈C} (ai − µ)+ xi + ∑_{j∈T} aj yj    (5–78)
+ ∑_{i∈M\{m}} αi xi + ∑_{i∈M\{m}} βi yi ≥ ∑_{i∈C} (ai − µ)+

and is facet-defining for PF . Further, we can obtain inequality (5–47) with (αm, βm) =
(PC(am), 0) by combining (5–78) and

( [(PC(Q^m_{qm+1}) − PC(Q^m_{qm})) / a^m_{qm+1}] am ) (xm − ym) ≥ 0,
since PC(am) − PC(Q^m_{qm}) > 0. If there exist several m ∈ M with (αm, βm) = (PC(am), 0),
then we can apply the same idea for each m since the coefficients of the variables other than
(xm, ym) are unchanged at each step of the argument.
We show in the following example that the family of lifted bilinear cover inequalities
is larger than (5–75).
Example 5.8. As established in Example 5.7, (5–9) and (5–10) are facet-defining
lifted bilinear cover inequalities (5–47) for both PB and PF that are obtained by choos-
ing (C,M, T ) = ({3, 4}, {1}, {2}) and (C,M, T ) = ({2, 4}, {1}, {3}) respectively in
Theorem 5.1. However, (5–9) and (5–10) cannot be obtained using the results of Propo-
sition 5.24 since in (5–75) at most one of the coefficients of xi and yi is nonzero for all
i ∈ N .
Other families of inequalities are known for PF . In particular, Gu et al. [61] studied
the general single-node flow set
G = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j∈N+} aj yj − ∑_{j∈N−} aj yj ≤ d, xj ≥ yj ∀j ∈ N },
where N = N+ ∪ N− and n = |N |. The flow set F we study can be obtained from G
by restricting the set of inflows to be empty, i.e., setting N+ = ∅ and replacing d with −d. Using
sequence-independent lifting techniques, the authors derived two families of strong valid
inequalities for G. Among them, only one applies to F . We investigate this family of lifted
simple generalized flow cover inequalities (LSGFCI) next. We first show in Proposition 5.27 that
LSGFCIs are facet-defining for PF under an additional assumption. We then derive these
inequalities from lifted reverse bilinear cover inequalities.
We now briefly review the work of Gu et al. [61]. For the general single-node flow
set G, a set C = C+ ∪ C− is called a generalized cover if C+ ⊆ N+, C− ⊆ N−, and
∑_{j∈C+} aj − ∑_{j∈C−} aj = d + λ with λ > 0; see Van Roy and Wolsey [127] and Nemhauser
and Wolsey [91]. For the special case where N+ = ∅ in G, a generalized cover of F is
defined as C ⊆ N such that ∑_{j∈C} aj = d − λ with λ > 0. Given a generalized cover C for
F , we obtain the following single-node flow model

F 0 = { (x, y) ∈ {0, 1}^n × [0, 1]^n | ∑_{j∈N\C} aj yj ≥ d − ∑_{j∈C} aj = λ, xj ≥ yj ∀j ∈ N \ C },

by fixing the variables (xj , yj) to (1, 1) for all j ∈ C. This set is denoted by X0 in Gu
et al. [61]. Note that F 0 is full-dimensional since F is assumed to be full-dimensional, i.e.,
∑_{j∈N\C} aj = ∑_{j∈N} aj − d + λ > ai + λ for all i ∈ N . The simple generalized flow cover
inequality (SGFCI)

∑_{j∈L} λxj + ∑_{j∈N\(C∪L)} aj yj ≥ λ    (5–79)
is not always facet-defining for conv(X0) where L = {j ∈ N \ C | aj > λ}. In
Proposition 5.26 below, we prove that (5–79) is facet-defining for PF under the condition
that ∑_{j∈N\L} aj > d.
Proposition 5.26. The simple generalized flow cover inequality (SGFCI)

∑_{j∈L} d xj + ∑_{j∈N\L} aj yj ≥ d    (5–80)

is valid for PF where L = {j ∈ N | aj > d}. Further, (5–80) is facet-defining for PF if
∑_{j∈N\L} aj > d.
Proof. Validity follows from Corollary 4 in Van Roy and Wolsey [127]. We prove that
(5–80) is facet-defining if ∑_{j∈N\L} aj > d. Consider first the case where L = ∅. Then,
the result is obvious since (5–80) is one of the trivial facets of PF . If L ≠ ∅, then we
show that (5–80) is facet-defining for PF using the same arguments as in the proof of
Proposition 5.9. In particular, consider the 2n points pl, p̄l for all l ∈ L and qk, q̄k for all
k ∈ N \ L used in the proof of Proposition 5.9. It can be easily verified that these points
belong to F . It follows that (5–80) is facet-defining for PF .
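As a numerical sanity check of Proposition 5.26, one can sample feasible points of F and verify that (5–80) is never violated. The following Python sketch does so on hypothetical data a = (6, 4, 3, 2), d = 5 (so that L = {1}); it merely illustrates the claim and is not part of the proof:

```python
import random

def sgfci_holds(a, d, x, y, tol=1e-9):
    """Evaluate SGFCI (5-80): sum_{j in L} d*x_j + sum_{j not in L} a_j*y_j >= d,
    where L = {j : a_j > d}."""
    L = {j for j in range(len(a)) if a[j] > d}
    lhs = sum(d * x[j] if j in L else a[j] * y[j] for j in range(len(a)))
    return lhs >= d - tol

random.seed(0)
a, d = [6.0, 4.0, 3.0, 2.0], 5.0   # hypothetical data; a_1 > d, so L = {1}
violations = 0
for _ in range(2000):
    x = [random.randint(0, 1) for _ in a]
    y = [random.uniform(0.0, xj) for xj in x]       # enforce y_j <= x_j
    if sum(aj * yj for aj, yj in zip(a, y)) >= d:   # keep only points of F
        if not sgfci_holds(a, d, x, y):
            violations += 1
assert violations == 0
```

A strictly positive count of violations would witness an invalid cut; validity of (5–80) guarantees the count stays zero.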
Proposition 5.27 (Extended from Theorem 12 in Gu et al. [61]). Assume that (i) C ⊆ N
is a generalized cover for F such that ∑_{j∈C} aj = d − λ with λ > 0 and (ii) L ≠ ∅ and
∑_{j∈N\L} aj > d where L = {j ∈ N \ C | aj > λ}. Assume also that L = {j1, j2, . . . , jr} with
a_{ji} ≥ a_{ji+1} for i = 1, . . . , r − 1. Let r = |L|, A0 = 0, and Ai = ∑_{k=1}^{i} a_{jk} for i = 1, . . . , r.
Further, let d′ = ∑_{j∈N\C} aj − λ. Define

f(z) =
  iλ            if Ai ≤ z ≤ Ai+1 − λ, i = 0, . . . , r − 1,
  z − Ai + iλ   if Ai − λ ≤ z ≤ Ai, i = 1, . . . , r − 1,
  z − Ar + rλ   if Ar − λ ≤ z ≤ d′.
                                                            (5–81)

Then, the lifted simple generalized flow cover inequality (LSGFCI)

∑_{j∈L} λxj + ∑_{j∈C} f(aj)xj + ∑_{j∈N\(C∪L)} aj yj ≥ λ + ∑_{j∈C} f(aj)    (5–82)
is facet-defining for PF .
Proof. Given a generalized cover C for F , after fixing (xj , yj) = (1, 1) for all j ∈ C,
we obtain from Proposition 5.26 that (5–79) is facet-defining for conv(F 0). To lift the
variables (xi, yi) for i ∈ C, we compute the lifting function f(z) associated with (5–79) as

f(z) = min  −λ + ∑_{j∈L} λxj + ∑_{j∈N\(C∪L)} aj yj
       s.t. ∑_{j∈N\C} aj yj ≥ λ + z
            yj ≤ xj , ∀j ∈ N \ C,
            xj ∈ {0, 1}, yj ∈ [0, 1], ∀j ∈ N \ C.
Since L 6= ∅, it follows from Theorem 10 in Gu et al. [61] that the lifting function f(z)
is given by (5–81). Further, f(z) is superadditive over R+ since the condition (1) of
Corollary 2 in Gu et al. [61] holds. Therefore, we conclude from Proposition 5.13 that
(5–82) is facet-defining for PF .
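For illustration, the piecewise formula (5–81) transcribes directly into code. The sketch below is a hypothetical Python rendering (the data a_L = (5, 4), λ = 2, d′ = 12 are invented for the example) and follows the index ranges stated above:

```python
def lsgfci_lift(z, a_L, lam, d_prime):
    """Lifting function f(z) of (5-81); a_L lists a_{j_1} >= ... >= a_{j_r}."""
    r = len(a_L)
    A = [0.0]                                  # A_0 = 0, A_i = a_{j_1}+...+a_{j_i}
    for aj in a_L:
        A.append(A[-1] + aj)
    for i in range(r):                         # flat pieces: f = i*lambda
        if A[i] <= z <= A[i + 1] - lam:
            return i * lam
    for i in range(1, r):                      # sloped pieces just below each A_i
        if A[i] - lam <= z <= A[i]:
            return z - A[i] + i * lam
    if A[r] - lam <= z <= d_prime:             # final sloped piece up to d'
        return z - A[r] + r * lam
    raise ValueError("z outside [0, d']")

# With a_L = (5, 4) and lam = 2: A = (0, 5, 9)
assert lsgfci_lift(0.0, [5.0, 4.0], 2.0, 12.0) == 0.0
assert lsgfci_lift(4.0, [5.0, 4.0], 2.0, 12.0) == 1.0
assert lsgfci_lift(6.0, [5.0, 4.0], 2.0, 12.0) == 2.0
assert lsgfci_lift(8.0, [5.0, 4.0], 2.0, 12.0) == 3.0
```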
We show next that inequalities of the form (5–82) for PF are lifted reverse bilinear
cover inequalities for PB. The proof uses the observation that a cover C̄ of the bilinear set
B can be obtained from a generalized cover C of the flow set F by adding one element l of
L, i.e., C̄ = C ∪ {l} where l ∈ L.

Proposition 5.28. Inequality (5–82) for PF can be obtained as a lifted reverse bilinear
cover inequality (5–59) of PB.

Proof. For a given generalized cover C of F , we define a cover C̄ of B as C̄ = C ∪ {l}
for l ∈ L, which is possible since L ≠ ∅, and ∑_{j∈C̄} aj = d + µ̄ > d where µ̄ = al − λ > 0. The cover
C̄ satisfies Assumption (i) in Theorem 5.2 because al − µ̄ = λ > 0 for l ∈ C̄. Further,
we can set M̄ = L \ {l} in (5–59). Assumption (ii) also holds since
∑_{j∈N\(L\{l})} aj − d = ∑_{j∈N\L} aj + al − d > 0. Next, we observe that C̄ ∪ M̄ = C ∪ L and
that min{ai, al − µ̄} = al − µ̄ = λ for all i ∈ M̄ . Substituting al − µ̄ = λ in Proposition 5.19,
we obtain that PM (w) = g(w) since M̄ ∪ {l} = L. Therefore, we conclude that (5–82) can
be obtained as a lifted reverse bilinear cover inequality (5–59).
5.5 Concluding Remarks
In this chapter, we studied the polyhedral structure of the 0−1 mixed-integer bilinear
covering set. We described the convex hull of this set when n = 2. We also presented
three families of lifted inequalities obtained using sequence-independent lifting. Among
them, two families have an exponential number of members. We also studied the relations
between 0−1 mixed-integer bilinear covering sets and single-node flow sets without inflows.
In particular, we showed that inequalities for bilinear sets are valid for flow sets and we
proved that all nontrivial facets of PF can be obtained through the study of PB. We then
showed that the inequalities we derived generalize classical families of lifted flow cover
inequalities for PF .
CHAPTER 6
A COMPUTATIONAL STUDY OF LIFTED INEQUALITIES FOR 0-1
MIXED-INTEGER BILINEAR COVERING SETS 1
6.1 Introduction
In this chapter, we seek to evaluate the quality of the valid inequalities we derived in
Chapter 5 for problems that contain bilinear covering constraints of the form
∑_{j=1}^{n} aj xj yj ≥ d    (6–1)
where aj > 0, xj ∈ {0, 1}, and yj ∈ [0, 1] for j = 1, . . . , n. To this end, we will
consider several randomly generated instances that we will solve with branch-and-bound
with and without the addition of our cuts. In Section 6.2, we describe a set S that is a
generalization of the bilinear covering set B that includes additional linear terms. This
set appears naturally during the branch-and-bound process as we describe in Section 6.2.
We then show that two families of inequalities we derived in Chapter 5 have natural
counterparts for S. This extends the applicability of our results from the root node of the
branch-and-bound tree to any node inside of the tree. In Section 6.3, we describe a family
of randomly generated problems that we use to test the strength of our cuts. We then
present the result of a computational study on these instances. In particular, we compare
our results to those obtained when linearizing the bilinear terms. In Section 6.4, we give
concluding remarks.
6.2 Generalization to Bilinear Constraints with Linear Terms
In practice, there are two common ways of using cuts inside of a branch-and-bound
framework. In the first, cutting planes are only added at the root node of the tree. This
variant is often referred to as cut-and-branch. In the second, cutting planes are added
throughout the tree. The generic term branch-and-cut is usually reserved for this variant.
1 The material of this chapter is based on [33].
Note that, if we aim to design a cut-and-branch algorithm for problems containing
(6–1), the results of Chapter 5 are sufficient. However, if our goal is to apply cuts inside of
a branch-and-cut framework, then we must also investigate what becomes of the constraint
(6–1) inside of the tree.
Assume for example that we decide to branch on variable x1 at the root node. The
two branches created will now have the restrictions: (i) x1 = 0 and (ii) x1 = 1. In the
former case, (6–1) reduces to

∑_{j=2}^{n} aj xj yj ≥ d,
which has the same form as (6–1). However, in the latter case, after branching, constraint
(6–1) will be of the form

∑_{j=2}^{n} aj xj yj + a1 y1 ≥ d, (6–2)

which does not follow the template set in (6–1).
Similarly, when branching on the continuous variable y1 at branching point ω ∈ (0, 1),
we obtain two branches where (i) y1 ≤ ω and (ii) y1 ≥ ω. In the former case, after
re-scaling the variable y1, we can write

ω a1 x1 ȳ1 + ∑_{j=2}^{n} aj xj yj ≥ d,

where ȳ1 = y1/ω and 0 ≤ ȳ1 ≤ 1. This constraint is of the form (6–1). When branching on
y1 ≥ ω, we introduce the new variable ȳ1 = (y1 − ω)/(1 − ω). Substituting in (6–1), we obtain

a1 x1 ((1 − ω) ȳ1 + ω) + ∑_{j=2}^{n} aj xj yj ≥ d,

i.e.,

(1 − ω) a1 x1 ȳ1 + ∑_{j=2}^{n} aj xj yj + ω a1 x1 ≥ d, (6–3)

where 0 ≤ ȳ1 ≤ 1. This expression again does not conform with (6–1). This suggests that,
when using branch-and-cut, it is useful to consider the generalization of the constraint
(6–1) that is given by

∑_{j∈J} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ d (6–4)
where N = I ∪ J . We assume that J 6= ∅ but I might be empty. Note that (6–4) contains
linear terms in both x and y. The linear terms in yj are not associated with any bilinear
term xjyj as can be observed in (6–2). Linear terms in xj however always appear with a
corresponding bilinear term xjyj as can be observed in (6–3).
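The change of variables behind (6–3) can be checked numerically. The short Python sketch below (with hypothetical values a1 = 3 and ω = 0.4) confirms that the term a1 x1 y1 of (6–1) equals the two terms (1 − ω) a1 x1 ȳ1 + ω a1 x1 of (6–3) under ȳ1 = (y1 − ω)/(1 − ω):

```python
a1, w = 3.0, 0.4                 # hypothetical coefficient and branching point
for x1 in (0, 1):
    for ybar1 in (0.0, 0.25, 0.5, 1.0):      # rescaled variable in [0, 1]
        y1 = (1 - w) * ybar1 + w             # recover the original y1 >= w
        term_61 = a1 * x1 * y1                              # term in (6-1)
        term_63 = (1 - w) * a1 * x1 * ybar1 + w * a1 * x1   # terms in (6-3)
        assert abs(term_61 - term_63) < 1e-12
```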
As a result, we consider in this section the set

S := { (x, y) ∈ {0, 1}^n × [0, 1]^{n+m} | ∑_{j∈J} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ d },
where J = {1, . . . , n}, I = {n + 1, . . . , n + m}, aj > 0, bj ≥ 0, and d > 0. Similar to
Chapter 5, we require
Assumption 6.1. ∑_{j∈J} (aj + bj) + ∑_{j∈I} aj ≥ d + ai + bi for all i ∈ J .
We denote the convex hull of the set S as PS := conv(S). Assumption 6.1 guarantees
that PS is full-dimensional, i.e., dim(PS) = 2n + m. We now derive facet-defining
inequalities for PS using sequence-independent lifting. Proposition 6.1 describes the basic
constraint we use to generate the seed inequality of our lifting procedures.
Proposition 6.1. Assume that (i) a1 + b1 > d, (ii) ∑_{j∈J\{1}} (aj + bj) + ∑_{j∈I} aj > d, and
(iii) ∑_{j∈J\{1}} bj < d. Then,

( d − ∑_{j∈J\{1}} bj ) x1 + ∑_{j∈J\{1}} aj yj + ∑_{j∈I} aj yj ≥ d − ∑_{j∈J\{1}} bj (6–5)
is facet-defining for PS.
Proof. We first show that (6–5) is valid for S. Assume for a contradiction that there exists
(x′, y′) ∈ S such that

( d − ∑_{j∈J\{1}} bj ) x′1 + ∑_{j∈J\{1}} aj y′j + ∑_{j∈I} aj y′j < d − ∑_{j∈J\{1}} bj .
Clearly, x′1 = 0. It follows that
d > ∑_{j∈J\{1}} (aj y′j + bj) + ∑_{j∈I} aj y′j ≥ ∑_{j∈J\{1}} (aj x′j y′j + bj x′j) + ∑_{j∈I} aj y′j = ∑_{j∈J} (aj x′j y′j + bj x′j) + ∑_{j∈I} aj y′j .
This is a contradiction to the fact that (x′, y′) ∈ S.
Next, we prove that (6–5) is facet-defining for PS by providing 2n + m points in S
satisfying (6–5) at equality such that the solutions (α, β, δ) to the system αxi + βyi = δ
for i = 1, . . . , 2n + m yield inequalities αx + βy ≥ δ that are scalar multiples of (6–5).
Consider the two points p1 = (e1, e1) and q1 = (e1, (1 − ε)e1) where ε > 0 is sufficiently
small. Clearly, p1 and q1 belong to S because of (i) and satisfy (6–5) at equality. From p1
and q1, we obtain that α1 + β1 = δ and α1 + (1 − ε)β1 = δ, which implies that α1 = δ and
β1 = 0. Next, we define the n − 1 points pk = (e1 + ek, e1) for k = 2, . . . , n. Finally, let
d̄ = (d − ∑_{j∈J\{1}} bj) / ∑_{j∈I∪J\{1}} aj . From assumptions (ii) and (iii), it is easily verified that 0 < d̄ < 1. For
l = 2, . . . , n + m, we construct the n + m − 1 points
q_l = ( ∑_{j∈J\{1}} ej , d̄ ∑_{j∈I∪J\{1}} ej )   when l = 2

and

q_l = ( ∑_{j∈J\{1}} ej , d̄ ∑_{j∈I∪J\{1}} ej + ε ( (1/a_{l−1}) e_{l−1} − (1/a_l) e_l ) )   when l ≥ 3
where ε > 0 is sufficiently small. It can be verified that the points pk and ql belong to S
and satisfy (6–5) at equality. From pk for k = 2, . . . , n, we obtain that α1 + αk + β1 = δ,
which implies that αk = 0 for k = 2, . . . , n since α1 = δ and β1 = 0. Further, using the
points ql for l = 2, . . . , n + m, we obtain the system of equations:

∑_{j∈J\{1}} αj + d̄ ∑_{j∈I∪J\{1}} βj = δ, (6–6)

∑_{j∈J\{1}} αj + d̄ ∑_{j∈I∪J\{1}} βj + ε ( β_{l−1}/a_{l−1} − β_l/a_l ) = δ, l = 3, . . . , n + m. (6–7)

By subtracting (6–6) from (6–7), we conclude that there exists θ such that β_{l−1}/a_{l−1} = β_l/a_l = θ
for l = 3, . . . , n + m. Substituting αk = 0 for k = 2, . . . , n and βl = θ al for l = 2, . . . , n + m
in (6–6), we obtain that d̄ θ ∑_{j∈I∪J\{1}} aj = δ, i.e., θ = δ / (d − ∑_{j∈J\{1}} bj). It follows that
βl = (δ / (d − ∑_{j∈J\{1}} bj)) al for all l = 2, . . . , n + m. Therefore, we conclude that α1 = δ,
β1 = 0, αk = 0 for k = 2, . . . , n, and βl = (δ / (d − ∑_{j=2}^{n} bj)) al for l = 2, . . . , n + m, which proves
that (6–5) is facet-defining for PS.
Observe that, when I = ∅ and bj = 0 for j ∈ J , (6–5) is an inequality of the form
(5–23). We will use (6–5) to construct the seed inequality of our lifting procedure in a way
that is analogous to that we used in Chapter 5.
In the ensuing discussions, we use the following notation. For J0, J1 ⊆ J with
J0 ∩ J1 = ∅, J̄0, J̄1 ⊆ J with J̄0 ∩ J̄1 = ∅, and I1 ⊆ I, we define

S(J0, J1; J̄0, J̄1, I1) := { (x, y) ∈ S | xj = 0 for j ∈ J0, xj = 1 for j ∈ J1, yj = 0 for j ∈ J̄0, yj = 1 for j ∈ J̄1, yj = 1 for j ∈ I1 }.
To build a seed inequality of the form (6–5), we again use the concept of a cover and
adapt it for the set S as follows.
Definition 6.1. We say that C ⊆ J is a cover for S if ∑_{j∈C} (aj + bj) > d. Further, we
define the excess of C as µ = ∑_{j∈C} (aj + bj) − d > 0.
To generate lifted inequalities for PS, we partition the set J into (C,M, T ) and the
set I into (I0, I1) so that the following assumptions are satisfied:

(A1) C is a cover for S with excess µ,
(A2) Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj where l ∈ argmax{aj + bj | j ∈ C} and Al = al + bl,
(A3) ∑_{j∈C∪T} (aj + bj) + ∑_{j∈I} aj > d + Al.
To derive lifted inequalities from (C,M, T ) and (I0, I1), we fix the variables (xj, yj)
for j ∈ M to (0, 0), the variables (xj, yj) for j ∈ C \ {l} to (1, 1), and the variables yj for
j ∈ I1 to 1. The resulting set S(M,C \ {l};M,C \ {l}, I1) is defined by the inequality

al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I0} aj yj ≥ d − ∑_{j∈C\{l}} (aj + bj) − ∑_{j∈I1} aj = Al − µ − ∑_{j∈I1} aj .
Note that the right-hand-side of this expression is nonnegative because of (A2). It follows
from Assumption (A3) that

∑_{j∈T} (aj + bj) + ∑_{j∈I0} aj > d + Al − ∑_{j∈C} (aj + bj) − ∑_{j∈I1} aj = Al − µ − ∑_{j∈I1} aj .
Using the result of Proposition 6.1 with Assumption (A2), we obtain that

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–8)

where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj , is facet-defining for PS(M,C \ {l};M,C \ {l}, I1).

We now lift (6–8) to construct the seed inequality. We first reintroduce the continuous
variables yj for j ∈ I1 in (6–8). The lifting function corresponding to (6–8) is defined as:

L(w) := max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
        s.t. al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I0} aj yj ≥ Al − µ − ∑_{j∈I1} aj − w    (6–9)
             xj ∈ {0, 1} j ∈ {l} ∪ T, yj ∈ [0, 1] j ∈ {l} ∪ T ∪ I0.
Next, we derive a closed-form expression for the function L(w).
Proposition 6.2. The lifting function L(w) is given by

L(w) =
  −∞         if w < −ā − µ̄,
  w + µ̄      if −ā − µ̄ ≤ w < −µ̄,
  0           if −µ̄ ≤ w < 0,
  w           if 0 ≤ w < Al − µ̄,
  Al − µ̄     if Al − µ̄ ≤ w,

where ā = ∑_{j∈T} aj + ∑_{j∈I0} aj and µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj .
Proof. Observe that there exists an optimal solution (x∗, y∗) that satisfies x∗j = 1 for j ∈ T
and y∗l = 1 since the coefficients of xj for j ∈ T and yl in the objective are zero. Hence,
L(w) can be rewritten as

L(w) = max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
       s.t. Al xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ − w    (6–10)
            xl ∈ {0, 1}, yj ∈ [0, 1] j ∈ T ∪ I0.

Now, define ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj) / ā . Since 0 ≤ ȳ ≤ 1, we can further simplify L(w) as:

L(w) = max (Al − µ̄) − { (Al − µ̄) xl + ā ȳ }
       s.t. Al xl + ā ȳ ≥ Al − µ̄ − w
            xl ∈ {0, 1}, ȳ ∈ [0, 1].
This problem has the same structure as (5–37) in Proposition 5.14. Its optimal value can
be obtained similarly and yields the given expression for L(w).
Lemma 6.1. L(w) is subadditive over R− and R+ respectively.
Proof. By setting al = Al, µ = µ̄, and ∑_{j∈T} aj = ā in Proposition 5.14, we conclude that
L(w) is subadditive over R− and R+.
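For concreteness, the closed form of Proposition 6.2 translates directly into code. The following Python sketch is illustrative only; the data ā = 21, µ̄ = 5, Al = 15 are those of Example 6.1 below:

```python
import math

def lifting_L(w, a_bar, mu_bar, A_l):
    """Piecewise closed form of L(w) from Proposition 6.2."""
    if w < -a_bar - mu_bar:
        return -math.inf          # lifting problem infeasible
    if w < -mu_bar:
        return w + mu_bar
    if w < 0:
        return 0.0
    if w < A_l - mu_bar:
        return w
    return A_l - mu_bar           # saturated for large w

# Example 6.1 below: a_bar = 15 + 6 = 21, mu_bar = 5, A_l = 15
assert lifting_L(-10, 21, 5, 15) == -5
assert lifting_L(-2, 21, 5, 15) == 0
assert lifting_L(4, 21, 5, 15) == 4
assert lifting_L(12, 21, 5, 15) == 10
```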
We next lift the variables yj for j ∈ I1 from 1. Lifting is simple to perform since L(w)
is subadditive over R−.
Proposition 6.3. Under Assumptions (A1), (A2), and (A3), the inequality

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–11)

is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅).
Proof. We know from Proposition 6.1 that (6–8) is facet-defining for PS(M,C \ {l};M,C \ {l}, I1). Since L(w) is subadditive over R−, lifting coefficients βi for yi where i ∈ I1 are
valid if

βi(yi − 1) ≥ L(ai yi − ai) for 0 ≤ yi < 1. (6–12)

Since L(w) ≤ 0 for w ≤ 0, L(ai yi − ai) ≤ 0 for all yi ∈ [0, 1). Therefore, coefficients βi = 0
satisfy condition (6–12). Further, (6–12) is satisfied at equality at the point y∗i = (ai − µ̄)/ai .
Therefore, we conclude that (6–11) is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅).
We next compute the lifting function associated with (6–11):

LI(w) := max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. al xl yl + bl xl + ∑_{j∈T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ Al − µ − w
              xj ∈ {0, 1} j ∈ {l} ∪ T, yj ∈ [0, 1] j ∈ {l} ∪ T ∪ I.
Observe that this lifting function is very similar in structure to L(w) presented in
(6–9). Therefore, we can adapt the result of Proposition 6.2 as follows.
Lemma 6.2.

LI(w) =
  −∞         if w < −ā − µ̄,
  w + µ̄      if −ā − µ̄ ≤ w < −µ̄,
  0           if −µ̄ ≤ w < 0,
  w           if 0 ≤ w < Al − µ̄,
  Al − µ̄     if Al − µ̄ ≤ w.
Proof. Observe that there exists an optimal solution (x∗, y∗) such that x∗j = 1 for j ∈ T ,
y∗l = 1, and y∗j = 1 for j ∈ I1, since the corresponding coefficients of xj for j ∈ T , yl, and yj
for j ∈ I1 in the objective are zero. Hence, LI(w) is rewritten as

LI(w) = max (Al − µ̄) − { (Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
        s.t. Al xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ − w    (6–13)
             xl ∈ {0, 1}, yj ∈ [0, 1] j ∈ T ∪ I0.

Note that this problem has the form of (6–10) and therefore, optimal solutions can be
obtained in the same way. The result follows.
Because the result of Lemma 6.2 is obtained similarly to that of Proposition 6.2, we
conclude from Lemma 6.1 that LI(w) is subadditive over R− and R+.
Lemma 6.3. LI(w) is subadditive over R− and R+ respectively.
We now obtain two different families of inequalities by reintroducing the variables
(xj, yj) for j ∈ M ∪ C \ {l} in different orders.
6.2.1 Generalized Lifted Bilinear Cover Inequalities
To obtain a generalized lifted bilinear cover inequality from the seed inequality (6–11),
we will lift the variables in C \ {l} before lifting the variables in M . Lifting the variables
(xi, yi) for i ∈ C \ {l} is simple since LI(w) is subadditive over R−.
Proposition 6.4. Under Assumptions (A1), (A2), and (A3),

∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ ∑_{j∈C} (aj + bj − µ̄)+ (6–14)

is facet-defining for PS(M, ∅;M, ∅, ∅).
Proof. We know from Proposition 6.3 that (6–11) is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅). Since LI(w) is subadditive over R−, lifting coefficients (αi, βi) of variables (xi, yi)
for i ∈ C \ {l} are valid if they satisfy the condition

αi(xi − 1) + βi(yi − 1) ≥ LI((ai xi yi + bi xi) − (ai + bi)) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(1, 1)}.    (6–15)

Condition (6–15) can be rewritten as:

βi ≤ inf_{0≤φ<1} −LI(ai φ − ai) / (1 − φ),    (6–16)

αi + sup_{0≤φ≤1} βi(1 − φ) ≤ −LI(−(ai + bi)).    (6–17)

Since LI(w) ≤ 0 for w ≤ 0, condition (6–16) is satisfied when βi = 0. In addition, if we
choose αi = −LI(−(ai + bi)), condition (6–17) is satisfied as

αi + sup_{0≤φ≤1} βi(1 − φ) = αi = −LI(−(ai + bi)).

We obtain that αi = (ai + bi − µ̄)+ using the expression of LI(w) given in Lemma 6.2. Since
(6–15) is satisfied at equality at the points (0, 0) and (1, (ai − µ̄)/ai), (6–14) is facet-defining for
PS(M, ∅;M, ∅, ∅).
To obtain facet-defining inequalities for PS, we next lift the remaining variables
(xi, yi) for i ∈ M . The lifting function LC(w) corresponding to (6–14) is defined as

LC(w) := max ∑_{j∈C} (aj + bj − µ̄)+ − { ∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. ∑_{j∈C∪T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ ∑_{i∈C} (ai + bi) − µ − w    (6–18)
              xj ∈ {0, 1} j ∈ C ∪ T, yj ∈ [0, 1] j ∈ (C ∪ T ) ∪ I.
We now derive a closed-form expression for LC(w). We assume without loss of generality
that C = {1, . . . , p} and a1 + b1 ≥ a2 + b2 ≥ . . . ≥ ap + bp. Let q ∈ C be such that
aq + bq > µ̄ ≥ aq+1 + bq+1. Further, define E0 = 0 and Ei = ∑_{j=1}^{i} (aj + bj) for all i ∈ C.
Note that Ep = ∑_{j=1}^{p} (aj + bj) = d + µ.
Proposition 6.5. For w ≥ 0,

LC(w) =
  w − iµ̄      if Ei ≤ w < Ei+1 − µ̄, i = 0, . . . , q − 1,
  Ei − iµ̄     if Ei − µ̄ ≤ w < Ei, i = 1, . . . , q − 1,
  Eq − qµ̄     if Eq − µ̄ ≤ w,

where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj .
Proof. First, observe that there exists an optimal solution (x∗, y∗) of (6–18) in which
x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ C ∪ I1 since the corresponding objective coefficients
are zero. Since aq + bq > µ̄ ≥ aq+1 + bq+1 for q ∈ C, we have that (aj + bj − µ̄)+ = 0 for
j = q + 1, . . . , p, which also implies that there exists an optimal solution (x∗, y∗) in which
x∗j = 1 for j = q + 1, . . . , p. Therefore, using the same notation ā = ∑_{j∈T} aj + ∑_{j∈I0} aj and
ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj)/ā as in Proposition 6.2, LC(w) can be restated as:

LC(w) = max ∑_{j=1}^{q} (aj + bj − µ̄) − { ∑_{j=1}^{q} (aj + bj − µ̄) xj + ā ȳ }
        s.t. ∑_{j=1}^{q} (aj + bj) xj + ā ȳ ≥ ∑_{j=1}^{q} (aj + bj) − µ̄ − w
             xj ∈ {0, 1} j = 1, . . . , q, ȳ ∈ [0, 1].
This lifting function has the form of (5–43). Therefore, the proof of Proposition 5.16 can
be followed to obtain the result.
Using the result of Corollary 5.1, we can easily verify that LC(w) is subadditive over
R+. The result of Proposition 6.5 is illustrated in the following example.
Example 6.1. Consider the set S defined as
(9y1 + 12)x1 + (14y2 + 5)x2 + (15y3 + 2)x3 + (10y4 + 5)x4 + (7y5 + 3)x5 + 6y6 + y7 ≥ 23.

Assume that partitions of J and I are given as (C,M, T ) = ({4, 5}, {1, 2}, {3}) and
(I0, I1) = ({6}, {7}) respectively. It can be easily verified that these partitions satisfy
Assumptions (A1)-(A3) since C is a cover with µ = 2, Al − µ = 15 − 2 > 2 + 1 = b3 + a7,
and ∑_{j∈C∪T} (aj + bj) + ∑_{j∈I} aj = (10 + 5 + 7 + 3 + 15 + 2) + (6 + 1) = 51 > 23 + 15 = d + Al.
We obtain from Proposition 6.4 that the inequality
15y3 + 10x4 + 5x5 + 6y6 ≥ 15 (6–19)

is facet-defining for PS(M, ∅;M, ∅, ∅). Using the result of Proposition 6.5, the lifting
function LC(w) is computed as

LC(w) =
  w       if 0 ≤ w < 10,
  10      if 10 ≤ w < 15,
  w − 5   if 15 ≤ w < 20,
  15      if 20 ≤ w.
Function LC(w) is represented in Figure 6-1.
[Figure 6-1. Lifting function LC(w) of (6–19), with breakpoints E1 − µ̄ = 10, E1 = 15, and E2 − µ̄ = 20.]
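For concreteness, the closed form of Proposition 6.5 can also be evaluated mechanically. The Python sketch below is illustrative only; it takes the values aj + bj for j ∈ C in decreasing order and reproduces the LC(w) values computed above for Example 6.1:

```python
def lifting_LC(w, ab_C, mu_bar):
    """Closed form of LC(w) from Proposition 6.5, for w >= 0.
    ab_C: values a_j + b_j for j in C, sorted in decreasing order."""
    q = sum(1 for v in ab_C if v > mu_bar)      # largest index with a_q+b_q > mu_bar
    E = [0.0]
    for v in ab_C:
        E.append(E[-1] + v)                     # partial sums E_i
    for i in range(q):                          # pieces of slope 1
        if E[i] <= w < E[i + 1] - mu_bar:
            return w - i * mu_bar
    for i in range(1, q):                       # flat pieces just below each E_i
        if E[i] - mu_bar <= w < E[i]:
            return E[i] - i * mu_bar
    return E[q] - q * mu_bar                    # saturated: w >= E_q - mu_bar

# Example 6.1: C = {4, 5}, a_j + b_j = (15, 10), mu_bar = 5
assert lifting_LC(5, [15.0, 10.0], 5.0) == 5
assert lifting_LC(12, [15.0, 10.0], 5.0) == 10
assert lifting_LC(17, [15.0, 10.0], 5.0) == 12
assert lifting_LC(25, [15.0, 10.0], 5.0) == 15
```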
Next, we compute the lifting coefficients of variables (xi, yi) for i ∈ M using LC(w).
Similar to the derivation of lifted bilinear cover inequalities (5–47), lifting coefficients
(αi, βi) for i ∈ M must be chosen to satisfy

αi xi + βi yi ≥ LC(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–20)
For the variables (x1, y1) of Example 6.1, LC(aixiyi + bixi) is represented in Figure 6-2
(a). Observe from Condition (6–20) that lifting coefficients (α1, β1) must be chosen in
such a way that the plane αixi + βiyi overestimates LC(aixiyi + bixi). Using a geometric
interpretation similar to that used in Lemma 5.1, we obtain several possible overestimating
planes as shown in Figure 6-2 (b).
To obtain an overestimating plane αixi+βiyi, we next describe a concave overestimator
of LC(w) over [bi, ai + bi].
Lemma 6.4. Assume that ai > 0 and bi ≥ 0. Define qi and ri as

qi :=
  0   if 0 ≤ bi < E1 − µ̄,
  i   if Ei − µ̄ ≤ bi ≤ Ei+1 − µ̄, i = 1, . . . , q − 1,
  q   if Eq − µ̄ < bi,
[Figure 6-2. Deriving lifting coefficients for Example 6.1: (a) the function LC(9y + 12) over (x, y) ∈ {0, 1} × [0, 1]; (b) overestimating planes αx + βy.]
and
ri :=
  0   if 0 ≤ ai + bi < E1 − µ̄,
  i   if Ei − µ̄ ≤ ai + bi ≤ Ei+1 − µ̄, i = 1, . . . , q − 1,
  q   if Eq − µ̄ < ai + bi.

Let Q^i_0 = bi, Q^i_j = E_{qi+j} − µ̄ for j = 1, . . . , ri − qi, and Q^i_{ri−qi+1} = ai + bi. Further, let
δ^i_j = Q^i_j − Q^i_{j−1} for j = 1, . . . , ri − qi + 1. Define

p^i_j(w) = LC(Q^i_j) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (w − Q^i_j)

for j = 0, . . . , ri − qi. Then, the function

p(w) := min{ p^i_j(w) | j ∈ {0, . . . , ri − qi} }    (6–21)

is a concave overestimator of LC(w) over [bi, ai + bi].
Theorem 6.1. Under Assumptions (A1), (A2), and (A3), a generalized lifted bilinear
cover inequality

∑_{j∈C} (aj + bj − µ̄)+ xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj + ∑_{i∈M} αi xi + ∑_{i∈M} βi yi ≥ ∑_{j∈C} (aj + bj − µ̄)+    (6–22)

is facet-defining for PS if, for each i ∈ M ,

(αi, βi) ∈ { (LC(ai + bi), 0) } ∪ { (LC(bi), LC(ai + bi) − LC(bi)) }   if qi = ri,
(αi, βi) ∈ { (LC(ai + bi), 0) } ∪ ⋃_{j=0}^{ri−qi} ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) , [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai )   if qi < ri,

where qi, ri, Q^i_j , and δ^i_{j+1} are as defined in Lemma 6.4.
Proof. By subadditivity of LC(w) for w ≥ 0, the lifting coefficients (αi, βi) of (xi, yi) for
i ∈ M will be valid if they satisfy the condition
αi xi + βi yi ≥ LC(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–23)

Condition (6–23) can be rewritten as:

βi φ ≥ LC(0) for 0 < φ ≤ 1, (6–24)

αi + βi φ ≥ LC(ai φ + bi) for 0 ≤ φ ≤ 1. (6–25)
To prove that the lifted inequality (6–22) is facet-defining, we show that the proposed
coefficients (αi, βi) satisfy (6–24) and (6–25) and describe two points (xi, yi) for which
(6–23) is satisfied at equality. First, consider the case where (αi, βi) = (LC(ai + bi), 0).
Condition (6–24) is satisfied since βi = 0 and LC(0) = 0. Condition (6–25) also holds
because αi = LC(ai + bi) and LC(w) is non-decreasing. Further, (6–23) is satisfied at
equality at the two points, (0, φ) for some 0 < φ < 1 and (1, 1). Second, consider the case
where (αi, βi) = (LC(bi), LC(ai + bi)− LC(bi)) with qi = ri. Since LC(ai + bi) ≥ LC(bi) and
LC(w) is nondecreasing, we have that βi ≥ 0. Further, since LC(0) = 0, (6–24) is satisfied.
Using

LC(ai φ + bi) ≤ LC(bi) + [(LC(ai + bi) − LC(bi)) / ai] (ai φ + bi − bi) = αi + βi φ,
it is clear that (6–25) holds. It can be easily verified that (6–23) is satisfied at equality at
the two points (1, 0) and (1, 1). Finally, consider the case
(αi, βi) = ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) , [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai )

for qi < ri. Clearly, (αi, βi) satisfies (6–24) since βi ≥ 0. From Lemma 6.4, we have that

LC(ai φ + bi) ≤ LC(Q^i_j) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (ai φ + bi − Q^i_j)
             = ( LC(Q^i_j) − [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] (Q^i_j − bi) ) + [(LC(Q^i_{j+1}) − LC(Q^i_j)) / δ^i_{j+1}] ai φ
             = αi + βi φ.
We can also verify that (6–25) is satisfied at equality at the two points (1, (Q^i_j − bi)/ai) and
(1, (Q^i_{j+1} − bi)/ai). Therefore, we conclude that (6–22) is facet-defining for PS.
Note that the family of generalized lifted bilinear cover inequalities (6–22) has an
exponential number of members similar to lifted bilinear cover inequalities (5–47). This is
illustrated in Example 6.2.
Example 6.2. In Example 6.1, we obtained that (6–19) is facet-defining for PS(M, ∅;M, ∅, ∅).
Applying Theorem 6.1, we obtain the six inequalities

{ 15x1, or 10x1 + (45/8)y1 } + { 14x2, or 5x2 + 14y2, or (70/9)x2 + (56/9)y2 } + 15y3 + 10x4 + 5x5 + 6y6 ≥ 15,

which are all facet-defining for PS. We obtain from Figure 6-2 that there are two choices
for the lifting coefficients of (x1, y1). Similarly, we can determine that there are three
choices for (x2, y2).
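Condition (6–25) for the six inequalities above can be verified numerically. The following Python sketch is illustrative only; LC_ex hard-codes the lifting function of Example 6.1, and the check confirms that every listed coefficient pair (αi, βi) overestimates LC(ai φ + bi) on a grid of φ ∈ [0, 1]:

```python
def LC_ex(w):
    """Lifting function LC(w) of Example 6.1 (mu_bar = 5, E1 = 15, E2 = 25)."""
    if 0 <= w < 10:
        return float(w)
    if w < 15:
        return 10.0
    if w < 20:
        return w - 5.0
    return 15.0

# (a_i, b_i) -> coefficient choices (alpha_i, beta_i) listed in Example 6.2
choices = {
    (9.0, 12.0): [(15.0, 0.0), (10.0, 45.0 / 8)],
    (14.0, 5.0): [(14.0, 0.0), (5.0, 14.0), (70.0 / 9, 56.0 / 9)],
}
violated = 0
for (a, b), pairs in choices.items():
    for alpha, beta in pairs:
        for k in range(101):                     # grid of phi values in [0, 1]
            phi = k / 100.0
            if alpha + beta * phi < LC_ex(a * phi + b) - 1e-9:
                violated += 1
assert violated == 0                             # all choices satisfy (6-25)
```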
6.2.2 Generalized Lifted Reverse Bilinear Cover Inequalities
In Section 6.2.1, we obtained a generalized lifted bilinear cover inequality by first
lifting the variables in C \ {l} and then lifting the remaining variables in M inside of
(6–11). In this section, we derive another family of lifted inequalities by changing the
lifting order. We lift (6–11) with respect to the variables (xj, yj) for j ∈ M first before
the variables (xj, yj) for j ∈ C \ {l}. Among the assumptions concerning the partition
(C,M, T ), (A2) can be changed into

(A2’) Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj for some l ∈ C where Al = al + bl.

This is less stringent than (A2) since l can be chosen to be any element in C
satisfying Al − µ > ∑_{j∈T} bj + ∑_{j∈I1} aj and not only the largest one.
Proposition 6.6. Under Assumptions (A1), (A2’), and (A3),

(Al − µ̄) xl + ∑_{j∈M} min{aj + bj , Al − µ̄} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄ (6–26)

is facet-defining for PS(∅, C \ {l}; ∅, C \ {l}, ∅).
Proof. From Proposition 6.3, we know that

(Al − µ̄) xl + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ̄

is facet-defining for PS(M,C \ {l};M,C \ {l}, ∅). Since LI(w) is subadditive over R+ as
shown in Lemma 6.3, lifting coefficients (αi, βi) of variables (xi, yi) for i ∈ M are valid if
they satisfy the condition:

αi xi + βi yi ≥ LI(ai xi yi + bi xi) for (xi, yi) ∈ {0, 1} × [0, 1] \ {(0, 0)}. (6–27)

Condition (6–27) can be rewritten as:

βi φ ≥ LI(0) for 0 < φ ≤ 1, (6–28)

αi + βi φ ≥ LI(ai φ + bi) for 0 ≤ φ ≤ 1. (6–29)
We now show that (αi, βi) = (min{ai + bi, Al − µ̄}, 0) are valid lifting coefficients for the
variables (xi, yi). Since LI(0) = 0, (6–28) is trivially satisfied with βi = 0. Further, since
LI(ai φ + bi) = min{ai φ + bi, Al − µ̄} ≤ min{ai + bi, Al − µ̄} = αi, (6–29) also holds.
To prove that (6–26) is facet-defining, consider the two points (1, 1) and (0, φ∗) for any
0 < φ∗ < 1. These points satisfy (6–27) at equality, and therefore, we conclude that (6–26)
is facet-defining for PS(∅, C \ {l}; ∅, C \ {l}, ∅).
To obtain facet-defining inequalities for PS, we next reintroduce the remaining
variables (xj, yj) for j ∈ C \ {l} in (6–26). To this end, we derive a closed form expression
for the function

LM(w) := min { (Al − µ̄)(xl − 1) + ∑_{j∈M} min{aj + bj , Al − µ̄} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj }
         s.t. al xl yl + bl xl + ∑_{j∈M∪T} (aj xj yj + bj xj) + ∑_{j∈I} aj yj ≥ Al − µ + w
              xj ∈ {0, 1} j ∈ {l} ∪ J \ C, yj ∈ [0, 1] j ∈ ({l} ∪ J \ C) ∪ I.
Let M = M1 ∪ M2 where M1 = {i ∈ M | ai + bi > Al − µ} and M2 = M \ M1. Assume without loss of generality that {l} ∪ M1 = {1, . . . , q} and a1 + b1 ≥ a2 + b2 ≥ . . . ≥ aq + bq, where q = |M1| + 1. Further, define E0 = 0 and Ei = ∑_{j=1}^{i} (aj + bj) for all i = 1, . . . , q. Note that Al + ∑_{j∈M∪T} (aj + bj) = Eq + ∑_{j∈M2} (aj + bj) + ∑_{j∈T} (aj + bj).
Proposition 6.7. LM(w) is given by:

LM(w) =
  −(Al − µ̄)               if w < −(Al − µ̄),
  w − Ei + i(Al − µ̄)       if Ei − (Al − µ̄) ≤ w < Ei,
  i(Al − µ̄)                if Ei ≤ w < Ei+1 − (Al − µ̄),
  w − Eq + q(Al − µ̄)       if Eq − (Al − µ̄) ≤ w ≤ ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d,
  ∞                        if ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d < w,

for i = 0, . . . , q − 1, where µ̄ = µ + ∑_{j∈T} bj + ∑_{j∈I1} aj.
Proof. First, we observe that there exists an optimal solution (x∗, y∗) in which x∗j = 1 for j ∈ T and y∗j = 1 for j ∈ {l} ∪ M ∪ I1, since the corresponding objective coefficients are zero. Using the same notation a = ∑_{j∈T} aj + ∑_{j∈I0} aj and ȳ = (∑_{j∈T} aj yj + ∑_{j∈I0} aj yj)/a as in the proof of Proposition 5.19, as well as the definition of M1, we can simplify LM(w) as:

LM(w) = min { ∑_{j∈{l}∪M1} (Al − µ)xj + ∑_{j∈M2} (aj + bj)xj + a ȳ } − (Al − µ)
    s.t. ∑_{j∈{l}∪M1} (aj + bj)xj + ∑_{j∈M2} (aj + bj)xj + a ȳ ≥ Al − µ + w,
         xj ∈ {0, 1} for j ∈ {l} ∪ M1 ∪ M2,   ȳ ∈ [0, 1].
Note that a = ∑_{j∈T} aj + ∑_{j∈I0} aj > Al − µ ≥ ai + bi for all i ∈ M2 because of Assumption (A3) and the definition of M2. Further, by introducing the additional notation ā = ∑_{j∈M2} (aj + bj) + a and ỹ = (∑_{j∈M2} (aj + bj)xj + a ȳ)/ā, LM(w) can be written as:

LM(w) = min { ∑_{j∈{l}∪M1} (Al − µ)xj + ā ỹ } − (Al − µ)
    s.t. ∑_{j∈{l}∪M1} (aj + bj)xj + ā ỹ ≥ Al − µ + w,
         xj ∈ {0, 1} for j ∈ {l} ∪ M1,   ỹ ∈ [0, 1].

Since this problem has the same structure as (5–57), its optimal value can be computed using the same technique. The result then follows from the proof of Proposition 5.19.
It can be verified from Proposition 5.20 that LM(w) is superadditive.

Lemma 6.5. The function LM(w) is superadditive over [0, ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d].
Therefore, we can use the sequence-independent lifting technique for the remaining variables (xi, yi) for i ∈ C \ {l}.

Theorem 6.2. Under Assumptions (A1), (A2'), and (A3), the generalized lifted reverse bilinear cover inequality

(Al − µ)xl + ∑_{i∈C\{l}} LM(ai + bi) xi + ∑_{j∈M} min{aj + bj, Al − µ} xj + ∑_{j∈T} aj yj + ∑_{j∈I0} aj yj ≥ Al − µ + ∑_{i∈C\{l}} LM(ai + bi)   (6–30)

is facet-defining for PS.
Proof. Since LM(w) is superadditive over w ∈ [0, ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d], valid lifting coefficients (αi, βi) for the variables (xi, yi) where i ∈ C \ {l} can be obtained if they satisfy the condition

αi(1 − xi) + βi(1 − yi) ≤ LM((ai + bi) − (ai xi yi + bi xi))   for (xi, yi) ∈ {0, 1} × [0, 1] \ {(1, 1)}.   (6–31)

Condition (6–31) can be rewritten as:

βi ≤ inf_{0≤φ<1} LM(ai − ai φ)/(1 − φ),   (6–32)
αi + sup_{0≤φ≤1} βi(1 − φ) ≤ LM(ai + bi).   (6–33)

Since PS is assumed to be full-dimensional, we obtain from Assumption 6.1 that ai + bi ≤ ∑_{j∈J}(aj + bj) + ∑_{j∈I} aj − d for all i ∈ C. We next verify that (LM(ai + bi), 0) are valid lifting coefficients for (xi, yi) where i ∈ C \ {l}. Clearly, βi = 0 satisfies (6–32) since LM(w) ≥ 0 for w ≥ 0. Further, if αi = LM(ai + bi), then (6–33) is satisfied since αi + sup_{0≤φ≤1} βi(1 − φ) = αi = LM(ai + bi). Finally, since (6–31) is tight at the two points (0, 0) and (1, (ai − A1 + Al − µ)+/ai), we conclude that (6–30) is facet-defining for PS.
6.3 Preliminary Computational Study

We now perform a preliminary computational study to evaluate the strength of the lifted inequalities developed in Chapter 5 inside a branch-and-cut algorithm. In Section 6.3.1, we describe the testing environment, including software and hardware configurations. In Section 6.3.2, we describe the instances on which we carry out the empirical study. We then present implementation details about separation procedures and performance measures in Section 6.3.3. Finally, in Section 6.3.4, we report numerical results showing that our lifted inequalities can help solve families of bilinear problems faster.
6.3.1 Computational Environments

We implement a branch-and-cut algorithm using CPLEX [40] 11.1. CPLEX is one of the most widely used commercial MIP solvers. It provides callable libraries that allow users to customize cut generation inside the branch-and-bound tree. All computational tests are carried out on the server iseunix.ise.ufl.edu, running Red Hat Linux 5 on a Dell PowerEdge 2600 with two 3.2 GHz Pentium 4 processors (1 MB cache) and 6 GB of memory.
6.3.2 Testing Instances

As a preliminary computational study, we evaluate how our lifted inequalities perform on problems containing mostly bilinear covering constraints. In particular, we randomly generate MINLP problems that minimize a linear objective subject to independent bilinear covering constraints. These constraints are coupled with cardinality constraints. As a result of their construction, these instances are sparse. More precisely, the testing instances are formulated as

      min  ∑_{i∈M} ∑_{j∈N} fij xij + ∑_{i∈M} ∑_{j∈N} gij yij
      s.t. ∑_{j∈N} aij xij yij ≥ di,   i ∈ M,
(B)        ∑_{i∈M} xij ≤ cj,   j ∈ N,
           xij ∈ {0, 1},   i ∈ M, j ∈ N,
           yij ∈ [0, 1],   i ∈ M, j ∈ N,

where M := {1, . . . , m} and N := {1, . . . , n}. The parameters aij for all i ∈ M and j ∈ N are nonnegative integers randomly generated from uniform distributions. Parameters di for i ∈ M are created by multiplying ∑_{j∈N} aij by a random number from the interval [l, u] where 0 < l < u < 1, as presented in Table 6-1. Parameters cj for j ∈ N are similarly chosen as multiples of |M| with random numbers in the interval [l, u]. Infeasible instances are discarded. The coefficients fij and gij of the objective are chosen to be proportional to aij.
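The generation scheme described above can be sketched as follows (a Python sketch; the rounding choices and the treatment of range endpoints are our assumptions based on Table 6-1, not the thesis's actual generator, and feasibility of the resulting instance is not checked here):

```python
import random

def generate_instance(m, n, a_range, d_frac, c_frac, f_ratio, g_ratio, seed=0):
    """Generate one random instance of problem (B) following the scheme of
    Section 6.3.2; all range arguments are (low, high) pairs as in Table 6-1.
    Infeasible instances would be discarded by the caller, as in the thesis."""
    rng = random.Random(seed)
    M, N = range(m), range(n)
    a = [[rng.randint(*a_range) for _ in N] for _ in M]            # nonnegative integers
    d = [sum(a[i]) * rng.uniform(*d_frac) for i in M]              # d_i = frac * sum_j a_ij
    c = [max(1, round(m * rng.uniform(*c_frac))) for _ in N]       # c_j = frac * |M|
    f = [[a[i][j] * rng.uniform(*f_ratio) for j in N] for i in M]  # f_ij proportional to a_ij
    g = [[a[i][j] * rng.uniform(*g_ratio) for j in N] for i in M]  # g_ij proportional to a_ij
    return a, d, c, f, g

# Parameters of test set I-20-10 (Table 6-1).
a, d, c, f, g = generate_instance(20, 10, (5, 100), (0.3, 0.99),
                                  (0.3, 0.7), (0.5, 3.0), (0.7, 1.3))
```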
We create three sets of instances by changing the size of the sets M and N as well as the ranges of the parameters. For each parameter setting, we generate 10 instances. We specify the parameters used in the generation of random instances in Table 6-1.

Table 6-1. Parameters of the random instances for three test sets
TestSetID   |M|   |N|   aij        di/∑_{j∈N} aij   cj/|M|        fij/aij      gij/aij
I-20-10      20    10   [5, 100]   [0.3, 0.99]      [0.3, 0.7]    [0.5, 3.0]   [0.7, 1.3]
I-50-15      50    15   [10, 70]   [0.3, 0.99]      [0.3, 0.75]   [0.5, 3.0]   [0.7, 1.3]
I-100-20    100    20   [10, 70]   [0.4, 0.95]      [0.3, 0.8]    [0.5, 3.0]   [0.7, 1.3]
For these instances, we compare the results obtained by first linearizing the bilinear constraints and then solving the problem with CPLEX using default MIP cuts with the results obtained by adding our cuts to the linearization and then calling CPLEX. In particular, the linearization of problem (B) is derived by introducing auxiliary variables wij to represent the products xij yij. The resulting linearization is:

       min  ∑_{i∈M} ∑_{j∈N} fij xij + ∑_{i∈M} ∑_{j∈N} gij yij
       s.t. ∑_{j∈N} aij wij ≥ di,   i ∈ M,
            ∑_{i∈M} xij ≤ cj,   j ∈ N,
            xij − wij ≥ 0,   i ∈ M, j ∈ N,
(LB)        yij − wij ≥ 0,   i ∈ M, j ∈ N,
            xij + yij − wij ≤ 1,   i ∈ M, j ∈ N,
            xij ∈ {0, 1},   i ∈ M, j ∈ N,
            yij ∈ [0, 1],   i ∈ M, j ∈ N,
            wij ≥ 0,   i ∈ M, j ∈ N.
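The exactness of this linearization for integral xij can be checked directly: the three constraints linking wij to xij and yij are the standard McCormick inequalities for a product over [0, 1]², and they pin wij to xij yij whenever xij is binary. A small sketch (plain Python, enumerating a grid of y values):

```python
# For x in {0,1} and y in [0,1], the (LB) constraints
#   w <= x,  w <= y,  w >= x + y - 1,  w >= 0
# leave a single feasible value w = x * y, so the linearization is exact
# once x is integral. We verify this on a grid of y values.

def feasible_w_interval(x, y):
    lo = max(0.0, x + y - 1.0)   # w >= 0 and w >= x + y - 1
    hi = min(float(x), y)        # w <= x and w <= y
    return lo, hi

for x in (0, 1):
    for k in range(101):
        y = k / 100
        lo, hi = feasible_w_interval(x, y)
        assert abs(lo - x * y) < 1e-12 and abs(hi - x * y) < 1e-12
print("McCormick linearization is exact for binary x")
```

For fractional x the interval is not a single point, which is why (LB) is only a relaxation of (B) at the LP level.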
Table 6-2. Characteristics of the three test sets
TestSetID   Bin    Cont   Bil   Card   NZ      MILP        LP          Gap(%)
I-20-10     200    400    20    10     1800    12242.44    11777.84    3.77
I-50-15     800    1600   50    15     6750    46823.60    46188.26    1.35
I-100-20    2000   4000   100   20     18000   140230.71   139019.64   0.86

In Table 6-2, we summarize the characteristics of (LB). Columns Bin and Cont contain the numbers of binary and continuous variables in (LB), while columns Bil and Card give the numbers of bilinear and cardinality constraints, respectively. Column NZ shows the number of nonzero elements in the formulation. Column MILP presents the average of the IP optimal values over the 10 instances corresponding to each parameter setting. Column LP gives the average optimal value of the LP relaxations of (LB). The gap between these two values is presented in Column Gap, where

Gap(%) = (MILP − LP)/MILP × 100.

This measure describes how close the LP relaxation is to the optimal solution. The optimal values for each individual instance are presented in Table 6-3.
6.3.3 Separation Procedures

As a preliminary test, we implement a cut-and-branch algorithm that adds cuts only at the root node. Among the three families of lifted inequalities developed in Chapter 5, only the lifted bilinear cover inequalities (5–47) are used as cuts. We now describe separation procedures for the lifted bilinear cover inequalities (5–47). We first observe that, in the bilinear cover inequalities

∑_{j∈C} (aj − µ)+ xj + ∑_{j∈T} aj yj + ∑_{i∈M} (αi xi + βi yi) ≥ ∑_{j∈C} (aj − µ)+,   (6–34)
Table 6-3. Objective values for the test instances
Instance ID MILP LP Gap(%)
I-20-10-0 12110.9195 11640.4189 3.8849
I-20-10-1 10727.5900 10434.0135 2.7366
I-20-10-2 11443.7508 11040.3807 3.5248
I-20-10-3 11714.3205 11471.0029 2.0771
I-20-10-4 12987.9883 12370.2148 4.7565
I-20-10-5 15076.8320 14610.3839 3.0938
I-20-10-6 10039.8013 9613.3069 4.2480
I-20-10-7 11516.0063 11114.4414 3.4870
I-20-10-8 15278.2189 14510.5942 5.0243
I-20-10-9 11529.0013 10973.6524 4.8170
I-50-15-0 49114.5050 48345.9494 1.5648
I-50-15-1 43482.4190 42981.5691 1.1518
I-50-15-2 43201.3765 42661.1454 1.2505
I-50-15-3 44608.6242 44041.2446 1.2719
I-50-15-4 47805.6008 47088.4489 1.5001
I-50-15-5 46348.5055 45715.3331 1.3661
I-50-15-6 46776.2481 45870.5232 1.9363
I-50-15-7 50821.3106 50206.1345 1.2105
I-50-15-8 49815.5893 49087.5678 1.4614
I-50-15-9 46261.8475 45884.6801 0.8153
I-100-20-0 137653.9675 136534.7575 0.8131
I-100-20-1 134121.9302 133153.2319 0.7223
I-100-20-2 139677.9045 138565.7847 0.7962
I-100-20-3 130580.6226 129397.9127 0.9057
I-100-20-4 136384.8079 135383.4821 0.7342
I-100-20-5 144012.8562 142557.8922 1.0103
I-100-20-6 138801.7630 137594.3403 0.8699
I-100-20-7 148099.1275 146786.2270 0.8865
I-100-20-8 148704.6629 147200.3903 1.0116
I-100-20-9 144269.4161 143022.4053 0.8644
the coefficients (αi, βi) depend strongly on the partition (C, M, T). Therefore, it is difficult to express the problem of finding a most violated lifted bilinear cover inequality as a simple optimization problem.

We observe however that separating the inequality

∑_{j∈C} (aj − µ)+ xj + ∑_{j∈T} aj yj ≥ ∑_{j∈C} (aj − µ)+,   (6–35)

which can be lifted to (6–34), is easier. In particular, we can write the separation problem for (6–35) as

min  ∑_{j∈N} (aj − µ)+ (x∗j − 1) ξj + ∑_{j∈N} aj y∗j ζj
s.t. ∑_{j∈N} aj ξj = d + µ,                        (6–36)
     ai ≥ µ ξi + ε,                  ∀i ∈ N,       (6–37)
     ∑_{j∈N} aj ζj ≥ ai ξi − µ + ε,  ∀i ∈ N,
     ξj + ζj ≤ 1,                    ∀j ∈ N,
     ξj, ζj ∈ {0, 1},                ∀j ∈ N,

where ε > 0 is a sufficiently small constant and the variables ξj and ζj are defined as

ξj = 1 if j ∈ C and 0 otherwise,   ζj = 1 if j ∈ T and 0 otherwise.

This optimization problem is similar to a 0−1 knapsack problem. Although the 0−1 knapsack problem is NP-hard, it can often be solved efficiently with dynamic programming (DP) techniques. However, for separation purposes, using a DP is usually out of the question since the data in MILPs is often fractional and leads to large DP tables. Therefore, we solve this separation problem using a heuristic approach, as is common in MIP.
Here, we adapt the coefficient-independent cover generation scheme proposed by Gu et al. [60], since it is known to be computationally efficient in the practical separation of 0−1 cover inequalities. The basic idea is to first select the variables with the lowest LP values for inclusion in the cover C. Assume that a fractional solution x∗ to the LP relaxation satisfies x∗_{l1} ≤ . . . ≤ x∗_{ln}. Then, a violated cover C is obtained as C := {l1, . . . , lc} where c = argmin{k | ∑_{j=1}^{k} a_{lj} > d}.
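The cover selection step can be sketched as follows (Python; variable names and tie-breaking details are our assumptions, not the thesis's exact implementation):

```python
def generate_cover(a, x_star, d):
    """Coefficient-independent cover heuristic in the spirit of Gu et al. [60]:
    scan variables in order of increasing LP value x*_j and add them to the
    cover until their total weight exceeds d."""
    order = sorted(range(len(a)), key=lambda j: x_star[j])
    cover, weight = [], 0.0
    for j in order:
        cover.append(j)
        weight += a[j]
        if weight > d:          # c = argmin{k | sum of first k weights > d}
            return set(cover)
    return None                 # no cover exists (sum of all a_j <= d)

# Hypothetical data: weights a_j, fractional LP solution x*, demand d.
a = [19, 17, 15, 10]
x_star = [0.9, 0.2, 0.4, 0.1]
print(sorted(generate_cover(a, x_star, d=20)))  # [1, 3]
```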
After an initial cover C has been obtained using the heuristic described above, we check whether C satisfies Assumption (A2) in Section 5.3.2. If it does not, we swap an element of the cover with one chosen from N \ C, in order of lowest LP value, until (A2) is satisfied. We next determine the sets T and M. To this end, we compute all the possible coefficients (αi, βi) of a variable pair (xi, yi) using the results of Theorem 5.1. When there are several choices for (αi, βi), we select the values (α∗i, β∗i) that lead to the largest violation for the current LP solution, i.e.,

α∗i x∗i + β∗i y∗i = min { αi x∗i + βi y∗i :
    (αi, βi) ∈ {(PC(ai), 0)} ∪ ⋃_{j=1}^{qi} { ( PC(Qi_j) − ((PC(Qi_{j+1}) − PC(Qi_j))/a_{j+1}) Qi_j ,  ((PC(Qi_{j+1}) − PC(Qi_j))/a_{j+1}) ai ) } }.

Then, we partition N \ C into (M, T) using a simple comparison rule similar to that used in Gu et al. [61], i.e.,

T = { i ∈ N \ C | ai y∗i ≤ α∗i x∗i + β∗i y∗i }.

After obtaining an initial partition (C, M, T) with this procedure, we check whether (C, M, T) satisfies Assumption (A3) in Section 5.3.2. If not, we select an element from the set M and move it to the set T until (A3) is satisfied.
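Given the selected coefficients (α∗i, β∗i), the initial split of N \ C can be sketched as follows (Python; the data is hypothetical, and the (A3) repair loop described above is omitted):

```python
def partition_rest(rest, a, x_star, y_star, alpha, beta):
    """Split N \\ C into (M, T) with the comparison rule
    T = { i : a_i * y*_i <= alpha*_i * x*_i + beta*_i * y*_i }."""
    T = {i for i in rest
         if a[i] * y_star[i] <= alpha[i] * x_star[i] + beta[i] * y_star[i]}
    return rest - T, T          # (M, T)

# Hypothetical data for three variables outside the cover.
rest = {0, 1, 2}
a = {0: 19, 1: 15, 2: 10}
x_star = {0: 1.0, 1: 0.5, 2: 0.2}
y_star = {0: 0.1, 1: 1.0, 2: 0.5}
alpha = {0: 9.0, 1: 9.0, 2: 9.0}
beta = {0: 2.0, 1: 2.0, 2: 2.0}
M, T = partition_rest(rest, a, x_star, y_star, alpha, beta)
```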
6.3.4 Numerical Results

In this section, we present computational results obtained using the lifted bilinear cover inequalities (5–47) inside a cut-and-branch algorithm. We compare the performance of the lifted cuts on the three families of instances described in Table 6-1. In our implementation, a single round of cuts is added at the root node.

In Tables 6-4, 6-5, and 6-6, we present preliminary results obtained on our randomly generated instances. Columns MILP and Nodes show the optimal values and the number of nodes in the tree when solving the MILP problem using CPLEX with its default settings. Column LP shows the optimal value of the LP relaxation at the root node. Column Cuts reports the number of cuts added at the root node. Columns LPCuts and CNodes report the optimal LP value and the number of nodes in the tree after adding our cuts to the formulation. Column Gap Imp., computed as

Gap Imp.(%) = (LPCuts − LP)/(MILP − LP) × 100,

measures how much the added cuts improve the root bound, and Column Node Red., computed as

Node Red.(%) = (Nodes − CNodes)/Nodes × 100,

measures the reduction in the number of nodes in the tree after adding cuts. We observe that the lifted cuts typically help improve the bound of the LP relaxation at the root node, although the gap improvement is modest on some instances. Further, we observe that the average number of nodes in the tree is reduced when adding the lifted cuts.
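The two measures can be computed directly from the table columns. The sketch below (Python) reproduces the Gap Imp. and Node Red. entries for instance I-20-10-0 of Table 6-4 from its rounded column values:

```python
def gap_improvement(milp, lp, lp_cuts):
    """Gap Imp.(%) = (LPCuts - LP) / (MILP - LP) * 100."""
    return (lp_cuts - lp) / (milp - lp) * 100.0

def node_reduction(nodes, cnodes):
    """Node Red.(%) = (Nodes - CNodes) / Nodes * 100."""
    return (nodes - cnodes) / nodes * 100.0

# Row I-20-10-0 of Table 6-4 (values rounded as printed).
print(round(gap_improvement(12110.92, 11640.42, 11651.72), 2))  # close to the
print(round(node_reduction(430, 353), 4))                       # table entries
```

The small residual difference in the gap entry comes from using the rounded table values rather than the unrounded optimal values.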
6.4 Concluding Remarks

In this chapter, we first generalized the lifted inequalities obtained in Chapter 5 to general 0−1 bilinear covering sets that have additional linear terms. We then evaluated the practical impact of the lifted inequalities for 0−1 mixed-integer bilinear covering sets inside a cut-and-branch framework. To this end, we described heuristic separation
Table 6-4. Performance of lifted cuts on small size instances
Instance ID   MILP       Nodes   LP         Cuts   LPCuts     CNodes   Gap Imp.(%)   Node Red.(%)
I-20-10-0     12110.92   430     11640.42   12     11651.72   353      2.4012        17.9070
I-20-10-1     10727.59   0       10434.01   14     10437.80   0        1.2904        N/A
I-20-10-2     11443.75   233     11040.38   14     11055.03   91       3.6329        60.9442
I-20-10-3     11714.32   23      11471.00   11     11471.00   37       0.0000        -60.8696
I-20-10-4     12987.99   573     12370.21   12     12370.21   531      0.0000        7.3298
I-20-10-5     15076.83   576     14610.38   15     14633.53   505      4.9616        12.3264
I-20-10-6     10039.80   530     9613.31    16     9613.56    506      0.0598        4.5283
I-20-10-7     11516.01   62      11114.44   15     11132.76   41       4.5626        33.8710
I-20-10-8     15278.22   278     14510.59   13     14510.59   291      0.0000        -4.6763
I-20-10-9     11529.00   530     10973.65   12     10973.65   499      0.0000        5.8491
Average       12242.44   323.5   11777.84   13.4   11784.99   285.4    1.6908        8.5789
Table 6-5. Performance of lifted cuts on medium size instances
Instance ID   MILP       Nodes   LP         Cuts   LPCuts     CNodes   Gap Imp.(%)   Node Red.(%)
I-50-15-0     49114.51   535     48345.95   32     48345.95   527      0.0000        1.4953
I-50-15-1     43482.42   710     42981.57   31     42981.57   526      0.0000        25.9155
I-50-15-2     43201.38   535     42661.15   31     42661.15   526      0.0000        1.6822
I-50-15-3     44608.62   535     44041.24   37     44041.24   532      0.0000        0.5607
I-50-15-4     47805.60   545     47088.45   27     47088.45   531      0.0000        2.5688
I-50-15-5     46348.51   542     45715.33   30     45715.33   528      0.0000        2.5830
I-50-15-6     46776.25   535     45870.52   32     45870.52   527      0.0000        1.4953
I-50-15-7     50821.31   535     50206.13   29     50206.13   524      0.0000        2.0561
I-50-15-8     49815.59   535     49087.57   34     49087.57   529      0.0000        1.1215
I-50-15-9     46261.85   535     45884.68   29     45884.68   524      0.0000        2.0561
Average       46823.60   554.2   46188.26   31.2   46188.26   527.4    0.0000        4.1535
Table 6-6. Performance of lifted cuts on large size instances
Instance ID   MILP        Nodes   LP          Cuts   LPCuts      CNodes   Gap Imp.(%)   Node Red.(%)
I-100-20-0    137653.97   772     136534.76   58     136534.76   561      0.0000        27.3316
I-100-20-1    134121.93   568     133153.23   67     133153.23   510      0.0000        10.2113
I-100-20-2    139677.90   532     138565.78   56     138565.78   505      0.0000        5.0752
I-100-20-3    130580.62   836     129397.91   62     129397.91   499      0.0000        40.3110
I-100-20-4    136384.81   867     135383.48   66     135383.48   572      0.0000        34.0254
I-100-20-5    144012.86   455     142557.89   61     142557.89   476      0.0000        -4.6154
I-100-20-6    138801.76   855     137594.34   64     137594.34   712      0.0000        16.7251
I-100-20-7    148099.13   511     146786.23   70     146787.90   578      0.1272        -13.1115
I-100-20-8    148704.66   525     147200.39   61     147200.39   546      0.0000        -4.0000
I-100-20-9    144269.42   455     143022.41   60     143022.41   494      0.0000        -8.5714
Average       140230.71   637.6   139019.64   62.5   139019.81   545.3    0.0127        10.3381
procedures for the inequalities that we developed and presented computational results
using these procedures.
CHAPTER 7
CONCLUSIONS AND FUTURE RESEARCH
In this thesis, we study two tools to improve current convexification methods in MINLP. We develop a convexification tool that characterizes the convex hull of nonlinear sets whose convex hulls are completely determined by their orthogonal disjunctions. In particular, we apply this tool to obtain the convex hulls of various bilinear covering sets that appear as relaxations of MINLP problems. To handle the bounds on variables, we study how lifting techniques can be used to derive strong valid inequalities for 0−1 mixed-integer bilinear covering sets. Finally, we perform a computational study to show that our lifted inequalities have the potential to improve the performance of solution methods for MINLP problems. In Section 7.1, we summarize our research contributions and their practical impact in solving MINLPs. In Section 7.2, we describe possible avenues of future research.
7.1 Summary of Contributions
We summarize the major contributions of this thesis into three parts.
First, we derive a closed-form description for the convex hulls of nonlinear sets whose convex hulls are completely determined by their restrictions over orthogonal subspaces. While our convexification tool was developed using disjunctive programming and convex extensions, it differs from prior approaches in that it does not introduce auxiliary variables. We provide a toolbox of results to verify the technical assumptions under which this convexification tool can be used. We also apply this tool to derive the split cut for mixed-integer programs. We then develop a fundamental result that extends the applicability of the convexification tool to relaxing nonconvex constraints by providing sufficient conditions for establishing the convex extension property. We illustrate how this result can be used to derive the convex hull of a continuous bilinear covering set over the non-negative orthant.
Second, we study 0−1 mixed-integer bilinear covering sets where the variables have
upper bounds. We show that these sets are polyhedral and provide characterizations of
their trivial facets. We then obtain a complete linear description of the convex hull of
these sets when they are defined by only two pairs of variables. Next, we derive three
families of facet-defining inequalities via sequence-independent lifting techniques. These
families are developed using the concept of a cover which is common in the integer
programming literature and selecting different lifting orders. We then show that 0−1
mixed-integer bilinear covering sets have polyhedral structures that are similar to those of
certain single-node flow sets. As a result, we obtain new facet-defining inequalities for flow
sets that generalize classical lifted flow cover inequalities.
Third, we consider the use of the lifted inequalities we derive for 0−1 mixed-integer
bilinear covering sets in a cut-and-branch algorithm. First, we generalize our lifted
inequalities to be valid for bilinear covering sets with additional linear terms. We describe
separation procedures for lifted bilinear cover inequalities and provide computational
results on randomly generated instances.
7.2 Future Research
We conclude this thesis by presenting some potential directions for future research.
1. Extension of orthogonal disjunctions theory to other classes of problems: We applied
our convexification tool to bilinear covering sets without upper bounds on variables.
However, the applicability of our tool can be extended to more general problems
such as polynomial covering sets [117]. We could also investigate how to handle
the bounds of variables in convexification procedures using orthogonal disjunctions.
In addition, since orthogonal disjunctions are closely related to complementarity
constraints, we plan to apply our convexification tools to obtain strong convex
relaxations of mathematical programs with complementarity constraints (MPCC).
2. Application of lifting tool to general nonlinear problems: In Chapter 5, we studied
0−1 mixed-integer bilinear covering sets and derived facet-defining inequalities using
lifting. Since lifting can be applied to problems with integer variables, a possible
direction is to generalize our lifted inequalities for bilinear covering sets with 0−1
variables to bilinear covering sets with general integer variables. Further, while we
consider a single constraint in our lifting procedures, we could also consider multiple
constraints simultaneously to obtain strong valid inequalities [139]. For example, for
the mixed-integer set defined by a bilinear equality constraint, i.e.,
BM =
{(x, y) ∈ {0, 1}n × [0, 1]n
∣∣∣n∑
j=1
ajxjyj = d
},
we could combine the lifting results of Chapter 5 with those of [100].
3. Computational study of lifted cuts in branch-and-cut framework: We performed a
preliminary computational study on randomly generated instances in Chapter 6.
A natural extension of our work is to evaluate the performance of other families of
lifted inequalities and to perform an extensive empirical study of the use of lifted
inequalities on real instances. Further, since our lifted inequalities are generalizations
of classical lifted flow cover inequalities for single-node flow sets, we could also
evaluate whether our cuts can help improve the performance of the branch-and-cut
algorithm on MIPLIB instances [2].
APPENDIX A
LINEAR DESCRIPTION OF THE CONVEX HULL OF A BILINEAR SET

The linear description of conv(B) is obtained by PORTA as the following:

B = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19x1y1 + 17x2y2 + 15x3y3 + 10x4y4 ≥ 20 }
(1) 50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135
(2) 70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117
(3) 25x1 +65x3 +45x4 +76y1 +153y2 ≥ 110
(4) +50x2 +70x3 +35x4 +133y1 +34y2 ≥ 105
(5) +25x2 +45x3 +35x4 +133y1 +34y2 ≥ 80
(6) 21x1 +41x2 +27x4 +38y1 +135y3 ≥ 68
(7) 30x1 +35x2 +21x3 +19y1 +70y4 ≥ 56
(8) 18x1 +23x2 +21x3 +19y1 +70y4 ≥ 44
(9) 19x1 +17x2 +15y3 +10y4 ≥ 20
(10) 19x1 +15x3 +17y2 +10y4 ≥ 20
(11) 19x1 +10x4 +17y2 +15y3 ≥ 20
(12) 19x1 +17y2 +15y3 +10y4 ≥ 20
(13) +17x2 +15x3 +19y1 +10y4 ≥ 20
(14) +17x2 +10x4 +19y1 +15y3 ≥ 20
(15) +17x2 +19y1 +15y3 +10y4 ≥ 20
(16) +15x3 +10x4 +19y1 +17y2 ≥ 20
(17) +15x3 +19y1 +17y2 +10y4 ≥ 20
(18) +10x4 +19y1 +17y2 +15y3 ≥ 20
(19) +19y1 +17y2 +15y3 +10y4 ≥ 20
(20) 14x1 +10x3 +5x4 +17y2 ≥ 15
(21) +12x2 +10x3 +5x4 +19y1 ≥ 15
(22) +10x3 +5x4 +19y1 +17y2 ≥ 15
(23) 12x1 +10x2 +3x4 +15y3 ≥ 13
(24) +10x2 +10x3 +3x4 +19y1 ≥ 13
(25) +10x2 +3x4 +19y1 +15y3 ≥ 13
(26) 10x1 +10x2 +x4 +15y3 ≥ 11
(27) 10x1 +10x3 +x4 +17y2 ≥ 11
(28) 10x1 +x4 +17y2 +15y3 ≥ 11
(29) 7x1 +5x2 +3x3 +10y4 ≥ 8
(30) +5x2 +3x3 +19y1 +10y4 ≥ 8
(31) +5x2 +3x3 +5x4 +19y1 ≥ 8
(32) +3x2 +3x3 +3x4 +19y1 ≥ 6
(33) 5x1 +x3 +17y2 +10y4 ≥ 6
(34) 5x1 +5x2 +x3 +10y4 ≥ 6
(35) 5x1 +x3 +5x4 +17y2 ≥ 6
(36) 3x1 +x2 +15y3 +10y4 ≥ 4
(37) 3x1 +x2 +3x3 +10y4 ≥ 4
(38) 3x1 +x2 +3x4 +15y3 ≥ 4
(39) x1 +x2 +x3 +10y4 ≥ 2
(40) x1 +x2 +x4 +15y3 ≥ 2
(41) x1 +x3 +x4 +17y2 ≥ 2
(42) x1 +x2 +x3 +x4 ≥ 2
(43) x1 ≥ 0
(44) x2 ≥ 0
(45) x3 ≥ 0
(46) x4 ≥ 0
(47) y1 ≥ 0
(48) y2 ≥ 0
(49) y3 ≥ 0
(50) y4 ≥ 0
(51) y4 ≤ 1
(52) y3 ≤ 1
(53) y2 ≤ 1
(54) y1 ≤ 1
(55) x4 ≤ 1
(56) x3 ≤ 1
(57) x2 ≤ 1
(58) x1 ≤ 1
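Each inequality in this list can be spot-checked for validity by sampling feasible points of B. The sketch below (plain Python) does this for inequality (42), x1 + x2 + x3 + x4 ≥ 2, which is valid because no single term aj xj yj can reach the right-hand side 20 on its own.

```python
import itertools
import random

# Spot-check validity of inequality (42): x1 + x2 + x3 + x4 >= 2 on B.
a = [19, 17, 15, 10]
rng = random.Random(0)

def feasible(x, y):
    return sum(ai * xi * yi for ai, xi, yi in zip(a, x, y)) >= 20

count = 0
for x in itertools.product((0, 1), repeat=4):
    for _ in range(200):
        y = [rng.random() for _ in range(4)]
        if feasible(x, y):
            assert sum(x) >= 2        # inequality (42) holds at every sample
            count += 1
print(f"checked {count} feasible points")
```

Sampling only certifies that no counterexample was found; PORTA's output is, by construction, exactly valid.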
APPENDIX B
LINEAR DESCRIPTION OF THE CONVEX HULL OF A FLOW SET

The linear description of conv(F) is obtained by PORTA as the following:

F = { (x, y) ∈ {0, 1}^4 × [0, 1]^4 | 19y1 + 17y2 + 15y3 + 10y4 ≥ 20,  xj ≥ yj ∀j = 1, . . . , 4 }
(1) 50x1 +90x3 +45x4 +76y1 +153y2 ≥ 135
(2) 70x1 +90x2 +27x4 +38y1 +135y3 ≥ 117
(3) 25x1 +65x3 +45x4 +76y1 +153y2 ≥ 110
(4) +50x2 +70x3 +35x4 +133y1 +34y2 ≥ 105
(5) +25x2 +45x3 +35x4 +133y1 +34y2 ≥ 80
(6) 21x1 +41x2 +27x4 +38y1 +135y3 ≥ 68
(7) 30x1 +35x2 +21x3 +19y1 +70y4 ≥ 56
(8) 18x1 +23x2 +21x3 +19y1 +70y4 ≥ 44
(19) +19y1 +17y2 +15y3 +10y4 ≥ 20
(22) +10x3 +5x4 +19y1 +17y2 ≥ 15
(24) +10x2 +10x3 +3x4 +19y1 ≥ 13
(25) +10x2 +3x4 +19y1 +15y3 ≥ 13
(26) 10x1 +10x2 +x4 +15y3 ≥ 11
(27) 10x1 +10x3 +x4 +17y2 ≥ 11
(28) 10x1 +x4 +17y2 +15y3 ≥ 11
(30) +5x2 +3x3 +19y1 +10y4 ≥ 8
(31) +5x2 +3x3 +5x4 +19y1 ≥ 8
(32) +3x2 +3x3 +3x4 +19y1 ≥ 6
(33) 5x1 +x3 +17y2 +10y4 ≥ 6
(34) 5x1 +5x2 +x3 +10y4 ≥ 6
(35) 5x1 +x3 +5x4 +17y2 ≥ 6
(36) 3x1 +x2 +15y3 +10y4 ≥ 4
(37) 3x1 +x2 +3x3 +10y4 ≥ 4
(38) 3x1 +x2 +3x4 +15y3 ≥ 4
(39) x1 +x2 +x3 +10y4 ≥ 2
(40) x1 +x2 +x4 +15y3 ≥ 2
(41) x1 +x3 +x4 +17y2 ≥ 2
(42) x1 +x2 +x3 +x4 ≥ 2
(47) y1 ≥ 0
(48) y2 ≥ 0
(49) y3 ≥ 0
(50) y4 ≥ 0
(55) x4 ≤ 1
(56) x3 ≤ 1
(57) x2 ≤ 1
(58) x1 ≤ 1
(f1) x1 −y1 ≥ 0
(f2) x2 −y2 ≥ 0
(f3) x3 −y3 ≥ 0
(f4) x4 −y4 ≥ 0
REFERENCES

[1] Achterberg, T., T. Koch, A. Martin. 2005. Branching rules revisited. Operations Research Letters 33 42–54.

[2] Achterberg, Tobias, Thorsten Koch, Alexander Martin. 2006. MIPLIB 2003. Operations Research Letters 34 361–372.

[3] Adjiman, C. S., I. P. Androulakis, C. A. Floudas. 1998. A global optimization method, αBB, for general twice-differentiable constrained NLPs–II. Implementation and computational results. Computers and Chemical Engineering 22 1159–1179.

[4] Al-Khayyal, F. A., J. E. Falk. 1983. Jointly constrained biconvex programming. Mathematics of Operations Research 8 273–286.

[5] Androulakis, I. P., C. D. Maranas, C. A. Floudas. 1995. αBB: A global optimization method for general constrained nonconvex problems. Journal of Global Optimization 7 337–363.

[6] Atamturk, A. 2001. Flow pack facets of the single node fixed-charge flow polytope. Operations Research Letters 29 107–114.

[7] Atamturk, A. 2003. On the facets of the mixed-integer knapsack polyhedron. Mathematical Programming 98 145–175.

[8] Atamturk, A. 2004. Sequence independent lifting for mixed-integer programming. Operations Research 52 487–490.

[9] Atamturk, A. 2006. Strong formulations of robust mixed 0−1 programming. Mathematical Programming 108 235–250.

[10] Atamturk, A., V. Narayanan. 2007. Lifting for conic mixed-integer programming. Research Report BCOL.07.04, IEOR, University of California-Berkeley. Forthcoming in Mathematical Programming.

[11] Atamturk, A., D. Rajan. 2002. On splittable and unsplittable capacitated network design arc-set polyhedra. Mathematical Programming 92 315–333.

[12] Balas, E. 1971. Intersection cuts - a new type of cutting planes for integer programming. Operations Research 19 19–39.

[13] Balas, E. 1975. Disjunctive programming: Cutting planes from logical conditions. O. L. Mangasarian, R. R. Meyer, S. M. Robinson, eds., Nonlinear Programming. Academic Press, NY, 279–312.

[14] Balas, E. 1975. Facets of the knapsack polytope. Mathematical Programming 8 146–164.

[15] Balas, E. 1979. Disjunctive programming. Annals of Discrete Mathematics 5 3–51.

[16] Balas, E. 1985. Disjunctive programming and a hierarchy of relaxations for discrete optimization problems. SIAM Journal on Algebraic and Discrete Methods 6 466–486.

[17] Balas, E. 1998. Disjunctive programming: Properties of the convex hull of feasible points. Discrete Applied Mathematics 89(1-3) 3–44. Original manuscript was published as a technical report in 1974.

[18] Balas, E. 2005. Projection, lifting and extended formulation in integer and combinatorial optimization. Annals of Operations Research 140 125–161.

[19] Balas, E., A. Bockmayr, N. Pisaruk, L. Wolsey. 2004. On unions and dominants of polytopes. Mathematical Programming 99 223–239.

[20] Balas, E., S. Ceria, G. Cornuejols. 1993. A lift-and-project cutting plane algorithm for mixed 0−1 programs. Mathematical Programming 58 295–324.

[21] Balas, E., S. Ceria, G. Cornuejols. 1996. Mixed 0−1 programming by lift-and-project in a branch-and-cut framework. Management Science 42 1229–1246.

[22] Balas, E., M. Perregaard. 2003. A precise correspondence between lift-and-project cuts, simple disjunctive cuts, and mixed-integer gomory cuts for 0−1 programming. Mathematical Programming 94 221–245.

[23] Balas, E., E. Zemel. 1978. Facets of the knapsack polytope from minimal covers. SIAM Journal on Applied Mathematics 34 119–148.

[24] Bazaraa, M. S., H. D. Sherali, C. M. Shetty. 2006. Nonlinear Programming: Theory and Algorithms. 3rd ed. John Wiley & Sons, New York, NY.

[25] Belotti, P. 2009. Design of telecommunication networks with shared protection. Available at http://www.minlp.org/library/problem/index.php?i=51.

[26] Belotti, P., J. Lee, L. Liberti, F. Margot, A. Wachter. 2009. Branching and bounds tightening techniques for non-convex MINLP. Optimization Methods and Software 24 597–634. Available at http://www.optimization-online.org/DB_HTML/2008/08/2059.html.

[27] Biegler, L. T., I. E. Grossmann, A. W. Westerberg. 1997. Systematic Methods of Chemical Process Design. Prentice Hall, Upper Saddle River (NJ).

[28] Bland, R. G. 1977. New finite pivoting rule for the simplex method. Mathematics of Operations Research 2 103–107.

[29] Boyd, S., L. Vandenberghe. 2004. Convex Optimization. Cambridge University Press, Cambridge.

[30] Ceria, S., G. Cordier, H. Marchand, L. A. Wolsey. 1999. Cutting planes for integer programs with general integer variables. Mathematical Programming 81 201–214.

[31] Ceria, S., J. Soares. 1999. Convex programming for disjunctive convex optimization. Mathematical Programming 86A 595–614.

[32] Christof, T., A. Lobel. 1997. PORTA: POlyhedron Representation Transformation Algorithm. Available at http://www.zib.de/Optimization/Software/Porta/.

[33] Chung, K., J.-P. P. Richard, M. Tawarmalani. 2010. A computational study for lifted inequalities for 0−1 mixed-integer bilinear covering sets. Working paper.

[34] Chung, K., J.-P. P. Richard, M. Tawarmalani. 2010. Lifted inequalities for 0−1 mixed-integer bilinear covering sets. Working paper.

[35] Cook, S. A. 1971. The complexity of theorem-proving procedures. Proceedings of the Third Annual ACM Symposium on the Theory of Computing. ACM, New York, 151–158.

[36] Cook, W., R. Kannan, A. Schrijver. 1990. Chvatal closures for mixed integer programming problems. Mathematical Programming 47 155–174.

[37] Cornuejols, G. 2008. Valid inequalities for mixed integer linear programs. Mathematical Programming 112 3–44.

[38] Cornuejols, G., C. Lemarechal. 2006. A convex-analysis perspective on disjunctive cuts. Mathematical Programming 106 567–586.

[39] Cornuejols, G., R. Tutuncu. 2006. Optimization Methods in Finance. Cambridge University Press.

[40] CPLEX. 2007. CPLEX 11.1 User's Manual. ILOG Inc., Mountain View, CA.

[41] Crama, Y. 1993. Concave extensions for nonlinear 0−1 maximization problems. Mathematical Programming 61 53–60.

[42] Crowder, H. P., E. L. Johnson, M. W. Padberg. 1983. Solving large-scale zero-one linear programming problems. Operations Research 31 803–834.

[43] Dakin, R. J. 1965. A tree search algorithm for mixed-integer programming problems. Computer Journal 8 250–255.

[44] Dantzig, G. B. 1951. Maximization of a linear function of variables subject to linear inequalities. T.C. Koopmans, ed., Activity Analysis of Production and Allocation. Wiley N.Y., 339–347.

[45] Dantzig, G. B., R. Fulkerson, S. Johnson. 1954. Solution of a large-scale traveling salesman problem. Operations Research 2 393–410.

[46] de Farias, I. R., E. L. Johnson, G. L. Nemhauser. 2002. Facets of the complementarity knapsack polytope. Mathematics of Operations Research 27 210–226.

[47] Driebeek, N. J. 1965. An algorithm for the solution of mixed-integer programming problems. Management Science 12 576–587.

[48] Duran, M. A., I. E. Grossmann. 1986. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Mathematical Programming 36 307–339.

[49] Falk, J. E. 1969. Lagrange multipliers and nonconvex programs. SIAM Journal on Control 7 534–545.

[50] Falk, J. E., K. L. Hoffman. 1976. A successive underestimation method for concave minimization problems. Mathematics of Operations Research 1 251–259.

[51] Falk, J. E., R. M. Soland. 1969. An algorithm for separable nonconvex programming problems. Management Science 15 550–569.

[52] Fiacco, A. V., G. P. McCormick. 1990. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. Society for Industrial and Applied Mathematics. First published in 1968 by Research Analysis Corporation.

[53] Floudas, C. A. 2001. Global optimization in design and control of chemical process systems. Journal of Process Control 10 125–134.

[54] Fourier, J. B. J. 1826. Solution d'une question particuliere du calcul des inegalites. Nouveau Bulletin des Sciences par la Societe Philomatique de Paris 317–319.

[55] Fukuda, K., T. M. Liebling, C. Lutolf. 2001. Extended convex hull. Computational Geometry 20 13–23.

[56] Garey, M. R., D. S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman.

[57] Gomory, R. E. 1958. Outline of an algorithm for integer solutions to linear programs. Bulletin of the American Mathematical Society 64 275–278.

[58] Gomory, R. E. 1969. Some polyhedra related to combinatorial problems. Linear Algebra and Its Applications 2 451–558.

[59] Grotschel, M., L. Lovasz, A. Schrijver. 1988. Geometric Algorithms and Combinatorial Optimization. Springer-Verlag, Berlin, Germany.

[60] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 1998. Lifted cover inequalities for 0−1 integer programs: Computation. INFORMS Journal on Computing 10 427–437.

[61] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 1999. Lifted flow cover inequalities for mixed 0−1 integer programs. Mathematical Programming 85 439–467.
[62] Gu, Z., G. L. Nemhauser, M. W. P. Savelsbergh. 2000. Sequence independent liftingin mixed integer programming. Journal of Combinatorial Optimization 4 109–129.
205
[63] Hammer, P. L., E. L. Johnson, U. N. Peled. 1975. Facets of regular 0−1 polytopes.Mathematical Programming 8 179–206.
[64] Harjunkoski, I., T. Westerlund, R. Porn, H. Skrifvars. 1998. Differenttransformations for solving non-convex trim-loss problems by MINLP. EuropeanJournal of Operational Research 105 594–603.
[65] Hoffman, K. 1981. A method for globally minimizing concave functions over convexsets. Mathematical Programming 20 22–32.
[66] Horst, R. 1976. An algorithm for nonconvex programming problems. MathematicalProgramming 10 312–321.
[67] Horst, R., P. M. Pardalos. 1995. Handbook of Global Optimization. Kluwer AcademicPublishers.
[68] Horst, R., N. V. Thoai, H. Tuy. 1989. On an outer-approximation in globaloptimization. Optimization 20 255–264.
[69] Horst, R., H. Tuy. 1996. Global Optimization: Deterministic Approaches . Third ed.Springer Verlag, Berlin.
[70] Kallrath, J. 2005. Solving planning and design problems in the process industryusing mixed integer and global optimization. Annals of Operations Research 140339–373.
[71] Kan, A. H. G. Rinnooy, G. T. Timmer. 1987. Stochastic global optimizationmethods I: Clustering methods. Mathematical Programming 39 27–56.
[72] Kantorovich, L. V. 1960. Mathematical methods in the organization and planning ofproduction. Management Science 6 366–422.
[73] Karmarkar, N. 1984. A new polynomial time algorithm for linear programming.Combinatorica 4 375–395.
[74] Khachiyan, L. G. 1979. A polynomial time algorithm for linear programming. SovietMathematics Doklady 20 191–194.
[75] Land, A. H., A. G. Doig. 1960. An automatic method for solving discreteprogramming problems. Econometrica 28 497–520.
[76] Lenstra, H. W. 1983. Integer programming with a fixed number of variables.Mathematics of Operations Research 8 538–548.
[77] Liberti, L., C. Lavor, N. Maculan. 2008. A branch-and-bound algorithm for themolecular distance geometry problem. International Transactions in OperationalResearch 15 1–17.
206
[78] Liberti, L., C. Lavor, N. Maculan, M-A. C. Nascimento. 2009. Reformulation inmathematical programming: an application to quantum chemisty. Discrete ApppliedMathematics 6 1309–1318.
[79] Linderoth, J. T., M. W. P. Savelsbergh. 1999. A computational study of strategiesfor mixed integer programming. INFORMS Journal on Computing 11 173–187.
[80] LINDO Systems Inc. 2008. LINGO 11.0 optimization modeling software for linear,nonlinear, and integer programming. Available at http://www.lindo.com.
[81] Louveaux, Q., L. A. Wolsey. 2007. Lifting, superadditivity, mixed integer roundingand single node flow sets revisited. Annals of Operations Research 153 47–77.
[82] Lovasz, L., A. Schrijver. 1991. Cones of matrices and set functions and 0−1optimization. SIAM Journal on Optimization 1 166–190.
[83] Marchand, H., L. A. Wolsey. 1999. The 0−1 knapsack problem with a singlecontinuous variable. Mathematical Programming 85 15–33.
[84] Martin, A. 2001. General mixed integer programming: Computational issuesfor branch-and-cut algorithms. D. Naddef, M. Juenger, eds., ComputationalCombinatorial Optimization. Springer.
[85] McCormick, G. P. 1976. Computability of global solutions to factorable nonconvexprograms: Part I - convex underestimating problems. Mathematical Programming 10147–175.
[86] McCormick, G. P. 1983. Nonlinear Programming: Theory, Algorithms, and Applica-tions . John Wiley and Sons.
[87] Meyer, R. R. 1974. On the existence of optimal solutions to integer andmixed-integer programming problems. Mathematical Programming 7 223–235.
[88] Minkowski, H. 1896. Geometric der zahlen. Working paper.
[89] Mitsos, A., B. Chachuat, P. I. Barton. 2009. McCormick-based relaxations ofalgorithms. SIAM Journal on Optimization 20 573–601.
[90] Mittelmann, H. 2010. Benchmarks for optimization software. Available at http://plato.asu.edu/bench.html.
[91] Nemhauser, G. L., L. A. Wolsey. 1988. Integer and Combinatorial Optimization.Wiley Interscience, New York.
[92] Nesterov, Y., A. Nemirovskii. 1994. Interior-Point Polynomial Algorithms in ConvexProgramming . SIAM.
[93] Neumaier, A. 1997. Molecular modeling of proteins and mathematical prediction ofprotein structure. SIAM Review 39 407–460.
207
[94] Neumaier, A. 2004. Complete search in continuous global optimization andconstraint satisfaction. Acta Numerica 13 271–369.
[95] Padberg, M. W. 1973. On the facial structure of set packing polyhedra. Mathemati-cal Programming 5 199–215.
[96] Padberg, M. W. 1975. A note on zero-one programming. Operations Research 23833–837.
[97] Padberg, M. W., T. J. Van Roy, L. A. Wolsey. 1985. Valid linear inequalities forfixed charge problems. Operations Research 33 842–861.
[98] Richard, J.-P. P., I. R. de Faris, G. L. Nemhauser. 2003. Lifted inequalities for0−1 mixed integer programmming: Basic theory and algorithms. MathematicalProgramming 98 89–113.
[99] Richard, J.-P. P., I. R. de Faris, G. L. Nemhauser. 2003. Lifted inequalities for 0−1mixed integer programmming: Superlinear lifting. Mathematical Programming 98115–143.
[100] Richard, J.-P. P., M. Tawarmalani. 2010. Lifting inequalities: A framework forgenerating strong cuts for nonlinear programs. Mathematical Programming 12161–104.
[101] Rikun, A. D. 1997. A convex envelope formula for multilinear functions. Journal ofGlobal Optimization 10 425–437.
[102] Rockafellar, R. T. 1970. Convex Analysis . Princeton University Press.
[103] Ryoo, H. S., N. V. Sahinidis. 1996. A branch-and-reduce approach to globaloptimization. Journal of Global Optimization 8 107–139.
[104] Ryoo, H. S., N. V. Sahinidis. 2001. Analysis of bounds for multilinear functions.Journal of Global Optimization 19 403–424.
[105] Sahinidis, N. V., M. Tawarmalani. 2005. BARON . The Optimization Firm, LLC,Urbana-Champaign, IL. Available at http://www.gams.com/dd/docs/solvers/baron.pdf.
[106] Savelsbergh, M. W. P. 1994. Preprocessing and probing for mixed integerprogramming problems. ORSA Journal on Computing 6 445–454.
[107] Sawaya, N. W., I. E. Grossmann. 2005. A cutting plane method for solving lineargeneralized disjunctive programming problems. Computers and Chemical Engineer-ing 29 1891–1913.
[108] Schrijver, A. 1986. Theory of Linear and Integer Programming . John Wiley & Sons,Chichester.
208
[109] Shectman, J. P., N. V. Sahinidis. 1998. A finite algorithm for global minimization ofseparable concave programs. Journal of Global Optimization 12 1–36.
[110] Sherali, H. D., W. P. Adams. 1990. A hierarchy of relaxations between thecontinuous and convex hull representations for zero-one programming problems.SIAM Journal on Discrete Mathematics 3 411–430.
[111] Sherali, H. D., S. Sen. 1985. Cuts from combinatorial disjunctions. OperationsResearch 33 928–933.
[112] Shor, N. Z. 1977. Cut-off method with space extension in convex programmingproblems. Cybernetics 13 94–96.
[113] Stubbs, R., S. Mehrotra. 1999. A branch-and-cut method for 0−1 mixed convexprogramming. Mathematical Programming 86 515–532.
[114] Tawarmalani, M., , S. Ahmed, N. V. Sahinidis. 2002. Global optimization of 0−1hyperbolic programs. Journal of Global Optimization 24 385–417.
[115] Tawarmalani, M., , S. Ahmed, N. V. Sahinidis. 2002. Product disaggregation andrelaxations of mixed-integer rational programs. Journal of Global Optimization 24385–417.
[116] Tawarmalani, M. 2001. Mixed integer nonlinear programs: Theory, algorithms, andapplications. Ph.D. thesis, University of Illinois, Urbana-Champaign, IL.
[117] Tawarmalani, M., J.-P. P. Richard, K. Chung. 2008. Strong Valid Inequalities forOrthogonal Disjunctions and Polynomial Covering Sets. Technical Report, KrannertSchool of Management, Purdue University.
[118] Tawarmalani, M., J.-P. P. Richard, K. Chung. 2010. Strong valid inequalities fororthogonal disjunctions and bilinear covering sets. Mathematical ProgrammingForthcoming.
[119] Tawarmalani, M., N. V. Sahinidis. 2001. Semidefinite relaxations of fractionalprograms via novel techniques for constructing convex envelopes of nonlinearfunctions. Journal of Global Optimization 2001 137–158.
[120] Tawarmalani, M., N. V. Sahinidis. 2002. Convex extensions and envelopes of lowersemi-continuous functions. Mathematical Programming 93 247–263.
[121] Tawarmalani, M., N. V. Sahinidis. 2002. Convexification and Global Optimizationin Continuous and Mixed-Integer Nonlinear Programming: Theory, Algorithms,Software, and Applications . Kluwer, Dordrecht, The Netherlands.
[122] Tawarmalani, M., N. V. Sahinidis. 2004. Global optimization of mixed-integernonlinear programs: A theoretical and computational study. Mathematical Program-ming 99 563–591.
209
[123] Tawarmalani, M., N. V. Sahinidis. 2005. A polyhedral branch-and-cut approach toglobal optimization. Mathematical Programming 103 225–249.
[124] Tuy, H. 1985. A concave programming under linear constraints. Doklady AkademicNauk 159 32–35.
[125] Tuy, H. 1987. Global optimization of a difference of two convex functions. Mathe-matical Programming Study 30 150–182.
[126] Tuy, H., T. V. Thieu, N. Q. Thai. 1985. A conical algorithm for globally minimizinga concave function over a closed convex set. Mathematics of Operations Research 10498–514.
[127] Van Roy, T. J., L. A. Wolsey. 1986. Valid inequalities for mixed 0−1 programs.Discrete Applied Mathematics 14 199–213.
[128] Vandenbussche, D., G. L. Nemhauser. 2005. A polyhedral study of nonconvexquadratic programs with box constraints. Mathematical Programming 102 531–557.
[129] Visweswaran, V., C. A. Floudas. 1993. New properties and computationalimprovement of the GOP algorithm for problems with quadratic objective functionsand constraints. Journal of Global Optimization 3 439–462.
[130] Wolsey, L. A. 1975. Faces for a linear inequality in 0−1 variables. MathematicalProgramming 8 165–178.
[131] Wolsey, L. A. 1976. Facets and strong valid inequalities for integer programs.Operations Research 24 362–372.
[132] Wolsey, L. A. 1977. Valid inequalities and superadditivity for 0−1 integer programs.Mathematics of Operations Research 2 66–77.
[133] Wright, M. H. 2004. The interior-point revolution in optimiation: History, recentdevelopments, and lasting consequences. Bulletin of the American MathematicalSociety 42 39–56.
[134] Wu, S.-J., P.-T. Chow. 1995. Genetic algorithms for nonlinear mixed discrete-integeroptimization problems via meta-genetic parameter optimization. EngineeringOptimization 24 137–159.
[135] You, F., I. E. Grossmann. 2009. Mixed-integer nonlinear programming models andalgorithms for supply chain design with stochastic inventory management. Availableat http://www.minlp.org/library/problem/index.php?i=30.
[136] Yudin, D. B., A. S. Nemirovski. 1976. Informational complexity and effectivemethods of solution of convex extremal problems. Economics and MathematicalMethods 12 357–369.
210
[137] Zamora, J. M., I. E. Grossmann. 1999. A branch and contract algorithm forproblems with concave univariate, bilinear and linear fractional terms. Journal ofGlobal Optimization 14 217–249.
[138] Zemel, E. 1978. Lifting the facets of zero-one polytopes. Mathematical Programming15 268–277.
[139] Zeng, B. 2007. Efficient lifting methods for unstructured mixed integer programswith multiple constraints. Ph.D. thesis, Purdue University, West Lafayette, IN.
[140] Zhang, C., H.-P. Wang. 1993. Mixed-discrete nonlinear optimization with simulatedannealing. Engineering Optimization 21 277–291.
[141] Ziegler, G. M. 1998. Lectures on Polytopes . Springer, NY.
[142] Zondervan, E. 2009. A deterministic security constrained unit commitment model.Available at http://www.minlp.org/library/problem/index.php?i=41.
211
BIOGRAPHICAL SKETCH
Kwanghun Chung was born in Iksan, Korea and spent most of his life in Daejeon,
Korea. He received both Bachelor of Science and Master of Science degrees in industrial
engineering from Seoul National University in 1997 and 1999, respectively. After working
as a software engineer in the Research and Development Center at Samsung SDS from
1999 to 2004, he joined the School of Industrial Engineering at Purdue University as a
Ph.D. student in 2004. His doctoral research focuses on developing and improving solution
methodologies for Mixed-Integer Nonlinear Programs. In 2008, he transferred to the
Department of Industrial and Systems Engineering at the University of Florida (UF)
following his advisor, Dr. Richard. After receiving a Doctor of Philosophy degree in the
area of Operations Research from UF in August 2010, he joined the Center for Operations
Research and Econometrics (CORE) at the Universite Catholique de Louvain in Belgium
as a postdoctoral fellow.