LAGRANGIAN RELAXATION APPROACHES TO CARDINALITY CONSTRAINED PORTFOLIO SELECTION
by
Dexiang Wu
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Mechanical & Industrial Engineering
University of Toronto
© Copyright 2016 by Dexiang Wu
Abstract
LAGRANGIAN RELAXATION APPROACHES TO
CARDINALITY CONSTRAINED PORTFOLIO SELECTION
Dexiang Wu
Doctor of Philosophy
Graduate Department of Mechanical & Industrial Engineering
University of Toronto
2016
Portfolio selection with a cardinality constraint is a process that selects a strict subset of
assets from a large pool. The advantage of the cardinality constraint is that holding fewer assets
reduces transaction costs and the complexity of asset management. This type of constraint
can also be used to mimic a benchmark portfolio (index) such as the S&P 500. In this dissertation
we study two different cardinality constrained portfolio selection problems, known as Index
Tracking and Financial Planning.
Index Tracking is a typical application of the cardinality constrained portfolio selection
process and has attracted much attention from portfolio managers. However, replicating un-
predictable market indices with limited resources requires advanced modelling and
optimization techniques in practice. This thesis aims to qualitatively investigate and analyze
different types of index tracking problems and the associated optimal strategies.
Firstly, we construct the tracking portfolio via a constrained clustering approach that con-
siders various practical aspects such as transaction cost, turnover, and sector limit constraints.
We show that the resulting portfolio allocation diversifies across different sectors and reduces
portfolio risk fairly well. Next we address a cardinality constrained Financial Planning problem
through Stochastic Mixed Integer Programming and extend the network-flow-structured frame-
work to the index tracking problem. Finally, we incorporate the cardinality restriction into a classical
mean-variance based tracking model and build its robust counterpart via Robust Optimization.
All of the developed models pose solvability challenges due to the rapid increase in the number
of variables and constraints when tracking real indices such as the S&P 500. We design three dual
decomposition algorithms, which allow different problem-specific heuristics to be embedded, to quickly
obtain high-quality solutions for the associated models. For example, Tabu Search is applied to
solve the scenario sub-problems to speed up the Progressive Hedging algorithm for cardinality
constrained financial planning problems. Our models are general enough to extend
to many other management applications, and the accompanying decomposition algorithms are
efficient enough to handle the cardinality constraint in these problems. The generated portfolios
illustrate the effectiveness of our selection techniques and algorithms in terms of
different performance metrics with respect to the market.
Dedication
To Tina and Mandy
Acknowledgements
This dissertation would not have been possible without the support of many remarkable people
to whom I would like to express my sincere gratitude.
First and foremost, I would like to thank my supervisor, Professor Roy H. Kwon, for his
consistent support of my Ph.D. study and related research, for his patience, inspiration, and im-
mense knowledge, and for much valuable advice that improved the quality and contribution
of my papers. His guidance helped me throughout the research for and writing of this thesis. I
could not have imagined having a better advisor and mentor for my Ph.D. study.
Besides my supervisor, I want to thank Professor Yuri Lawryshyn and Professor Timothy
Chan for their insightful comments and wonderful suggestions to improve my research from
various perspectives while serving on my supervising committee. I also want to thank Professor
Oleksandr Romanko and Professor Hani Naguib for their time and remarks as members of the
examination committee. Also, I would like to thank Professor Seong Moon Kim for taking time
out from his busy schedule to serve as my external reviewer.
I would like to thank all the members of the University of Toronto Operations Research
Group (UTORG), which provided me with many excellent opportunities to meet unique in-
dividuals from all over the world. Finally, I appreciate the financial support from the CSC that
funded parts of my studies.
Contents
1 Introduction and Thesis Outline 1
1.1 Background of Portfolio Optimization . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Research Objective and Contribution . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Modern Portfolio Theory and Index Tracking 11
2.1 Literature review for MVO and Its Extension . . . . . . . . . . . . . . . . . . . . 13
2.2 Literature review for Index Tracking . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Lagrangian Relaxation in Literature 22
3.1 Metaheuristics in Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 Literature review for LR and Its Extension . . . . . . . . . . . . . . . . . . . . . 24
4 A Constrained Clustering Approach for Index Tracking 28
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2 Literature Review for Index Tracking . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.3 Model Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.1 Basic cluster-based index tracking model . . . . . . . . . . . . . . . . . . 32
4.3.2 Model with buy-in threshold and turnover constraints . . . . . . . . . . . 34
4.3.3 Basic model with sector limits . . . . . . . . . . . . . . . . . . . . . . . . 36
4.3.4 The model with trading and sector diversification constraints . . . . . . . 37
4.3.5 Tractability of the cluster-based Models . . . . . . . . . . . . . . . . . . . 39
4.4 Lagrangian Relaxation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.5 Computational Results: Tracking the S&P500 . . . . . . . . . . . . . . . . . . . . 49
4.5.1 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.5.2 LR versus SLR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5.3 Comparison between 4 models . . . . . . . . . . . . . . . . . . . . . . . . 52
4.6 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5 Progressive Hedging for Cardinality Constrained FP 66
5.1 Introduction to Financial Planning Problem . . . . . . . . . . . . . . . . . . . . . 66
5.2 Model Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.1 Equivalent Cardinality Constrained FP Models . . . . . . . . . . . . . . . 68
5.2.2 Scenario Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.3 Lagrangian Decomposition Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3.1 LR method for scenario sub-problem . . . . . . . . . . . . . . . . . . . . . 77
5.3.2 Tabu search for scenario sub-problem . . . . . . . . . . . . . . . . . . . . 79
5.4 Progressive Hedging for FP problem . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4.1 Design a lower bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4.2 Progressive Hedging method . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4.3 Numerical experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.5 Progressive Hedging for Index Tracking problem . . . . . . . . . . . . . . . . . . 88
5.6 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6 Lagrangian Relaxation for CCCP 93
6.1 Introduction to CCCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3 Lagrangian Relaxation Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4 Robust Factor model to Index Tracking . . . . . . . . . . . . . . . . . . . . . . . 103
6.4.1 Nominal Index Tracking Model . . . . . . . . . . . . . . . . . . . . . . . 103
6.4.2 Robust Multi-Factor Model for Index Tracking . . . . . . . . . . . . . . . 107
6.5 Computational Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.5.1 Testing the Three-Factor and Single-Factor models . . . . . . . . . . . . . 112
6.5.2 Index Tracking using the S&P100 Index . . . . . . . . . . . . . . . . . . . 114
6.5.3 Index Tracking using the S&P500 Index . . . . . . . . . . . . . . . . . . . 128
6.5.4 Index Tracking using the Russell 1000 Index . . . . . . . . . . . . . . . . 131
6.5.5 Index Tracking using the Russell 3000 Index . . . . . . . . . . . . . . . . 132
6.6 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7 Conclusion and Future Research 136
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.2.1 Modelling discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.2.2 Algorithm discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Bibliography 144
A Appendix of Chapter 4 159
A.1 Numerical example for Heuristic I . . . . . . . . . . . . . . . . . . . . . . . . 159
A.2 Numerical example for Heuristic II . . . . . . . . . . . . . . . . . . . . . . . . 160
A.3 Ticker in S&P500 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
A.4 Gap by LR and SLR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
A.5 Sector Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
B Appendix of Chapter 5 166
B.1 The pseudocode for LR sub-solver . . . . . . . . . . . . . . . . . . . . . . . . 166
B.2 The pseudocode for Tabu search sub-solver . . . . . . . . . . . . . . . . . . . 168
B.3 Speeding up the solving process for sub-problems . . . . . . . . . . . . . . . . 169
C Appendix of Chapter 6 174
C.1 Parameter generation for the robust tracking model . . . . . . . . . . . . . . 174
C.2 LR gap information (S&P500) . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
List of Tables
4.1 Model test by Gurobi (q = 10) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Time comparison for updating dual in LR method . . . . . . . . . . . . . . . . . 46
4.3 Parameter Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4 Sharpe ratio for out-of-samples (2007.01 - 2008.01) . . . . . . . . . . . . . . . . . 61
4.5 Sharpe ratio for out-of-samples (2008.01 - 2009.01) . . . . . . . . . . . . . . . . . 61
4.6 Sharpe ratio for out-of-samples (2009.01 - 2010.01) . . . . . . . . . . . . . . . . . 62
4.7 Sharpe ratio for out-of-samples (2011.01 - 2011.06) . . . . . . . . . . . . . . . . . 62
5.1 Model Comparison - with and without transaction cost term . . . . . . . . . . . 71
5.2 LR method and Gurobi Comparison - instance 1 . . . . . . . . . . . . . . . . . . 79
5.3 Computational result (N=50, K=5, S=15) - instance 1 . . . . . . . . . . . . . . . 81
5.4 Computational result (N=100, K=10, S=3) - instance 2 . . . . . . . . . . . . . 81
5.5 Computational result (N=100, K=10, S=10) - instance 3 . . . . . . . . . . . . 81
5.6 Computational result in literature . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.7 Parameter setting for the model and PH algorithm . . . . . . . . . . . . . . . . . 86
5.8 Bound details under different methods for S=15 . . . . . . . . . . . . . . . . . . . 86
5.9 Bound details under different methods for S=30 . . . . . . . . . . . . . . . . . . . 87
5.10 Bound details under different methods for S=50 . . . . . . . . . . . . . . . . . . . 87
5.11 Bound details under different methods for S=75 . . . . . . . . . . . . . . . . . . . 87
5.12 Numerical result (N=100, K, S=15) . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.13 Numerical result (N=100, K, S=30) . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.14 Numerical result (N=100, K, S=50) . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.15 Numerical result (N=100, K, S=75) . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.16 Test different ratios (N=100, K, S) . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.1 R2 value for the regression models . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.2 Ticker symbol across Sectors (SP100) . . . . . . . . . . . . . . . . . . . . . . . . 115
6.3 The average TE/TC ratios under different size . . . . . . . . . . . . . . . . . . . 125
6.4 Tracking ratio comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.5 Bounds information (SP500) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.6 Bounds information (Russell 1000) . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.7 Bounds information (Russell 3000), TE=4STD . . . . . . . . . . . . . . . . . . . 133
6.8 Bounds information (Russell 3000), TE=3STD . . . . . . . . . . . . . . . . . . . 134
A.1 Ticker symbol across Sectors (SP500) . . . . . . . . . . . . . . . . . . . . . . . . 162
A.2 Gap between LB and UB, 2006-2007 . . . . . . . . . . . . . . . . . . . . . . . . . 164
B.1 LR method and Gurobi Comparison - instance 2 . . . . . . . . . . . . . . . . . . 167
B.2 LR method and Gurobi Comparison - instance 3 . . . . . . . . . . . . . . . . . . 168
B.3 LR method and Gurobi Comparison - instance 4 . . . . . . . . . . . . . . . . . . 168
B.4 LR under different iteration number . . . . . . . . . . . . . . . . . . . . . . . . . 169
B.5 Tabu search under different (L, iter number, M) . . . . . . . . . . . . . . . . . . 170
B.6 LR and Tabu comparison (N=100, K=10, S=15) . . . . . . . . . . . . . . . . . . 171
B.7 LR and Tabu comparison (N=100, K=15, S=15) . . . . . . . . . . . . . . . . . . 171
B.8 LR and Tabu comparison (N=100, K=20, S=15) . . . . . . . . . . . . . . . . . . 172
B.9 LR and Tabu comparison (N=100, K=25, S=15) . . . . . . . . . . . . . . . . . . 172
B.10 LR and Tabu comparison (N=100, K=30, S=15) . . . . . . . . . . . . . . . . . . 173
C.1 Bounds information (SP500) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
List of Figures
1.1 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Efficient frontier with and without cardinality constraint . . . . . . . . . . . . . 17
3.1 Lagrangian Decomposition Scheme for integer programs . . . . . . . . . . . . . . 27
4.1 Gap Comparison between LR and SLR . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Norm of sector differences between constructed portfolio and S&P500 . . . . . . 53
4.3 Sector diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.4 Comparison of Performance – optimal objective value . . . . . . . . . . . . . . . 56
4.5 Comparison of Performance – portfolio return . . . . . . . . . . . . . . . . . . . 57
4.6 Comparison of Performance – portfolio variance . . . . . . . . . . . . . . . . . . 58
4.7 Comparison of Performance – portfolio Sharpe ratio . . . . . . . . . . . . . . . . 60
4.8 Comparison of Performance – Tracking Ratio of out-of-sample period (2007, 2008) . . . . . 63
4.9 Comparison of Performance – Tracking Ratio of out-of-sample period (2009, 2011) . . . . . 64
5.1 Network structure with cardinality at stage 0 and 1 . . . . . . . . . . . . . . . . 68
5.2 Equivalent scenario trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3 Running time of PH method for different problems . . . . . . . . . . . . . . . . 91
6.1 Portfolio return vs TE with different q under different σ (SP100) . . . . . . . . . 105
6.2 Portfolio variance vs TE with different q under different σ (SP100) . . . . . . . . 105
6.3 Portfolio Sharpe ratio vs TE with different q under different σ (SP100) . . . . . . 106
6.4 Robust bound for expected return and variance (SP100) . . . . . . . . . . . . . . 116
6.5 Wealth evolutions for rolling out-of-samples . . . . . . . . . . . . . . . . . . . . . 117
6.6 Model comparison - portfolio return . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.7 Model comparison - portfolio variance . . . . . . . . . . . . . . . . . . . . . . . . 120
6.8 Model comparison - portfolio Sharpe ratio . . . . . . . . . . . . . . . . . . . . . 121
6.9 Model comparison - Tracking error . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.10 Tracking Error to Transaction costs ratios (SP100) . . . . . . . . . . . . . . . . . 123
6.11 TE/TC ratios with respect to the trading ratio α . . . . . . . . . . . . . . . . . . 124
6.12 Model comparison - Tracking ratio . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.13 Iteration details (SP500) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.14 Bounds and gap comparison by LR method (SP500) . . . . . . . . . . . . . . . . 130
6.15 Gurobi iteration details for different size q . . . . . . . . . . . . . . . . . . . . . 132
A.1 Portfolio allocation in sectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Chapter 1
Introduction and Thesis Outline
1.1 Background of Portfolio Optimization
Making a trade-off between the expected rate of return and variance of the rate of return for a
portfolio is at the heart of mean-variance optimization (MVO). MVO was initially established
by Markowitz in 1952 [107] and provides a foundation for single-period investment theory.
The MVO framework offered a rigorous risk management tool for investors and inspired the
subsequent Capital Asset Pricing Model (CAPM) in the 1960s [135] and the concept of the
Sharpe ratio [134] that can be used to appraise portfolio performance. According to the MVO
and the CAPM, risk-averse investors only need to determine their budget allocation to a single
fund of risky assets and the risk-free asset to achieve efficient portfolios (see the one-fund theorem
in [104]). The single master fund usually refers to specific market indices because, theoretically,
one cannot find a single fund that includes all assets in the world, and, practically, typical indices
have outperformed active investments over the long term. For example,
Zenios reported that the average return of 769 all-equity actively managed funds was 2% to 5%
lower than that of the S&P 500 index during the period 1983–1989 [151]. More recently, the Standard &
Poor's Scorecard reported that, over the 5-year and 10-year periods ending Dec. 31, 2014, more
than 88% and 82% of actively managed large-cap funds, respectively, were outperformed by the S&P 500 [1].
This evidence shows that tracking benchmark portfolios as closely as possible
is an efficient embodiment of the one-fund theorem. Consequently, exchange-traded funds (ETFs)
that replicate market indices have grown exponentially since the 1990s. The proliferation and
demand for market index ETFs such as the SPDR S&P 500 ETF reflect the demand for broad-market
investment as opposed to actively managed investments that try
to beat the markets. ETFs allow broader participation in major market indices
since it is the ETF company that is responsible for replicating an index, i.e. investing to mimic
the risk and return profile of a market index. A key strategic decision of an ETF company is
the construction of a portfolio that mimics a given benchmark market index. However, this is
not a trivial task and is often referred to as index tracking.
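In generic notation (our own for illustration, not necessarily the notation of later chapters), the mean-variance trade-off that opens this section is the problem

```latex
\min_{x \in \mathbb{R}^n} \; x^{\top} \Sigma \, x
\qquad \text{s.t.} \qquad \mu^{\top} x \ge R, \quad \mathbf{1}^{\top} x = 1,
```

where $\Sigma$ is the covariance matrix of asset returns, $\mu$ the vector of expected returns, and $R$ a target return level; sweeping $R$ traces out the efficient frontier.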
An index-based ETF attempts to reproduce the performance of a specific index by holding all
constituents of the index and trading less frequently, e.g. one or two times a week. To perfectly
mimic the target portfolio, all assets in the benchmark are held in the quantities specified by the
weightings of the benchmark portfolio. The full replication strategy inherently diversifies the
allocation across the entire benchmark index. However, full replication is not practical given
the transaction costs this would entail. For example, fully replicating the S&P 500 index would
require holding the 500 assets along with weightings for each asset. The weightings are based on
market capitalization and change constantly based on the asset prices. Constant re-balancing
of the tracking portfolio would result in a prohibitive number of transactions. Also, certain
stocks in the index with small market-cap weights must be held in full-replication portfolios,
which results in illiquidity and is especially undesirable when tracking small-cap indices. To
overcome these issues, an alternative strategy is to select a strict subset of assets from the
benchmark and match the benchmark as closely as possible; inevitably, this introduces tracking
error between the tracking portfolio and the benchmark index. In practice, cardinality
constraints that restrict the portfolio to a subset of the assets that constitute the index are crucial
for implementing this partial replication strategy.
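In generic form (a sketch with our own symbols; the thesis's concrete models appear in Chapters 4 to 6), such a cardinality constrained tracking problem couples asset weights $x$ with selection indicators $y$:

```latex
\min_{x,\,y}\ \bigl\| R\,x - r^{I} \bigr\|^{2}
\quad \text{s.t.} \quad
\mathbf{1}^{\top} x = 1, \qquad
0 \le x_i \le y_i, \qquad
\sum_{i=1}^{n} y_i \le q, \qquad
y_i \in \{0,1\},
```

where $R$ collects historical asset returns, $r^{I}$ the index returns, and $q$ the maximum number of assets held; the binary variables $y_i$ are what make the problem hard.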
A tracking portfolio with fewer assets can avoid small fractional holdings and reduce
transaction costs compared with a fund that purchases all of the stocks that make up the
index. In addition, a tracking portfolio with a cardinality constraint simplifies asset
management and reduces administrative overhead and costs.
However, several challenges need to be considered. First, it is not easy to maintain a stable and
robust tracking portfolio, as the movement of the index is unpredictable in an uncertain mar-
ket environment. Second, tracking large indices with different practical constraints usually
encounters a solvability bottleneck. This thesis aims to explore and construct different
cardinality constrained index tracking models in an optimization framework and thereby test the
one-fund theorem empirically.
It is well known that estimation errors in the parameters of portfolio selection models can
affect the optimal portfolio significantly. Many approaches have been proposed in the literature
to prevent under- and over-estimation of parameters and thus to enhance the robustness of
the solution structure. Recourse-based stochastic programming [24] is a prevalent tool for
immunizing against estimation errors. In this approach, a recourse decision is made in the
second stage to compensate for the effects of the first-stage decision, which is fixed before
the uncertainty is realized. For example, Asset Liability Management (ALM) is an investment
strategy that covers the liability over a multi-period horizon [152]. The financial planning model
is another classical topic in financial optimization that uses the network flow structure to
match anticipated deposits and liabilities under different future scenarios through multi-stage
stochastic programming [115]. Robust optimization is an alternative way to immunize against
parameter uncertainty and is particularly suitable for portfolio selection models in which risk
controls are heavily involved [13]. Robust optimization replaces the (possibly infinite)
uncertainty set with a finite worst-case representation while maintaining
the same level of complexity as the nominal problem. Moreover, the adaptive features of robust
optimization allow us to conveniently merge in other techniques such as factor models, i.e., factor-
based MVO selection [66]. Another important optimization stream for portfolio selection applies
the idea of Value at Risk (VaR) and, subsequently, Conditional Value at Risk (CVaR), in which
the greatest concern is the tail risk of an investment. CVaR selection models [126] have
received additional attention since they are convex like the MVO model. In this dissertation,
we primarily apply the stochastic programming and the robust optimization approaches to
study the issue of parameter uncertainty for different index tracking models.
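In its standard textbook statement (not a formulation specific to this thesis), the two-stage recourse program referred to above reads

```latex
\min_{x}\ c^{\top} x + \mathbb{E}_{\xi}\bigl[\, Q(x,\xi) \,\bigr],
\qquad
Q(x,\xi) = \min_{y \ge 0}\ \bigl\{\, q(\xi)^{\top} y \ :\ W y = h(\xi) - T(\xi)\, x \,\bigr\},
```

where $x$ is the first-stage (here-and-now) decision and $y$ the scenario-dependent recourse decision.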
Portfolio selection models have also been developed in Operational Research with many
different types of practical constraints, including buy-in threshold, turnover, tracking error, sec-
tor limit, cardinality, and round-lot constraints. Cardinality constraints draw special attention
from academics, and in this thesis, not only because they are key to solving index tracking prob-
lems but also because they increase the complexity of solving the problem due to the binary
requirement in the model. With the rapid development of computer science and operations
research in the last two decades, one can efficiently obtain the optimal portfolio from a large se-
lection pool through MVO-based models within a reasonably short time. Although algorithms
with polynomial iteration complexity, e.g. interior-point methods, have been available for large-scale
MVO problems since the 1990s, a key practical issue for portfolio managers is that the optimal
portfolio allocation may concentrate on a few assets, which may result in high portfolio risk, or
diversify too broadly and incur high transaction costs. Thus an additional trade-off between
portfolio size, risk, and management cost arises for investors and ETF companies. One
way to implement this trade-off is to use cardinality constraints, but obtaining the associated
solution is non-trivial.
Typical solution methods for cardinality constrained portfolio selection in the existing lit-
erature can be categorized into two main groups. The first group of methods mainly focuses
on cut generation for branch-and-bound algorithms [30, 22] or relies on heuristics designed to
satisfy cardinality constraints [10]. The second group either reformulates binary variables as
a set of conic constraints or reconstructs the cardinality constraints into a non-convex SDP,
and employs the semidefinite relaxation to approximate the non-convex programs [123, 33].
Meanwhile, software packages using branch-and-bound methods are currently available to han-
dle mixed integer conic programming, e.g. SeDuMi [140], MOSEK [113], CPLEX [42], and
GUROBI [71]. Their solutions are commonly used as benchmarks by researchers who propose new
methods. For example, in this thesis we mainly compare the solutions generated by our proposed
Lagrangian methods for different partial tracking models with those from the Gurobi mixed
integer solver.
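To illustrate the flavor of such Lagrangian methods, the following toy sketch (our own illustrative example, not an algorithm or instance from this thesis) relaxes a cardinality budget with a multiplier and runs a projected subgradient loop on the dual; the values in `v` are made up:

```python
# Illustrative toy: Lagrangian relaxation of the budget sum(x) <= K for the
# problem max sum(v_i * x_i) over binary x, via projected subgradient ascent.

def lagrangian_cardinality(v, K, iters=100, step=1.0):
    """Return (dual upper bound, feasible lower bound) for
    max sum(v_i x_i)  s.t.  sum(x_i) <= K,  x binary."""
    mu = 0.0                     # multiplier for the relaxed constraint
    best_ub = float("inf")
    for t in range(1, iters + 1):
        # Relaxed problem separates by asset: take x_i = 1 iff v_i - mu > 0.
        x = [1 if vi > mu else 0 for vi in v]
        L = sum(vi - mu for vi in v if vi > mu) + mu * K
        best_ub = min(best_ub, L)            # every dual value bounds the optimum
        g = sum(x) - K                       # subgradient of the dual function
        mu = max(0.0, mu + (step / t) * g)   # projected subgradient step
    # Simple primal heuristic: keep the K most valuable assets (feasible).
    best_lb = sum(sorted(v, reverse=True)[:K])
    return best_ub, best_lb

ub, lb = lagrangian_cardinality([5.0, 3.0, 8.0, 1.0, 6.0], K=2)
# On this small instance the dual bound closes onto the feasible value 14.0.
```

The pattern mirrors the thesis's algorithms at toy scale: the relaxation supplies a bound, and an embedded heuristic restores feasibility; in the real models the subproblem solve is itself a nontrivial optimization.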
Many companies that offer ETFs to the public are large financial institutions that
invariably use portfolio management systems, e.g. computer-based decision support, to assist in
the construction of (tracking) portfolios in modern finance [149]. In particular, optimization-based
decision support can be even more relevant for portfolio optimization where in addition to
database and statistical modules, an optimization module is present that contains mathematical
models and algorithms [17]. But a central challenge for any optimization-based decision support
is to have mathematical models that not only can track a given benchmark well, but that can
also be solved within a reasonable amount of time [138].
1.2 Research Objective and Contribution
The main objective of this thesis is to demonstrate that portfolio selection via tracking typ-
ical indices is crucial for risk management in investment science. Studying and mimicking
the indices is a key step toward obtaining an efficient portfolio. Meanwhile, with recent mathematical
and computational developments, more practical restrictions can now be incorporated, and
optimization-based portfolio selection has become more prevalent and applicable. In this disser-
tation, two types of well-known financial problems are introduced, modelled realistically, and
solved efficiently. We sketch and generalize these financial problems in terms of risk control
through advanced mathematical programming. The designed models are accompanied by de-
composition algorithms that overcome the computational challenges that have hindered previous
attempts.
The contributions of this thesis can be described from two perspectives. First, we developed
three financial models:
• A cluster-based approach for index tracking: a tracking portfolio model that includes
practical constraints controlling the portfolio size, the buy-in thresholds, the transaction
costs for re-balancing, and sector concentration.

• A two-stage cardinality constrained financial planning problem with a network flow struc-
ture. The designed portfolio model not only contains constraints that limit the size
of the portfolio, the buy-in thresholds, and the transaction costs of cash flows, but also
considers asset return uncertainty via an advanced Stochastic Programming approach.
A financial planning framework that extends to index tracking is also examined.

• A factor-based robust index tracking model that considers a three-dimensional trade-off
between portfolio return, portfolio risk (e.g. variance and tracking error), and portfolio
size. The robust factor model accounts for uncertainty in the assets' expected returns
and variances. The designed model can be captured by a general cardinality constrained
conic framework.
The three investigations above encompass several important characteristics of portfolio de-
sign, such as portfolio size, sector diversification, re-balancing and transaction costs, and consid-
eration of the uncertainties associated with future circumstances of financial markets or investors'
goals. The developed models combine different risk control tools for portfolio selection. These
realistic and sophisticated modelling techniques prove useful with respect to the
market environment in in-sample and out-of-sample analyses. To overcome the large-scale
computational difficulties associated with solving these models, we summarize
our Lagrangian decomposition strategies as follows:
• Lagrangian and Semi-Lagrangian relaxation methods that decompose the cluster-based tracking
models across different sectors. A variable neighborhood search heuristic using the LR
bound information is embedded into the LR framework to yield a near-optimal solution.

• Progressive Hedging, which decomposes the cardinality constrained financial planning
models across different scenarios. Tabu search and LR methods are designed to quickly
solve the hard sub-problems.

• A Lagrangian relaxation method that decomposes the factor-based robust index tracking
model across different variable spaces.
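As a flavor of the scenario decomposition named above, here is a minimal Progressive Hedging sketch (our own illustrative toy, not the thesis's algorithm: convex scenario subproblems with a closed-form solve and no cardinality constraint):

```python
# Toy Progressive Hedging: minimize (1/S) * sum_s (x - a_s)^2 over one shared
# decision x, by alternating scenario solves, averaging, and price updates.

def progressive_hedging(targets, rho=1.0, iters=200):
    n = len(targets)
    w = [0.0] * n                  # scenario price (dual) weights; mean stays 0
    xbar = sum(targets) / n        # consensus (implementable) solution
    for _ in range(iters):
        # Each augmented subproblem min (x - a)^2 + w*x + (rho/2)(x - xbar)^2
        # is solved exactly by its first-order condition.
        x = [(2 * a - wi + rho * xbar) / (2 + rho) for a, wi in zip(targets, w)]
        xbar = sum(x) / n                                     # restore consensus
        w = [wi + rho * (xi - xbar) for wi, xi in zip(w, x)]  # price update
    return xbar

# For this quadratic objective the consensus converges to the scenario mean.
x_star = progressive_hedging([1.0, 3.0, 8.0])
```

In the thesis's setting the scenario subproblems are cardinality constrained mixed integer programs, which is why fast sub-solvers (Tabu search, LR) replace the closed-form step used here.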
In summary, this dissertation provides solution methods for these state-of-the-art financial
problems and studies the effectiveness of the associated modelling techniques, which are relevant
to the developing field of portfolio optimization.
1.3 Thesis Outline
The rest of the thesis is organized as follows. In Chapter 2 we present a literature review of
MVO-based portfolio selection models, followed by a literature review of index tracking models
and their extensions. In Chapter 3 we briefly review different types of algorithms for cardinali-
ty constrained selection models and draw attention to Lagrangian relaxation methods in the
literature for financial problems. In Chapter 4, we consider various characteristics of a less
well-known index tracking model and design a Lagrangian-based algorithm to obtain
high-quality approximate solutions. In Chapter 5, we present a network-structured financial planning frame-
work with cardinality constraints that captures various sources of uncertainty through a mixed
integer stochastic program with recourse. In Chapter 6, factor-based robust index tracking
is generalized by the proposed cardinality constrained conic program which can be efficiently
solved via the proposed Lagrangian algorithm. Chapters 4 to 6 present in-sample and out-of-sample test results based on real financial markets, and form the backbone of the thesis. Finally, we conclude our work and discuss future research directions in Chapter 7. We
display the thesis structure in the following Figure (1.1):
[Figure 1.1 shows the thesis structure as a flow diagram: Introduction (Ch. 1) leads into portfolio optimization and its extensions (Ch. 2 & 3), which branch through practical constraints, uncertainty, and algorithms into Chapters 4, 5, and 6, before closing with the conclusion and discussion (Ch. 7).]
Figure 1.1: Thesis Structure
As shown in Figure (1.1), the structure of the thesis can be unified from three points of view. First, we construct the tracking portfolios via a predominant model and different alternatives; the goal of these investigations in Chapters 4 to 6 is to illustrate and support the effectiveness of the one-fund theorem in modern finance [104]. Secondly, we implement these index tracking approaches through cardinality constraints, which leads to NP-hard problems. Thirdly, we unify these projects methodologically via a dual decomposition framework that integrates different metaheuristics. We give a more detailed overview of each chapter as follows.
Chapter 2 - Modern Portfolio Theory and Index Tracking
We comprehensively review the history of the Mean-Variance Optimization (MVO) model and its extensions in this chapter. Many researchers have proposed modifications to the MVO framework since Harry Markowitz introduced the model in 1952. We examine these models through a literature review of current approaches to portfolio selection, and define important characteristics relevant to this thesis. In particular, we survey different index tracking problems, such as enhanced indexation, and approaches in the literature that incorporate parameter uncertainty.
Chapter 3 - Lagrangian Relaxation in Literature
We provide a history of the application of the Lagrangian approach to different management
problems, especially relative to the problems in financial optimization. We then explain the
mechanism of the dual decomposition through a simple numerical example and review major
variations of LR methods in the literature. We point out that LR methods are crucial for
solving index tracking problems not only because the metaheuristics can be easily embedded
into the dual decomposition scheme but also because the bound information can be used to quickly
generate high-quality solutions.
Chapter 4 - A Constrained Clustering Approach for Index Tracking
We consider the problem of tracking a benchmark target portfolio of financial securities, in
particular the S&P 500. Linear integer programming models are developed that seek to track
a target portfolio using a strict subset of securities from the benchmark portfolio. The mod-
els represent a clustering approach to the selection of securities and also include additional
constraints that aim to control risk and transaction costs. Lagrangian and semi-Lagrangian
methods are developed to compute solutions to the tracking models. The computational re-
sults show the effectiveness of the linear tracking models and the computational methods in
tracking the S&P 500. Overall, the models and methods presented can serve as the basis of an
optimization-based decision support model for creating tracking portfolios.
Chapter 5 - Progressive Hedging for Cardinality Constrained FP Problem
Cardinality constrained Financial Planning (FP) problems are described using a network flow
structure in this chapter. We outline how the special characteristics of this structure can be
used to fully encompass a comprehensive set of real-world portfolio elements and consider market uncertainties. The network-structured cardinality constrained Financial Planning problem is
formulated as a Stochastic Mixed Integer Program (SMIP). The proposed FP framework can
be naturally extended to an index tracking problem. We apply a dual decomposition method,
Progressive Hedging (PH), to efficiently accommodate instances with large numbers of scenar-
ios. Solving the scenario sub-problems is crucial for the proposed PH algorithm. Therefore,
Lagrangian relaxation and Tabu search methods are designed for handling the scenario sub-problems, and numerical results show that our sub-solver reduces the solving time significantly compared with solving the sub-problems directly with Gurobi. Moreover, a Lagrangian lower bound is embedded into the PH method and, as a result, better gap information is obtained than the gap reported by Gurobi.
Chapter 6 - Lagrangian Relaxation for CCCP
We study a class of Cardinality Constrained Conic Programming (CCCP) that is suitable for
the robust index tracking problem in this chapter. A robust version of the Fama-French three
factor model is developed whereby uncertainty sets for the expected return and factor loading
matrix are generated. The resulting model is a mixed integer second-order conic problem.
Computational results in tracking the S&P 100 out-of-sample show that the robust model can
generate portfolios that have a better tracking error and Sharpe ratio than those generated by
the nominal model. We then present a method to approximate the optimal solution by using the
bound information generated from its Lagrangian dual. This strategy allows us to decompose
the CCCP into two easier subcases and calculate a tight lower bound and feasible upper bound
quickly. Meanwhile, sub-gradient cut and fully regular cuts are obtained to exclude sub-optimal
points that have been explored in previous iterations. Computational results in tracking the
S&P 500 and Russell 1000 show that the proposed method is practically effective for the class of CCCP problems we are addressing.
Chapter 7 - Conclusion and Future Research
We summarize the conclusions and findings from the models we investigated in Chapters 4 to 6.
The results that we present in this thesis enhance the applicability and adaptation of portfolio
optimization in finance. We describe future research directions relevant to the fields of finance,
optimization, and computer science. We also discuss alternative models and methodologies that can be used as points of comparison with our current work.
Chapter 2
Modern Portfolio Theory and Index
Tracking
From the one-fund theorem [104], we know that any efficient portfolio can be expressed as a combination of a single master fund and a risk-free asset. That is, we can obtain every efficient point by changing the weighting between these two assets, and thereby measure the risk of the market. However, the single master fund is not readily available in practice, as it requires the fund to contain an asset set as large as possible, ideally including all the assets in the world. In practice, investors usually represent the single master fund by typical market indices in different countries, such as the S&P 500 (USA), DAX 100 (Germany), Hang Seng (Hong Kong), FTSE 100 (UK), and Nikkei 225 (Japan). These market indices generally consist of leading companies in the associated countries and regions and have historically performed well, and thus are adopted by a broad range of investors. For example, risk-averse investors prefer to allocate most of their budget to bond indices, while aggressive investors may mainly use stock indices as their benchmark. Also, the performance of an index can affect the decision of whether to invest in a foreign market, since the index reflects the economic fundamentals of the country. Therefore, although the single master fund is hard to obtain in theory, it is possible to approximate it by combining and replicating different indices.
Thus efficiently replicating an index is very important to investors and ETF companies. As mentioned in Chapter 1, the strategy of full replication, which holds all of the stocks in the same proportions as in the index, has a number of disadvantages: the impracticality of purchasing and holding very small fractions of certain stocks, the high transaction costs of rebalancing all the positions in the index, and the illiquidity of certain stocks when tracking small-cap indices. Adding a cardinality restriction to the replication process, on the other hand, only partially mimics the index but can overcome these issues. Based on the MVO and the CAPM, superior risk-adjusted returns are impossible to obtain in an efficient market, and investors only need to follow and replicate the market indices. The goal of this thesis is to support the one-fund theorem and illustrate that partial replication through professional and advanced tracking models is crucial in modern investment science. Specifically, we study three types of index tracking models with cardinality constraints.
• First, we develop a cluster-based approach for tracking based on a model of Cornuejols and Tutuncu [40]. The cluster-based tracking models avoid using first-moment information, i.e. the expected return µ, which is hard to estimate, and keep the problem a linear mixed integer optimization program. Numerical results for tracking the S&P 500 show that this alternative approach is a powerful tool for constructing tracking portfolios.
• In the second approach, we first incorporate cardinality constraints into a Financial Planning model by Mulvey and Vladimirou [115], and then extend the network-structured framework to the index tracking problem. Numerical results show that the resulting model can track the S&P 100 successfully under numerous scenarios for the expected returns.
• Finally, we add the cardinality constraint to a traditional MVO-based tracking model and develop it into a cardinality constrained robust factor-based enhanced-index tracking model by building robust counterparts for the tracking error and portfolio risk constraints. Numerical results based on the S&P 100 show the enhanced ability of the robust portfolios in terms of tracking error and Sharpe ratio compared with those generated by the nominal model.
Of course, there are many other tracking models tailored for the index replication problem which have been extensively developed in the last decade. To clearly see the main developments of modern portfolio theory, we first review the Mean-Variance Optimization (MVO) selection model and its broad extensions, then we focus on cardinality constrained selection approaches, primarily index tracking models, in the literature. Since cardinality constraints increase the complexity of obtaining the tracking portfolio, we also review the algorithms used in the literature in the next chapter.
2.1 Literature review for MVO and Its Extension
The goal of investing in different tradeable financial instruments in the market is to maximize profit for a given tolerance of loss on the investor's balance sheet. A tradeable financial instrument, e.g. a bond or a stock, is a legal agreement carrying monetary value that can be circulated between different investors. The process of determining and combining the weights of the selected securities is called portfolio selection. It leads to a portfolio with lower risk than the assets that compose it taken individually, as these assets are usually affected in opposite directions by unpredicted future events, so parts of their risks offset each other. The MVO selection model by Markowitz in 1952 [107] is the first systematic and quantitative treatment that takes into account the balance of portfolio return and risk.
Suppose that there are $n$ risky assets that can be selected. Let $r_i$ be the random return of asset $i$, $\mu_i$ the expected return of asset $i$, and $\sigma_{ij}$ the covariance between assets $i$ and $j$. Then, for a given weight vector $x$, the portfolio return is $r_p = \sum_{i=1}^{n} r_i x_i$, the expected portfolio return is $\mu_p = \sum_{i=1}^{n} \mu_i x_i$, and the portfolio variance is expressed as:
$$
\sigma_p^2 = E\left[(r_p - \mu_p)^2\right]
= E\left[\left(\sum_{i=1}^{n} r_i x_i - \sum_{i=1}^{n} \mu_i x_i\right)^{2}\right]
= E\left[\left(\sum_{i=1}^{n} (r_i - \mu_i) x_i\right)\left(\sum_{j=1}^{n} (r_j - \mu_j) x_j\right)\right]
= \sum_{i=1}^{n}\sum_{j=1}^{n} \sigma_{ij}\, x_i x_j
\qquad (2.1)
$$
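The chain of equalities in (2.1) collapses to the quadratic form $\sigma_p^2 = x^{\top}\Sigma x$, which can be checked numerically. A minimal sketch with hypothetical return data (the weights and sample size below are invented for illustration):

```python
import numpy as np

# Hypothetical data: T = 250 return observations for n = 3 assets.
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 3))

x = np.array([0.5, 0.3, 0.2])          # portfolio weights summing to 1
Sigma = np.cov(returns, rowvar=False)  # sample covariance matrix (sigma_ij)

# Double-sum form of (2.1): sum_i sum_j sigma_ij * x_i * x_j
n = len(x)
var_double_sum = sum(Sigma[i, j] * x[i] * x[j]
                     for i in range(n) for j in range(n))

# Equivalent quadratic form: x' Sigma x
var_quadratic = x @ Sigma @ x

assert np.isclose(var_double_sum, var_quadratic)
```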
Portfolio variance in (2.1) gives an intuitive and quantitative measure of the loss of an investment. The remaining task is to determine the proportion of wealth allocated to each asset; in the MVO framework, the optimal portfolio weight $x^*$ is generated by solving the following quadratic model:
$$
\begin{aligned}
\min \quad & \sum_{i=1}^{n}\sum_{j=1}^{n} \sigma_{ij}\, x_i x_j & (2.2)\\
\text{s.t.} \quad & \sum_{i=1}^{n} \mu_i x_i \ge R, & (2.3)\\
& \sum_{i=1}^{n} x_i = 1, & (2.4)\\
& lb_i \le x_i \le ub_i, \quad \forall i = 1, \cdots, n & (2.5)
\end{aligned}
$$
where $lb_i$, $ub_i$ are the lower and upper bounds on the proportion allocated to asset $i$; $lb \ge 0$ means that short selling is prohibited. An intuitive reading of the above model is that one wants to achieve a portfolio with minimum loss, i.e. objective (2.2), with a desired return, i.e. constraint (2.3), under a limited budget, i.e. constraints (2.4) and (2.5). Finding a solution to the basic MVO model is tractable because the covariance matrix is always positive semi-definite (PSD), so the problem is convex. The efficient frontier, which represents the trade-off between portfolio return and risk, is produced by generating the corresponding variance under the desired portfolio goal $R$; see the red-circle curve in Figure (2.1). The adaptable properties of the basic MVO allow people to develop the model
along various directions. The first influential consequence is what is known as the Capital Asset Pricing Model (CAPM), a synthesis of the MVO and factor models, primarily developed by Sharpe [135], Lintner [101] and Mossin [114] in the 1960s. The factor-based MVO
model keeps inspiring many researchers to explore suitable factors to interpret the connection
between the market and assets. For example, Fama and French [54] extended the CAPM model
based on the observation that small-capitalization stocks and value stocks (i.e. stocks with a
high book to price ratio) tend to outperform the market as a whole. In the model, three risk
factors reflect the sensitivities of each stock to the market excess return (market factor), the
excess of value stocks over growth stocks (book-to-market factor), and the excess of small-cap
stocks over large-cap stocks (size factor). Black and Litterman [25] used the prior observations
of the market equilibrium (market factor) and investor’s views (confidence factor), and applied
the Bayesian inference to adjust the mean and variance to build a robust coefficient for MVO
model. Burmeister, Roll, and Ross [28] presented a macroeconomic factor model that considers
five risk terms, which are the investor confidence, interest rate, business cycle, inflation and
market index, in interpreting the historical stock returns. It turns out that these models
explain the cross-sectional variation in asset returns fairly well. Contemporaneously, Fama et al. [55] pointed out that markets rapidly incorporate new information into asset prices, which offers strong evidence for the efficient market hypothesis. Many articles then further demonstrated that asset prices are unpredictable over the short term but may be forecasted by regression analysis in the long run; see [53, 132, 73]. Therefore, the CAPM model suggests that every efficient portfolio should be priced at an equilibrium where a weighted linear combination of the market and the risk-free asset is obtained. This conclusion gives rise to a prominent application, i.e. the index fund or index tracking, in modern finance.
Sharpe and Markowitz shared the Nobel Memorial Prize in Economic Sciences in 1990 due
to their distinguished work on portfolio allocation and asset pricing, and Fama, Hansen, and
Shiller shared the Nobel Memorial Prize in Economic Sciences in 2013 for their findings and contributions to the understanding of long-term market behaviour, which serve as theoretical and empirical support for constructing and tracking indices.
Some researchers seek to simplify the basic MVO model in terms of computational complexity or risk measurement. For instance, Konno and Yamazaki [94] found that the MVO model can be converted into a Mean-Absolute Deviation (MAD) model under the condition that the asset returns follow a multivariate normal distribution. Besides the MAD framework, VaR and CVaR are important alternative measures for risk management, and the associated VaR and CVaR models are also prevalent in the literature. The VaR measurement was first applied by the Basel Committee on Banking in 1996 and then broadly adopted in the financial industry.
Unlike the MVO model, which adopts a symmetric risk measurement for the portfolio, VaR and CVaR constraints mainly measure the downside loss of an investment. Since the VaR constraint lacks the sub-additivity property and may result in local minima, Rockafellar and Uryasev [126] proposed a CVaR model which captures the average loss beyond VaR to evaluate the credit risk of a portfolio. Both MAD and CVaR models are linear programs which can be efficiently solved for large-scale applications.
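As a reference point, the scenario-based CVaR minimization of Rockafellar and Uryasev [126] can be sketched as the following linear program, where $S$ hypothetical scenarios with returns $r_s$ and probabilities $p_s$ are assumed (a standard formulation, not one reproduced from this thesis):

```latex
\min_{x,\ \zeta,\ u} \quad \zeta + \frac{1}{1-\alpha}\sum_{s=1}^{S} p_s\, u_s
\qquad \text{s.t.} \quad
u_s \ \ge\ -\,r_s^{\top}x - \zeta, \quad u_s \ \ge\ 0, \quad s = 1,\dots,S, \qquad
\sum_{i=1}^{n} x_i = 1
```

Here $\zeta$ plays the role of VaR at level $\alpha$, and the auxiliary variables $u_s$ linearize the positive part of the loss beyond $\zeta$, which is what makes the model a linear program.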
The basic MVO also allows people to incorporate different practical constraints into the selection procedure, which constitutes the second stream of extensions. Some typical constraints in practice are described as follows:
• Buy-in threshold constraint, which is used to avoid small fractional investments in the portfolio. This constraint can be implemented by adjusting the values of $lb_i$ and $ub_i$ for asset $i$ in constraint (2.5).
• Turnover constraint, which is applied to limit the transaction cost of portfolio construction or re-balancing. The most common mathematical implementation is the linear turnover form $\sum_{i=1}^{n} \alpha \left| x_i - x_i^0 \right| \le \gamma$, in which $x_i^0$ denotes the initial portfolio weight, $\alpha$ denotes the unit trading cost, and $\gamma$ denotes the trading budget. This type of constraint can be linearized into an equivalent set of linear constraints (see details in Chapter 4).
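The linearization of the turnover constraint can be sketched with auxiliary variables $t_i$, a standard reformulation of the absolute value:

```latex
t_i \ \ge\ x_i - x_i^{0}, \qquad
t_i \ \ge\ x_i^{0} - x_i, \qquad
\sum_{i=1}^{n} \alpha\, t_i \ \le\ \gamma
```

Since $\gamma$ bounds the weighted sum of the $t_i$ from above, each $t_i$ is pressed down to $|x_i - x_i^0|$ wherever the budget constraint is active, so the linear system is equivalent to the absolute-value form.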
• Tracking error constraint, which is useful for an index fund manager who aims to match or slightly outperform a specific benchmark such as the S&P 500. This constraint can be formulated as $\sum_{i=1}^{n}\sum_{j=1}^{n} \sigma_{ij}\, (x_i - x_i^B)(x_j - x_j^B) \le TE$, where $x^B$ is the weight vector of the benchmark. We will investigate this constraint in Chapter 6.
• Cardinality constraint, used to control the portfolio size by introducing new binary variables $y$ and modifying constraint (2.5), is expressed as:
$$
lb_i y_i \le x_i \le ub_i y_i, \quad \forall i = 1, \cdots, n; \qquad
\sum_{i=1}^{n} y_i = q; \qquad
y_i \in \{0, 1\}, \quad \forall i = 1, \cdots, n \qquad (2.6)
$$
• Round lot constraint, which is designed to improve the liquidity of the portfolio by dividing the trading shares into small blocks. One can add the following equation into the MVO framework: $x_i = z_i f_i = \frac{p_i z_i M}{C}, \ \forall i = 1, \cdots, n$, where $z_i \in \mathbb{Z}$ is the integer number of round lots, $f_i$ is the fraction of the portfolio wealth per lot, $p_i$ denotes the trading price of asset $i$, $M$ denotes the round lot size, and $C$ denotes the total portfolio wealth.
• Chance constraint, which is used to measure the downside risk of an investment. Its mathematical expression can be written as $\Pr\left(r^{\top}x \le \beta\right) \le 1 - \alpha$, where $r$ is the random return vector, $\beta$ is the psychological threshold on portfolio performance, e.g. the maximal loss, and $\alpha$ denotes the confidence level.
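One common variant of the chance constraint, $\Pr(r^{\top}x \le \beta) \le 1-\alpha$ with $r$ the random return vector, admits a deterministic reformulation under a normality assumption $r \sim N(\mu, \Sigma)$ (an assumption made here purely for illustration):

```latex
\Pr\left(r^{\top}x \le \beta\right) \le 1-\alpha
\quad \Longleftrightarrow \quad
\mu^{\top}x + \Phi^{-1}(1-\alpha)\,\sqrt{x^{\top}\Sigma\, x} \ \ge\ \beta
```

where $\Phi^{-1}$ is the standard normal quantile function. For $\alpha \ge 1/2$, $\Phi^{-1}(1-\alpha) \le 0$, and the constraint is convex, i.e. a second-order cone constraint.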
Besides the popular restrictions mentioned above, we show that the sector limit constraint considered in Chapter 4 is also a useful way to diversify the portfolio across sectors. The basic MVO (2.2) - (2.5) with buy-in threshold, turnover, and tracking error constraints retains convexity, so it can be efficiently solved by interior point based algorithms. In contrast, the combination of the basic MVO with cardinality and round lot constraints becomes a quadratic mixed integer program. Although the integer requirements make the problem NP-hard, there are tangible benefits behind these constraints. For example, although the cardinality constraint destroys the smoothness of the efficient frontier, such a restriction can replicate the efficient portfolio at a cheaper cost. One example
depicted in Figure (2.1) illustrates this idea. Assume that we select 2 out of 4 assets to build the portfolio and that short selling is allowed. We draw the efficient frontiers for every pair of 2 assets, represented by the dashed lines, and take a fractional piece from each EF to sketch the whole efficient frontier under a portfolio size of 2, i.e. the black-star curve. It is clear that the original EF (red-circle curve) has only one capital market line for a given risk-free asset, while the EF with the cardinality constraint may admit different tangent lines in different ranges for the same risk-free asset. One observation is that we can efficiently approximate the market ($q = 4$) with a smaller portfolio ($q = 2$), e.g. for $R \le 6\%$. This example also illustrates the idea of index tracking.
Figure 2.1: Efficient frontier with and without cardinality constraint
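The 2-of-4 construction behind Figure (2.1) can be sketched by brute-force enumeration: solve the continuous problem (2.2)-(2.4) restricted to each pair of assets and keep the best pair. All numbers below are hypothetical, and SciPy's SLSQP solver stands in for a production QP solver:

```python
from itertools import combinations

import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs for 4 assets (invented for illustration).
mu = np.array([0.04, 0.06, 0.08, 0.10])          # expected returns
Sigma = np.array([[0.04, 0.01, 0.00, 0.00],      # covariance matrix
                  [0.01, 0.09, 0.02, 0.00],
                  [0.00, 0.02, 0.16, 0.03],
                  [0.00, 0.00, 0.03, 0.25]])
R = 0.06                                         # target portfolio return

def min_variance(idx):
    """Solve (2.2)-(2.4) restricted to the assets in idx; short selling allowed."""
    idx = list(idx)
    S, m = Sigma[np.ix_(idx, idx)], mu[idx]
    cons = [{"type": "eq", "fun": lambda x: x.sum() - 1.0},
            {"type": "ineq", "fun": lambda x: m @ x - R}]
    x0 = np.full(len(idx), 1.0 / len(idx))
    return minimize(lambda x: x @ S @ x, x0, constraints=cons).fun

# q = 4: unrestricted frontier point; q = 2: best pair of assets.
full_var = min_variance([0, 1, 2, 3])
best_var = min(min_variance(pair) for pair in combinations(range(4), 2))

# The subset optimum can never beat the full problem, but may come close.
assert best_var >= full_var - 1e-5
```

This is exactly the brute-force view of the cardinality constrained problem: enumerating all $\binom{n}{q}$ subsets is feasible here but explodes for realistic $n$, which motivates the Lagrangian methods of later chapters.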
Many articles in the literature offer alternative insights into different practical constraints.
Konno and Kobayashi [93] constructed a reliable stock-bond portfolio by integrating different asset classes into the MVO. Adcock and Meade [4] proposed a pure MVO-based portfolio selection model with transaction cost constraints and applied an efficient algorithm that quickly generates the optimal solution. Jobst et al. [84] studied the MVO model with buy-in threshold, round-lot, and cardinality constraints in a single model, and examined the effect of these constraints on the resulting efficient frontiers.
One key issue of the MVO model is that the optimal portfolio is extremely sensitive to the estimated parameters, i.e. the expected returns and covariances between assets [36]. That is, a tiny change in an expected return or covariance, derived from a short-term price movement, can result in significantly different portfolio allocations. For example, Tutuncu and Koenig [144] demonstrated that the efficient frontiers under nominal inputs can change drastically when the means of monthly log-returns and the covariances of these returns move within only 5 percentiles. Chopra and Ziemba [36] showed that estimation errors in the expected returns are 9 - 12 times more important than errors in covariances, so even small input errors can translate into large distortions of the optimized portfolio. Since the MVO framework involves the estimation of asset returns and variances, it is believed that the estimation errors will also affect the optimal portfolio significantly.
To address this issue, another important stream of MVO extensions has been explored in Operational Research that focuses on finding stable portfolios that are immune to uncertainties over time. This stream is referred to as multi-period portfolio selection. Hakansson [72] found that the variance of the efficient portfolio over multiple periods is irrelevant to the return under the transformation of a suitable utility function. His findings became the basis of portfolio choice theory. Therefore, many investment problems focus only on dealing with the uncertainty in the expected asset returns over a multi-period horizon by using stochastic programming with recourse, e.g. the Asset Liability Management (ALM) and Financial Planning problems we discussed in Section (1.1) in Chapter 1. In stochastic programming, a recourse decision is obtained in the second stage to compensate for the effects of the first-stage decision that is fixed ahead for a given uncertainty set. One main drawback of applying the stochastic program to the MVO model is that the number of scenarios, even for a small uncertainty set of expected returns, can be enormous, leading to a large-scale problem which may encounter solvability issues.
Thus, there exist other methods that take both first and second central moment information into account while maintaining tractability for multi-period MVO selection. Robust programming is one of the alternative methods capable of achieving these goals.
Robust optimization has been considered in many applications to mitigate the effects of
parameter uncertainty. A comprehensive survey (over 130 references) of robust optimization is
given in [19]. The authors listed several important applications in finance, which include multi-
period asset allocation problem as in Ben-Tal et al. [12] where the authors propose a second-
order cone program as a robust counterpart, and Bertsimas and Pachamanova [21] where under
specific norms the problem is cast as a linear program. Goldfarb and Iyengar [66] considered robust mean-variance optimization formulations based on robust factor models and showed that the resulting robust problems can be formulated as Second Order Cone Programming (SOCP),
which is one category of convex problem. Erdogan, Goldfarb, and Iyengar [51] incorporated
transaction costs into the robust MVO problems and the resulting model remains as an SOCP.
Cardinality restrictions in robust portfolio selection have also been studied. Sadjadi et al. [131] applied robust optimization to the cardinality constrained Mean-Variance problem, which results in a mixed-integer second-order cone program, and applied genetic algorithms to compute solutions. Nalan et al. [64] also studied robust cardinality constrained MVO problems and solved the resulting mixed-integer SOCP instances using a commercial solver. We review the index tracking problem in the literature in the next section.
2.2 Literature review for Index Tracking
A market index is a representation of an entire market which combines typical top-performing constituents into an aggregate value. Security market indices are useful tools that help investors track the performance of various specific markets, estimate risk, and evaluate the performance of portfolio managers. The value of a market index can be calculated by different methods, such as market-capitalization weighting, price weighting, and equal weighting. The market-capitalization weighted method is the traditional and predominant approach to measuring an index. For example, the S&P 500 is a market-cap based American stock index which contains 500 large companies traded on the US public market. These companies are picked from 10 sectors which
are measured by specific sector indices [1]. Almost all important markets in the world today adopt the market-cap weighted method to construct their indices. Typical examples also include the S&P/TSX Composite Index, which contains over 220 of the largest Canadian securities; the Russell 3000 Index, which represents over 98% of the investable US equity market in terms of market value; and the Nasdaq Composite Index, which is heavily weighted towards the information technology sector. The price weighted method, on the other hand, puts more weight on stocks with higher prices and reflects investors' confidence in the economy. A notable example is the Dow Jones Industrial Average, which clearly records most of the disasters in American economic history. Besides the above two methods, the equal weighted index is another primary weighting method, which assigns index components equivalent weights. Its advantage is that the tracking portfolio can replicate the target index easily but, on the other hand, it may result in a high turnover cost.
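The three weighting schemes can be contrasted on toy data; the prices and share counts below are invented for illustration:

```python
import numpy as np

# Hypothetical constituents: share price and shares outstanding (invented numbers).
prices = np.array([150.0, 40.0, 95.0])
shares_out = np.array([2e9, 5e9, 1e9])

market_caps = prices * shares_out

# Three weighting schemes over the same constituent set.
w_cap = market_caps / market_caps.sum()            # market-capitalization weighted
w_price = prices / prices.sum()                    # price weighted (Dow-style)
w_equal = np.full(len(prices), 1.0 / len(prices))  # equal weighted

# Each scheme yields a full allocation, but ranks constituents differently:
# cap weighting favours the largest company, price weighting the highest-priced share.
for w in (w_cap, w_price, w_equal):
    assert np.isclose(w.sum(), 1.0)
```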
Because of their impressive average performance over the years, market indices also form the basis of new financial products such as ETFs, of which index-based ETFs are the primary category. On the one hand, yielding exactly the same returns as the target is the major task of a tracking portfolio; on the other hand, partial replication through cardinality constraints is more efficient for practical purposes. Therefore, consideration of the trade-off between the tracking error and the portfolio size is necessary for portfolio management. Different tracking error objectives and practical constraints are studied for the
management. Different tracking error objectives and practical constraints are studied for the
index tracking problem in the literature. Beasley et al. [10] considered tracking error that
minimizes the return differences between the portfolio and the benchmark, and thus leads to
a non-linear tracking model with transaction costs and cardinality constraint to construct the
tracking portfolio in testing five major markets in the world. Bertsimas et al. [20] applied
mixed integer programming to build a portfolio to track a given benchmark portfolio with the
aim of having fewer stocks with limited turnover and transaction costs. Coleman et al. [37]
minimized the tracking error based on the MVO framework with cardinality constraints and showed that the developed model is NP-hard. Cornuejols and Tutuncu [40] presented an index tracking model which maximizes the similarity between selected assets and the assets of the target index, representing a clustering-based approach to constructing a tracking portfolio. Karlow and
Rossbach [87] applied a VaR constraint to the tracking error term, and added a regularization term into the objective instead of using a cardinality constraint.
Recently, discussion of enhanced indexation has arisen in the literature. The goal of an enhanced tracking portfolio is to generate a small amount of excess return while keeping the same or a similar risk level. This method combines both active and passive management strategies and thus requires careful tuning of the trade-off parameter between the tracking error and the portfolio risk. Using a tracking error constraint in the MVO framework, Jorion [85] showed that 83% of stock-based funds have a higher risk than their benchmark. Canakgoz
and Beasley [29] considered the enhanced index tracking problem via a mixed integer program whose objective allows outperformance of a benchmark; the model includes transaction costs and is tested on eight large market indices. Chavez-Bedoya and Birge [32] studied the
enhanced indexation by using a multi-objective non-linear programming approach in which the
variance of the tracking error term can be decomposed for optimal portfolio analysis.
The issue of parameter uncertainty described in Section 2.1 may also be encountered for
index tracking models and has attracted widespread interest from authors. Stoyan and Kwon
[139] developed a mixed integer model which includes several discrete choice restrictions such
as buy-in thresholds, cardinality constraints, as well as round lots to track the Toronto Stock
Exchange (TSX). Kwon and Wu [98] developed a factor-based robust enhanced index tracking model which takes account of both tracking error and portfolio risk constraints, and examined the model by using the Fama-French three-factor model as the basis for constructing robust counterparts
of the nominal tracking model. Lejeune and Samatli-Pac [100] applied a chance-constrained
stochastic integer programming approach that partially considers parameter estimation risk for
enhanced indexation.
Although different tracking models have been established, it is still a non-trivial task to obtain the associated optimal solutions. As mentioned before, the cardinality constraint and the binary requirements make the problem NP-hard, and thus it is necessary to review the methodologies for solving the index tracking problem in the next chapter.
Chapter 3
Lagrangian Relaxation in Literature
In this chapter, we first briefly review algorithms that can potentially be used for
solving our index tracking models. We then illustrate the Lagrangian Relaxation
(LR) mechanism via a simple numerical example and survey the literature on LR approaches
for different types of OR problems and cardinality constrained portfolio selection models.
Particular attention is paid to LR methods for the index tracking problem.
3.1 Metaheuristics in Literature
Optimal or near-optimal solutions to the proposed models are important to decision makers.
To date, no polynomial-time algorithm is known for large-scale integer programming, so
solution strategies for different types of problems depend heavily on well-designed methods.
A heuristic that can generate a sufficiently good solution to an optimization problem in a
short amount of time or under limited computational capacity is called a metaheuristic.
Typical metaheuristics for solving mixed integer programs in the fields of Operational
Research and Computer Science include:
• Greedy heuristic. A greedy algorithm is a problem-solving heuristic which makes the
locally optimal choice at each iteration or stage in the hope of reaching a globally optimal
solution [39]. The greedy method is a powerful tool for many hard optimization problems
such as the activity-selection problem, the p-median problem [96], and scheduling [38].
• Lagrangian Relaxation. Lagrangian relaxation is a useful method that generates a
tight bound by relaxing the hard constraints and solving the resulting easier problem.
LR methods have been applied to many OR problems, e.g. the p-median problem and
portfolio optimization. A detailed description of the LR method is given later.
• Branch and Bound. Branch-and-bound (B&B) algorithms search the complete space of
candidate solutions while excluding large parts of the search space using bounds obtained
from easier subproblems, e.g. linear programming relaxations, at each iteration. B&B is
an exact method that can guarantee an optimal solution, or prove that no such solution
exists, for mixed integer programs. The method was first presented by Land and Doig in
1960 [99] and has become the most commonly used tool for solving NP-hard optimization
problems, e.g. the travelling salesman problem. However, there is evidence that pure
B&B methods converge slowly for large-scale discrete problems in practice [146].
• Tabu Search. Tabu Search (TS) escapes local optima by using a tabu list to prevent
the search from revisiting previously visited solutions while seeking improved neighbors
of the current solution. Originally created by Glover in 1986 [63], TS has become an
important local search strategy for NP-hard problems due to its good performance on
many classes of optimization problems [44, 127, 31].
• Variable Neighborhood Search. Variable Neighborhood Search (VNS) [75] is another
metaheuristic for escaping the current local minimum by systematically changing and
exploring different neighborhoods. Although the mechanism of VNS is simple and easy
to understand, VNS algorithms have been shown to generate good solutions for many
NP-hard problems [74, 128].
• Genetic Search. The Genetic Algorithm (GA), initially developed by Holland in the 1970s
[79], is a search heuristic for optimization problems that generates global or near-global
optimal solutions by simulating the selection process of natural evolution. GA is a fast,
useful, and reliable technique because it can extract the good information hidden in a
solution and pass it to its offspring (new solutions), hopefully moving towards global
optimality. Typical applications include the p-median problem [81], the index tracking
problem [10, 119], and power generation [120].
• Simulated Annealing. Simulated Annealing (SA) is a probabilistic approach for approximating
the global optimum of discrete problems with large search spaces. Inspired by the
annealing process in metallurgy, SA accepts worse solutions with a certain probability so
as to search the space more extensively [88]. SA has been applied to OR problems such
as portfolio selection [31, 45] and the p-median problem [35].
The metaheuristics described above often borrow advantages from one another or combine
with other techniques, such as valid cuts for branch and bound, to improve performance by
exploiting the special structure of the problem [80, 30]. We follow the same fashion, combining
metaheuristics to enhance solving ability. In the next section we focus primarily on Lagrangian
relaxation methods, because their mathematical structure allows different techniques to be
conveniently embedded into the Lagrangian relaxation framework for our index tracking
models. For instance, Variable Neighborhood Search is used to find near-optimal solutions
with the help of the Lagrangian dual bound in Chapter 4, and Tabu Search and LR methods
are applied to solve the scenario sub-problems to speed up the Progressive Hedging algorithm
in Chapter 5.
3.2 Literature Review of LR and Its Extensions
Lagrangian relaxation (LR) is an optimization technique well suited to problems whose
constraints can be divided into hard and easy sets. In the LR procedure, the hard constraints
are moved into the objective function with assigned weights or penalties, i.e. the Lagrange
multipliers, which makes the relaxed problem easier to solve than the original one. Lagrangian
relaxation offers a tight bound that can be used to approximate the optimal solution of the
problem. Since the Lagrangian relaxation can generally be decomposed into a series of
sub-problems, LR is also called Lagrangian Decomposition. We illustrate the idea of Lagrangian
relaxation through the following numerical example.
max Z (x) = x1 + x2 (3.1)
s.t. x1 ≤ 2 (3.2)
x2 ≤ 3 (3.3)
0.3x1 + 0.7x2 ≤ 2.5 (3.4)
where constraint (3.4) is harder than the other two constraints; thus we relax the problem by
moving constraint (3.4) into the objective with a positive multiplier, i.e. L(x, λ) = x1 + x2 −
λ(0.3x1 + 0.7x2 − 2.5) = (1 − 0.3λ)x1 + (1 − 0.7λ)x2 + 2.5λ. The relaxed primal problem
max_{x1≤2, x2≤3} L(x, λ) then has the analytical solution

(x∗1, x∗2) = (2, 3) if 0 ≤ λ ≤ 10/7; (2, −∞) if 10/7 < λ ≤ 10/3; (−∞, −∞) if λ > 10/3,

which is easier than directly solving the original problem Z(x). The update of the Lagrangian
multiplier λ is guided by the weak duality inequality

min_{λ≥0} max_{x1≤2, x2≤3} L(x, λ) ≥ Z(x∗),

and we move to the next iteration with the new λ until the stopping criterion is satisfied. The
Lagrangian dual function is convex in λ, which makes the approach useful even for non-convex
problems, iteratively reducing the gap between the lower and upper bounds.
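The update loop can be sketched in code. The following is a minimal sketch of a projected subgradient method for the dual of example (3.1) – (3.4); to keep every subproblem bounded we add the (assumed) lower bounds x1, x2 ≥ 0, so each relaxed subproblem is solved by inspecting the sign of each objective coefficient.

```python
def solve_subproblem(lam):
    """Maximize (1-0.3*lam)*x1 + (1-0.7*lam)*x2 + 2.5*lam over 0<=x1<=2, 0<=x2<=3.
    Each variable sits at its upper bound iff its coefficient is positive."""
    x1 = 2.0 if 1 - 0.3 * lam > 0 else 0.0
    x2 = 3.0 if 1 - 0.7 * lam > 0 else 0.0
    dual_value = (1 - 0.3 * lam) * x1 + (1 - 0.7 * lam) * x2 + 2.5 * lam
    return x1, x2, dual_value

lam, best = 0.0, float("inf")
for v in range(1, 501):
    x1, x2, dual_value = solve_subproblem(lam)
    best = min(best, dual_value)          # best (smallest) dual upper bound so far
    g = 0.3 * x1 + 0.7 * x2 - 2.5         # subgradient of the dual function at lam
    lam = max(0.0, lam + (2.0 / v) * g)   # projected step with diminishing size

print(best, lam)  # best approaches 33/7, lam approaches 10/7
```

With these linear subproblems there is no duality gap: the original problem attains Z(x∗) = 33/7 at x = (2, 19/7), and the dual bound converges to the same value at λ = 10/7. For integer programs the same loop produces an upper bound that may strictly exceed the best integer solution.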
Lagrangian relaxation for integer programming was initially discussed by Geoffrion [61],
Geoffrion and McBride [62], Fisher [56] and Cornuejols et al. [41]. LR is used to approximate
a difficult problem with a computationally tractable relaxation whose solution provides a tight
bound on the original problem. LR-based algorithms have successfully solved many problems
in Operational Research such as multidimensional assignment problems [124], facility location
problems [41, 90], and portfolio optimization problems [136]. LR-based methods have also
been developed along different directions. First, many researchers attempted to reduce the
integrality gap by modifying the LR procedure. Cornuejols et al. [41] showed that the maximal
integrality gap cannot exceed 1/e ≈ 36.79% for the p-median problem. Narciso et al. [116]
presented Lagrangian relaxation with surrogate constraints; their numerical results indicated
that using surrogates to update the multipliers can efficiently improve convergence and the
local bound. Beltran et al. [11] proposed a Semi-Lagrangian Relaxation (SLR) method which
achieves an improved bound compared to the LR method; they also produced more accurate
solutions than regular LR when solving the p-median problem. However, surrogate LR and
Semi-LR cannot exploit the decomposition advantage in large-scale computation.
Another direction of development is the augmented Lagrangian method, also known as the
method of multipliers [18], in which a penalty term is added to the Lagrangian objective,
e.g. L(x, λ) + (ρ/2)‖g(x)‖², to better approximate the Lagrangian multipliers and therefore
speed up convergence. The augmented Lagrangian strategy combines the advantages of
Lagrangian relaxation and penalty methods. Progressive Hedging (PH) is one main stream of
this class of methods in Stochastic Programming: it handles the non-anticipativity constraints
and decomposes the problem across scenarios by using the Lagrangian dual. The literature on
this approach emphasizes both mathematical development and computational effectiveness
across different problems. Rockafellar and Wets [125] proved that the PH method converges
linearly for linear stochastic programs. Helgason and Wallace [77] approximated the scenario
solutions to improve convergence performance when solving a fisheries management problem;
they pointed out that exact solutions of the subproblems are not required when applying the
PH method to non-linear problems. Mulvey and Vladimirou [115] applied the PH algorithm
to a network-structured Financial Planning problem which considers rebalancing cash flows
between stages.
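As a concrete illustration of the PH mechanics described above, the following toy sketch (our own construction, not taken from the cited papers) runs PH on a one-variable convex quadratic with three scenarios; solving the scenario subproblems in closed form is an assumption made for brevity.

```python
# min_x sum_s p_s * (x - xi_s)^2 / 2, decomposed by scenario with Progressive Hedging.
scenarios = [(0.5, 1.0), (0.3, 4.0), (0.2, 10.0)]   # (probability p_s, target xi_s)
rho = 1.0                                            # PH penalty parameter
w = [0.0] * len(scenarios)                           # scenario multipliers
xbar = 0.0                                           # consensus first-stage value
for _ in range(100):
    # Scenario subproblem: argmin_x (x-xi)^2/2 + w_s*x + (rho/2)*(x-xbar)^2,
    # which has the closed form (xi - w_s + rho*xbar) / (1 + rho).
    xs = [(xi - w[s] + rho * xbar) / (1 + rho)
          for s, (p, xi) in enumerate(scenarios)]
    xbar = sum(p * x for (p, _), x in zip(scenarios, xs))   # implementable policy
    w = [w[s] + rho * (xs[s] - xbar) for s in range(len(scenarios))]

print(xbar)  # converges to the expected target E[xi] = 3.7
```

At convergence the scenario solutions agree, so non-anticipativity holds, and the multipliers satisfy the probability-weighted condition Σ_s p_s w_s = 0 maintained by the update rule.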
Progressive Hedging has also been extensively studied for mixed integer Stochastic
Programming. Lokketangen and Woodruff [103] embedded a Tabu search heuristic for large
scenario subproblems into the PH algorithm, and provided computational evidence for the
effectiveness of the method by solving a production problem with uncertain cost structures
and demands over multiple periods. Gade et al. [58] derived a tight bound to evaluate the
quality of PH solutions for mixed integer SP; they showed that this bound can be as tight as
that obtained from the Lagrangian dual, which offers theoretical support for the wide
application of the PH method to mixed integer SP in practice. Crainic et al. [43] proposed a
progressive hedging algorithm with metaheuristics to solve a stochastic variant of the
fixed-charge capacitated multicommodity network design problem; in their method they built
cycle-based neighbourhoods and simultaneously searched the associated γ-residual networks
using a Tabu heuristic for the sub-problems. Watson and Woodruff [147] presented a
modification of the penalty coefficient in the PH algorithm for a class of stochastic resource
allocation models; by exploiting the problem structure, their accelerated penalty updates
decreased running time and extended the range of resource allocation problems solvable by
PH. Veliz et al. [145] investigated a forest planning problem that incorporates uncertainty in
harvesting and road construction decisions in a developing country through a mixed integer
SP, and applied the PH procedure to solve realistically sized problem instances.
The Lagrangian dual is the key concept behind the reviewed LR methods; the dual
decomposition scheme is depicted in Figure 3.1. Variants of Lagrangian relaxation relating to
specific problems are discussed further in their respective chapters.
[Figure 3.1 is a flowchart of the Lagrangian decomposition scheme: starting from an initial λ0, the master problem spawns k Lagrangian relaxation sub-problems; their solutions are aggregated and tested against the stopping criteria. If the criteria are not satisfied (N), the dual problem adjusts the penalties and returns an updated λv; otherwise (Y) the current solution is optimal.]
Figure 3.1: Lagrangian Decomposition Scheme for integer programs
Chapter 4
A Constrained Clustering Approach
for Index Tracking
4.1 Introduction
Index tracking is an important passive investing strategy where one seeks a portfolio of securities
that emulates a given benchmark portfolio such as the S&P500. Several studies [106, 69,
151] have concluded that actively managed funds usually cannot outperform broad market
indices. For example, Zenios reported that the average return of 769 all-equity actively managed
funds was 2% to 5% lower than the S&P 500 index during the period 1983 – 1989 [151]. Full
replication of the benchmark portfolio is an obvious strategy for tracking where all assets in
the benchmark are held in the quantities as specified by the weightings of the benchmark
portfolio, but full replication is not practical given the transaction costs this would entail.
For example, fully replicating the S&P500 index would require holding the 500 assets along
with weightings for each asset. The weightings are based on market capitalization and so as
soon as the prices of assets change the weights change as well. Constant rebalancing of the
tracking portfolio would result in a prohibitive amount of transactions. An alternative strategy
is to select a strict subset of assets from the benchmark, however, this results in tracking
portfolios that do not match the benchmark as closely as in full replication. A well-known
measure of this discrepancy is called tracking error and is defined as the difference between
returns of the tracking portfolio and benchmark. In general, there will be a trade-off between
tracking error and transactions costs. Models that seek to minimize tracking error have emerged
as a popular approach for constructing tracking portfolios [86]. Such models exhibit
non-linearity, as it is the variance of the tracking error that is typically minimized or
constrained. A further complication is that enforcing selection of only a strict subset of assets
requires discrete variables. This restriction is called the cardinality constraint and requires
binary variables for its implementation. Incorporating it along with tracking error
minimization results in a non-linear integer optimization problem, which presents substantial
challenges in computing optimal or near-optimal solutions. Furthermore, most tracking
models, e.g. those minimizing tracking error, require estimates of expected returns from asset
price time series; it is well known that such estimates are challenging to obtain, and
estimation error can substantially bias the optimized portfolios.
In this chapter, we consider linear mixed integer optimization models for tracking broad
market indices such as the S&P 500. The models we consider represent a cluster-based approach
for tracking based on a model of Cornuejols and Tutuncu [40]. The cluster-based approach
seeks to partition the assets in a benchmark portfolio into disjoint clusters from which a single
(representative) asset is selected from each cluster. The set of representatives constitutes the
tracking portfolio. The clusters are grouped to maximize similarity among assets in a cluster.
The number of clusters to generate is a user controlled parameter and is implemented by a
cardinality constraint that explicitly restricts the number of representatives to equal the user
specified number of assets to hold. A measure of similarity can be represented by correlations
between returns of pairs of assets. One advantage of cluster-based models is that they only
require similarity information, whereas most tracking models, e.g. those that use tracking
error, require expected return estimates in addition to correlation estimates.
However, a tracking strategy based only on clustering may produce a portfolio that tracks
a benchmark well in terms of return but is insufficiently diverse when tracking a broad market
index such as the S&P 500, thereby increasing the risk of the tracking portfolio. A market
index such as the S&P 500 consists of approximately 500 large-cap stocks from 10 different
economic sectors: energy, information technology, consumer discretionary, consumer staples,
materials, financials, utilities, industrials, telecommunication services, and health care. The
sectors represent the broad and diverse
economy of the United States. A pure clustering solution may concentrate assets in just a few
sectors. As such, we consider constraints to ensure that a tracking portfolio for the S&P 500
contains reasonable representation from each sector. We also consider additional important
constraints that aim to control transaction costs, such as buy-in thresholds and turnover
constraints [151]. Buy-in threshold constraints ensure that selected assets have weights that
are not unrealistically small, and turnover constraints ensure that the tracking portfolio does
not deviate excessively from the current tracking portfolio. Thus, we propose a sector
constrained linear clustering approach for tracking the S&P 500 with buy-in thresholds.
The models we propose are linear integer programs and as such can still be challenging to
solve to optimality, but they should be substantially easier than non-linear integer
formulations of the tracking problem, with only modest information requirements. We
propose Lagrangian and Semi-Lagrangian relaxation methods to solve the models and find
that they often yield optimal or near-optimal solutions. Furthermore, the tracking portfolios
from our models are shown to track the S&P 500 effectively, with better sector diversification
than basic clustering approaches without safeguards for diversification.
The rest of the chapter is organized as follows: Section 4.2 briefly surveys the literature on
index tracking. In Section 4.3 we formulate the index tracking models with sector limit and
other practical constraints. In Section 4.4 we develop the Lagrangian relaxation-based methods
for the models. In Section 4.5 computational results are given, and we conclude the chapter in
Section 4.6.
4.2 Literature Review for Index Tracking
A common approach to the index tracking problem is to formulate it as an integer optimization
problem. One of the major challenges is to deal with the cardinality constraint and a diversity
of algorithmic methods ranging from evolutionary heuristics to methods based on branch-and-
bound have been considered to solve models with cardinality restrictions. A general non-linear
tracking model with transaction costs and a cardinality constraint is considered in Beasley et
al. [10] and is solved using evolutionary heuristics, tested on five major world markets.
Bertsimas et al. [20] considers mixed integer programming to construct a portfolio to track a
given benchmark portfolio with the aim of having fewer stocks with turnover and transaction
costs. Coleman et al. [37] minimize tracking error in the index tracking problem with cardinality
constraints and use a graduated non-convexity algorithm to satisfy the cardinality restriction.
Jansen and van Dijk [83] convert the cardinality constraint into a continuous non-convex power
function, and apply a diversity method to decide the best stocks and weights of the portfolio.
Oh et al. [119] use genetic algorithms to generate the optimal weights for the selected stocks to
track a benchmark (where the tracking portfolio has strictly fewer assets), with stocks first
distributed into the sectors with larger market capitalization. Ruiz-Torrubiano and Suarez [129]
apply a hybrid approach that uses a genetic algorithm to select the assets that track different
market indices with fewer assets and use quadratic programming to determine the weights of
the assets selected by the genetic algorithm; other practical constraints such as transaction
cost are not included in their model. Stoyan and Kwon [139] develop a two-stage stochastic
mixed integer program with recourse which includes several discrete choice constraints
such as buy-in thresholds, cardinality constraints, as well as round lots to track the Toronto
Stock Exchange (TSX). Lejeune and Samatli-Pac [100] consider a chance-constrained stochastic
programming formulation for the risk-averse indexing problem with cardinality constraints and
develop an outer approximation method. Cornuejols and Tutuncu [40] presented an index
tracking model which maximizes similarity between selected assets and the assets of the target
index and represents a clustering-based approach for constructing a tracking portfolio. Chen
and Kwon [34] consider a robust version of this model. Canakgoz and Beasley [29] consider
the enhanced index tracking problem via a mixed integer program whose objective allows
outperformance of a benchmark; the model includes transaction costs and is tested on
eight large market indices. Gaivoronski et al. [59] consider different types of risk measurement
for index tracking ranging from mean-variance and conditional value at risk (CVaR) models to
tracking with fewer numbers of assets. Chavez-Bedoya and Birge [32] consider a multi-objective
non-linear programming approach where their model also considers enhanced indexation. The
formulation decomposes the variance of the tracking error of the portfolio so that a model with
fewer variables is obtained.
Most models described above require estimates of expected prices or returns of assets. In
general, it is difficult to estimate expected returns accurately; portfolio optimization models
can be sensitive to estimation errors in returns [36] and often maximize the errors found in
the estimates [108]. In the next section we develop models for index tracking based on [40]
which do not require expected return estimates, only information about similarity, e.g.
correlations between the returns of assets.
4.3 Model Formulations
4.3.1 Basic cluster-based index tracking model
The basic index tracking model we adopt is from Cornuejols and Tutuncu [40]. Suppose the
target portfolio has n securities. The model seeks to partition the n securities of the target
portfolio into q disjoint groups (clusters) such that securities within a group are most
"similar" to each other. The model then selects a "representative" from each group, and the
q representatives constitute the tracking portfolio. The correlation of returns between pairs
of securities is used as the measure of similarity in our experiments; other measures of
similarity, such as cointegration or covariance, can be used as well [5].
Let ρij denote the correlation (similarity) between securities i and j, and let q denote the
size of the tracking portfolio, where q < n. For i, j = 1, ..., n, let xij indicate whether stock j
is a representative of stock i: xij is 1 if j is the most similar selected security to i, and 0
otherwise. For j = 1, ..., n, let yj indicate selection: yj is 1 if security j is selected for the
tracking portfolio, and 0 otherwise.
Then, the problem of creating a tracking portfolio can be formulated as follows:
max Σ_{i=1}^{n} Σ_{j=1}^{n} ρij xij    (4.1)
s.t. Σ_{j=1}^{n} yj = q    (4.2)
Σ_{j=1}^{n} xij = 1, ∀i = 1, ..., n    (4.3)
xij ≤ yj, ∀i = 1, ..., n, j = 1, ..., n    (4.4)
xij, yj ∈ {0, 1}    (4.5)
The objective (4.1) is to select securities so that total similarity of all groups is maximized.
Constraint (4.2) enforces that the tracking portfolio will have exactly q securities and is called a
cardinality constraint. Constraint (4.3) ensures that each security has exactly one representative
in the portfolio. Constraint (4.4) prohibits a security from being a representative of any
security if it is not selected to be part of the tracking portfolio.
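To make the selection mechanics concrete, model (4.1) – (4.5) can be solved by enumeration for very small instances: once the set S of selected securities is fixed, the optimal xij assigns each security i to its most similar member of S. The following is a minimal sketch on a hypothetical 4-asset correlation matrix (not data from this thesis).

```python
from itertools import combinations

# Brute-force model (4.1)-(4.5): enumerate every q-subset S of assets; the
# objective collapses to sum_i max_{j in S} rho[i][j].
rho = [[1.0, 0.9, 0.2, 0.1],     # assets 0,1 form one "cluster",
       [0.9, 1.0, 0.3, 0.2],     # assets 2,3 form another (hypothetical data)
       [0.2, 0.3, 1.0, 0.8],
       [0.1, 0.2, 0.8, 1.0]]
n, q = 4, 2
best_val, best_set = float("-inf"), None
for S in combinations(range(n), q):
    val = sum(max(rho[i][j] for j in S) for i in range(n))
    if val > best_val:
        best_val, best_set = val, S

print(best_set, best_val)  # a 2-asset tracking portfolio, one per cluster
```

Enumeration is exponential in q and only serves to illustrate the model; the Lagrangian methods developed later in this chapter are designed for realistic sizes such as n = 500.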
The model above only selects securities for the tracking portfolio, but once the model is
solved the investment weight for each selected security, expressed as a proportion of total
investment, can be calculated. In particular, a weight wj can be calculated for each selected
asset j as the total market value of all securities in the group that security j represents divided
by the total market value of all securities in the target portfolio (index), i.e.,
wj = (Σi Vi xij) / (Σi Vi). For example, if stock 1 represents stocks 2 and 3 in the portfolio,
we sum the market values of stocks 1, 2 and 3, and then divide the sum by the market value
of the n securities in the target portfolio. The
weight for security 1 in the tracking portfolio would be positive assuming that all securities have
positive prices and the weights for securities 2 and 3 would be set to 0 as they would not be in
the tracking portfolio. This follows the capitalization-based weighting that is found in the S&P
500 and other major indices. It should be noted that the models presented in this chapter seek
to track and not outperform the S&P 500 and so this motivates the use of capitalization-style
weightings for the assets selected by the models.
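The weighting step described above is a simple post-processing of the solution. A minimal sketch, using hypothetical market values V and a hypothetical representation map (rep[i] = j meaning xij = 1):

```python
# Capitalization weights w_j = (sum_i V_i x_ij) / (sum_i V_i) for a solved instance.
V = {1: 50.0, 2: 30.0, 3: 20.0, 4: 100.0}   # market values (hypothetical)
rep = {1: 1, 2: 1, 3: 1, 4: 4}              # stock 1 represents stocks 1, 2 and 3

total = sum(V.values())
w = {}
for i, j in rep.items():
    w[j] = w.get(j, 0.0) + V[i] / total     # each stock adds its value to its rep

print(w)  # stock 1 carries its whole cluster's weight; unselected stocks get none
```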
The clustering-based model uses only linear constraints and therefore is a pure 0–1 linear
integer program. The quality of the tracking portfolio generated by the model is measured
ex-post, i.e. tracking error and metrics measuring closeness to the benchmark index portfolio
are computed after the tracking portfolio is generated. An alternative would be to explicitly
minimize tracking error as the objective of the tracking model, which has been a popular
approach in practice and in the literature [86]. However, this would introduce a non-linearity
in the objective, as the variance of the difference between the returns of the tracking and
benchmark portfolios would need to be minimized; in conjunction with the cardinality
constraint, this would result in a quadratic non-linear integer program, which is known to be
very challenging to solve [121, 22].
Chen and Kwon [34] have shown that model (4.1) – (4.5) can track the S&P 100 benchmark
portfolio well, where the number of securities in the benchmark is n = 100. Instances of
model (4.1) – (4.5) could be solved adequately with exact methods. However, several
important practical elements have not been considered. First, model (4.1) – (4.5) lacks
transaction costs. In practice, some tracking portfolio will most likely already exist, and it is
important that a new tracking portfolio not differ too much from the current one, as
substantial differences result in higher turnover and thus higher transaction costs. Model
(4.1) – (4.5) will therefore be extended with turnover constraints that limit transaction costs.
Further, tracking portfolios with small positions are also avoided by incorporating buy-in
thresholds in model (4.1) – (4.5).
Second, the tracking portfolio generated by model (4.1) – (4.5) may track a benchmark
well in terms of return, but the portfolio itself may be insufficiently diversified, as there are
no constraints that limit portfolio risk. This is an important issue when tracking market
indices such as the S&P 500, since any tracking portfolio should include securities across the
10 different sectors (Consumer Discretionary, Consumer Staples, Energy, Financials, Health
Care, Industrials, Information Technology, Materials, Telecommunications Services, and
Utilities) that comprise the index. Model (4.1) – (4.5) can be shown to produce tracking
portfolios with securities from only a few (e.g., 2 or 3) sectors, which would be problematic
for most portfolio managers concerned about risk and diversification. To this end, constraints
that ensure sector diversification are incorporated in model (4.1) – (4.5).
4.3.2 Model with buy-in threshold and turnover constraints
We now consider the addition of buy-in threshold and turnover constraints in model (4.1) –
(4.5). The resulting model is given in the following formulation:
max Σ_{i=1}^{n} Σ_{j=1}^{n} ρij xij    (4.6)
s.t. Σ_{j=1}^{n} xij = 1, ∀i = 1, ..., n    (4.7)
xij ≤ yj, ∀i = 1, ..., n, j = 1, ..., n    (4.8)
Σ_{j=1}^{n} yj = q    (4.9)
lj yj ≤ (Σ_{i=1}^{n} Vi xij) / (Σ_{i=1}^{n} Vi) ≤ uj yj, ∀j = 1, ..., n    (4.10)
wj = (Σ_{i=1}^{n} Vi xij) / (Σ_{i=1}^{n} Vi), ∀j = 1, ..., n    (4.11)
Σ_{j=1}^{n} |w0j − wj| α ≤ γ    (4.12)
xij, yj ∈ {0, 1}    (4.13)
Model (4.6) – (4.13) shares the same decision variables and parameters as model (4.1) –
(4.5) but has the following additional parameters: α is a proportional transaction cost, γ is
the limit on transaction cost, Vi denotes the market capitalization of stock i at the current
time, and w0j denotes the proportion of stock j in the current portfolio. In addition, model
(4.6) – (4.13) has the variable wj denoting the proportion of wealth invested in stock j for
j = 1, ..., n.
The buy-in threshold constraint (4.10) sets the weight of a selected stock to
(Σi Vi xij) / (Σi Vi), which is the standard market-capitalization-based weight of assets in
indices such as the S&P 500, and forces the weight to 0 if asset j is not selected. In the
transaction cost constraint, |w0j − wj| denotes the turnover of stock j from buying or selling,
and the cost of turnover of asset j is proportional to the amount of turnover, given by
|w0j − wj| α. The transaction constraint limits the total proportional turnover (transaction)
cost to γ. The absolute value terms in the transaction cost constraint can be removed by
introducing auxiliary variables zj, after which model (4.6) – (4.13) becomes equivalent to the
following model:
max Σ_{i=1}^{n} Σ_{j=1}^{n} ρij xij    (4.14)
s.t. Σ_{j=1}^{n} xij = 1, ∀i = 1, ..., n    (4.15)
xij ≤ yj, ∀i = 1, ..., n, j = 1, ..., n    (4.16)
Σ_{j=1}^{n} yj = q    (4.17)
lj yj ≤ (Σ_{i=1}^{n} Vi xij) / (Σ_{i=1}^{n} Vi) ≤ uj yj, ∀j = 1, ..., n    (4.18)
wj = (Σ_{i=1}^{n} Vi xij) / (Σ_{i=1}^{n} Vi), ∀j = 1, ..., n    (4.19)
Σ_{j=1}^{n} zj ≤ γ/α    (4.20)
zj ≥ w0j − wj, ∀j = 1, ..., n    (4.21)
zj ≥ −(w0j − wj), ∀j = 1, ..., n    (4.22)
zj ≥ 0, ∀j = 1, ..., n    (4.23)
xij, yj ∈ {0, 1}    (4.24)
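The equivalence rests on the standard absolute-value split: constraints (4.21) – (4.23) force zj ≥ |w0j − wj|, and since constraint (4.20) only pushes zj downward, zj = |w0j − wj| can be assumed at an optimum. A small numerical sketch with hypothetical current and new weights:

```python
# z_j >= w0_j - w_j and z_j >= -(w0_j - w_j): the smallest feasible z_j is the
# absolute turnover |w0_j - w_j|, computed here for three hypothetical stocks.
w0 = [0.40, 0.10, 0.00]                  # current portfolio weights
w  = [0.25, 0.15, 0.10]                  # candidate new weights
z  = [max(a - b, b - a, 0.0) for a, b in zip(w0, w)]
turnover = sum(z)                        # compare against gamma / alpha in (4.20)

print(z, turnover)  # per-stock and total turnover
```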
However, computational experiments in Section 4.5.3 show that optimal tracking portfolios
from model (4.1) – (4.5) and model (4.14) – (4.24) are often concentrated in a few sectors,
which may result in high portfolio variance or a lack of diversification. Therefore, constraints
that impose diversification in a natural way are considered in the next section.
4.3.3 Basic model with sector limits
For simplicity of exposition, we first consider diversification (sector limit) constraints for
model (4.1) – (4.5) and then consider the addition of these constraints to model (4.14) –
(4.24). The idea is to classify the assets in a tracking model according to the sector each asset
belongs to. For example, in the S&P 500 index the constituent assets are classified as
belonging to one of 10 sectors collectively representing the broad economy of the United
States. A sector represents a segment of the economy such as materials, consumer
discretionary, consumer staples, industrials, health care, telecommunication services,
financials, utilities, energy, or information technology.
In general, we assume that the benchmark index consists of a set K of sectors. Let xijk
equal 1 if stock j is the representative of stock i in sector k, and 0 otherwise, and let yjk equal
1 if stock j from sector k is selected for the tracking portfolio, and 0 otherwise. |K| is the
number of sectors, and nk denotes the number of assets (stocks) in sector k.
The idea of the sector-constrained model is to ensure that there is sufficient investment across all sectors by creating a sub-portfolio for each sector, where each sub-portfolio is chosen to maximize similarity with respect to its sector. Let ρ_{ijk} denote the similarity between assets i and j in sector k, and let Δ_k and ∇_k denote the lower and upper bounds on the cardinality of the sub-portfolio from sector k. q_k denotes the sub-portfolio size of sector k and q denotes the total portfolio size. Then, model (4.1) – (4.5) modified for sector constraints is as follows:
max ∑_{i=1}^{n} ∑_{j=1}^{n} ∑_{k=1}^{|K|} ρ_{ijk} x_{ijk}   (4.25)

s.t. ∑_{j=1}^{n} y_{jk} = q_k, ∀ k = 1,…,|K|   (4.26)

Δ_k ≤ q_k ≤ ∇_k, ∀ k = 1,…,|K|   (4.27)

∑_{k=1}^{|K|} q_k = q   (4.28)

∑_{j=1}^{n} x_{ijk} = 1, ∀ i = 1,…,n, ∀ k = 1,…,|K|   (4.29)

x_{ijk} ≤ y_{jk}, ∀ i = 1,…,n, j = 1,…,n, ∀ k = 1,…,|K|   (4.30)

y_{jk} = 0 if j ∉ sector k   (4.31)

x_{ijk}, y_{jk} ∈ {0, 1}   (4.32)
Model (4.25) – (4.32) can be reduced to the following model (4.33) – (4.39), since constraint (4.31) forces x_{ijk} = 0 when asset i does not belong to sector k.
max ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ∑_{k=1}^{|K|} ρ_{ijk} x_{ijk}   (4.33)

s.t. ∑_{j=1}^{n_k} y_{jk} = q_k, ∀ k = 1,…,|K|   (4.34)

Δ_k ≤ q_k ≤ ∇_k, ∀ k = 1,…,|K|   (4.35)

∑_{k=1}^{|K|} q_k = q   (4.36)

∑_{j=1}^{n_k} x_{ijk} = 1, ∀ i = 1,…,n_k, ∀ k = 1,…,|K|   (4.37)

x_{ijk} ≤ y_{jk}, ∀ i = 1,…,n_k, j = 1,…,n_k, ∀ k = 1,…,|K|   (4.38)

x_{ijk}, y_{jk} ∈ {0, 1}   (4.39)
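The sector cardinality constraints (4.34) – (4.36) couple the otherwise independent sector sub-problems only through the vector (q_1, …, q_|K|). The following minimal sketch, with hypothetical bounds, enumerates the feasible sector allocations; for each such vector the reduced model separates into |K| independent per-sector sub-problems.

```python
# Illustrative enumeration of feasible sector cardinalities (q_1,...,q_K)
# satisfying Delta_k <= q_k <= Nabla_k and sum_k q_k = q, as in (4.34)-(4.36).
# The bounds and the total size q are hypothetical.
def sector_allocations(lower, upper, q):
    """Yield all vectors (q_1,...,q_K) with lower[k] <= q_k <= upper[k], sum == q."""
    def rec(k, remaining, prefix):
        if k == len(lower):
            if remaining == 0:
                yield tuple(prefix)
            return
        for qk in range(lower[k], upper[k] + 1):
            if qk <= remaining:
                yield from rec(k + 1, remaining - qk, prefix + [qk])
    yield from rec(0, q, [])

allocs = list(sector_allocations(lower=[1, 1, 0], upper=[2, 2, 2], q=3))
print(allocs)
```

Each allocation fixes the right-hand sides of (4.34), after which the sectors decouple; the Lagrangian methods of Section 4.4 exploit exactly this separability without the explicit enumeration.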
4.3.4 The model with trading and sector diversification constraints
We now consider a comprehensive version of a cluster-based model for tracking, model (4.40)
– (4.49), that includes the buy-in thresholds, trading constraints, and the sector diversification
constraints as seen in model (4.14) – (4.24) and model (4.33) – (4.39).
max ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ∑_{k=1}^{|K|} ρ_{ijk} x_{ijk}   (4.40)

s.t. ∑_{j=1}^{n_k} y_{jk} = q_k, ∀ k = 1,…,|K|   (4.41)

Δ_k ≤ q_k ≤ ∇_k, ∀ k = 1,…,|K|   (4.42)

∑_{k=1}^{|K|} q_k = q   (4.43)

∑_{j=1}^{n_k} x_{ijk} = 1, ∀ i = 1,…,n_k, ∀ k = 1,…,|K|   (4.44)

x_{ijk} ≤ y_{jk}, ∀ i = 1,…,n_k, j = 1,…,n_k, ∀ k = 1,…,|K|   (4.45)

l_{jk} y_{jk} ≤ (∑_{i=1}^{n_k} V_{ik} x_{ijk}) / (∑_{i=1}^{n} V_i) ≤ u_{jk} y_{jk}, ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.46)

w_{jk} = (∑_{i=1}^{n_k} V_{ik} x_{ijk}) / (∑_{i=1}^{n} V_i), ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.47)

∑_{j=1}^{n_k} ∑_{k=1}^{|K|} |w⁰_{jk} − w_{jk}|^α ≤ γ   (4.48)

x_{ijk}, y_{jk} ∈ {0, 1}   (4.49)
The parameter w⁰_{jk} denotes the initial proportion of wealth invested in stock j (from sector k), which is needed when considering transaction costs (turnover) in the presence of sector constraints, and the decision variable w_{jk} denotes the proportion of wealth invested in stock j (from sector k). The absolute values that appear in the turnover constraint (4.48) can be removed by introducing auxiliary continuous variables z_{jk}, representing the turnover amount of asset j (from sector k), and p_k, representing the aggregate turnover of the assets in sector k, to get the following constraints for turnover:
w_{jk} = (∑_{i=1}^{n_k} V_{ik} x_{ijk}) / (∑_{i=1}^{n} V_i), ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.50)

∑_{j=1}^{n_k} z_{jk} = p_k, ∀ k = 1,…,|K|   (4.51)

∑_{k=1}^{|K|} p_k ≤ γ/α   (4.52)

z_{jk} ≥ w⁰_{jk} − w_{jk}, ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.53)

z_{jk} ≥ −(w⁰_{jk} − w_{jk}), ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.54)

z_{jk} ≥ 0, ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.55)
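A small sanity check of this linearization, with hypothetical weights: setting z_jk = |w⁰_jk − w_jk| satisfies (4.53) – (4.55), and the sector aggregates p_k from (4.51) sum to the total turnover.

```python
# Sanity check of the turnover linearization (4.50)-(4.55): setting
# z_jk = |w0_jk - w_jk| satisfies all three z-constraints, and the sector
# aggregates p_k sum to the total turnover. Weights are hypothetical.
w0 = {(0, 0): 0.30, (1, 0): 0.20, (0, 1): 0.25, (1, 1): 0.25}  # (j, k) -> weight
w  = {(0, 0): 0.35, (1, 0): 0.10, (0, 1): 0.25, (1, 1): 0.30}

z = {jk: abs(w0[jk] - w[jk]) for jk in w0}
# constraints (4.53)-(4.55)
assert all(z[jk] >= w0[jk] - w[jk] for jk in w0)
assert all(z[jk] >= -(w0[jk] - w[jk]) for jk in w0)
assert all(z[jk] >= 0 for jk in w0)

# sector aggregates p_k as in (4.51)
p = {k: sum(z[j, kk] for (j, kk) in z if kk == k) for k in (0, 1)}
total_turnover = sum(abs(w0[jk] - w[jk]) for jk in w0)
print(p, total_turnover)
```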
4.3.5 Tractability of the cluster-based models

The number of variables and constraints in model (4.14) – (4.24) and model (4.33) – (4.39) is larger than in the base model (4.1) – (4.5), and model (4.40) – (4.49) contains the largest number of constraints and variables of all the models considered. We solve instances of each of these models, including the base model, using the commercial solver Gurobi on a 1.58 GHz PC with 2 GB of RAM. Random instances of the tracking problem were generated in which q assets are selected from a benchmark portfolio of n assets, where n is chosen as 100, 300 and 500. We randomly generated multivariate normal return samples for each n using the mvnrnd function in MATLAB and calculated the associated correlation matrix ρ_{ij}. Computational results are presented in Table 4.1; each row of Table 4.1 corresponds to an instance with n assets. Moving across each row from left to right, we see that as more constraints are incorporated into model (4.1) – (4.5), the objective value decreases. Moving down each column, we see that instances with larger n have larger objective values for each type of model. Gurobi cannot solve model (4.33) – (4.39) or model (4.40) – (4.49) when n = 500. This motivates the development of algorithms for model (4.40) – (4.49) so that quality solutions for instances with n = 500 are possible. Important and popular market indices such as the S&P 500 contain 500 assets, so it is critical to have methods that can handle indices of this size.
Table 4.1: Model test by Gurobi (q = 10)

  n     (4.1)-(4.5)   (4.14)-(4.24)   (4.33)-(4.39)    (4.40)-(4.49)
 100      52.0822        46.9684         44.3890          40.0673
 300     125.9684       118.1528        119.8971         104.0347
 500     215.8263       209.2602      Out of memory    Out of memory
4.4 Lagrangian Relaxation Algorithms

Lagrangian relaxation (LR) for integer programming was initially discussed by Geoffrion [61], Geoffrion and McBride [62], Fisher [56] and Cornuejols et al. [41]. LR approximates a difficult problem with a computationally tractable relaxation whose solution is a tight bound on the original problem. Since the Lagrangian approximation can usually be decomposed into a series of sub-problems, LR is also called Lagrangian decomposition. LR-based methods have successfully solved many operations research problems such as multidimensional assignment problems [124], facility location problems [41, 90] and portfolio optimization problems [136]. Many researchers have attempted to reduce the integrality gap by modifying the LR procedure. Narciso et al. [116] presented LR with surrogate constraints; their numerical results indicated that using surrogates to update the multipliers can efficiently improve the convergence process and the local bound. Beltran et al. [11] proposed a Semi-Lagrangian Relaxation (SLR) method that can achieve an improved bound compared to LR; they also produced more accurate solutions for the p-median problem. In this chapter, we apply LR and partial SLR to the developed index tracking model, exploiting the special structure of the coefficient matrix of the constraints, and we observe in Section 4.5.1 that the partial SLR method can improve the solution process and accelerate convergence.
We present both Lagrangian relaxation and Semi-Lagrangian relaxation methods for problem (4.40) – (4.49). The rationale for a Lagrangian relaxation is that easy and hard constraints in the model can be identified, and the hard constraints are moved into the objective to obtain a problem (the Lagrangian dual) that is easier to solve and whose optimal solution represents the smallest upper bound on the optimal solution of the original problem (4.40) – (4.49). In particular, two constraints in problem (4.40) – (4.49), namely ∑_{k=1}^{|K|} q_k = q and ∑_{k=1}^{|K|} p_k ≤ γ/α, can be moved into the objective function using the Lagrange multipliers λ and µ, respectively. Then a relaxation (L) of the original problem is the following:
L(x, y, z, λ, µ) = max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ∑_{k=1}^{|K|} ρ_{ijk} x_{ijk} − λ (∑_{k=1}^{|K|} q_k − q) − µ (∑_{k=1}^{|K|} p_k − γ/α)

= max_{(x,y,z)} ∑_{k=1}^{|K|} [ ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ρ_{ijk} x_{ijk} − λ q_k − µ p_k ] + λq + µγ/α

= ∑_{k=1}^{|K|} [ max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ρ_{ijk} x_{ijk} − λ q_k − µ p_k ] + λq + µγ/α
L (x, y, z, λ, µ) can be decomposed across different sectors, and the associated kth sector
sub-problem becomes:
max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ρ_{ijk} x_{ijk} − λ q_k − µ p_k   (4.56)

s.t. ∑_{j=1}^{n_k} y_{jk} = q_k, ∀ k = 1,…,|K|   (4.57)

Δ_k ≤ q_k ≤ ∇_k, ∀ k = 1,…,|K|   (4.58)

∑_{j=1}^{n_k} x_{ijk} = 1, ∀ i = 1,…,n_k, ∀ k = 1,…,|K|   (4.59)

x_{ijk} ≤ y_{jk}, ∀ i = 1,…,n_k, j = 1,…,n_k, ∀ k = 1,…,|K|   (4.60)

l_{jk} y_{jk} ≤ (∑_{i=1}^{n_k} V_{ik} x_{ijk}) / (∑_{i=1}^{n} V_i) ≤ u_{jk} y_{jk}, ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.61)

w_{jk} = (∑_{i=1}^{n_k} V_{ik} x_{ijk}) / (∑_{i=1}^{n} V_i), ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.62)

∑_{j=1}^{n_k} z_{jk} = p_k, ∀ k = 1,…,|K|   (4.63)

z_{jk} ≥ w⁰_{jk} − w_{jk}, ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.64)

z_{jk} ≥ −(w⁰_{jk} − w_{jk}), ∀ j = 1,…,n_k, ∀ k = 1,…,|K|   (4.65)

x_{ijk}, y_{jk} ∈ {0, 1}, z_{jk} ≥ 0, ∀ i = 1,…,n_k, j = 1,…,n_k, ∀ k = 1,…,|K|   (4.66)
Model (4.56) – (4.66) is easier to solve than model (4.40) – (4.49) because, for fixed (λ, µ), we solve |K| instances of the standard model (4.14) – (4.24), each of much smaller size. The dual problem is min_{(λ, µ≥0)} L(x, y, z, λ, µ), whose optimal solution provides the lowest upper bound for problem (4.40) – (4.49). The Lagrangian dual is solved with a Golden Section Search method and, separately, a sub-gradient method, each combined with heuristics for feasibility. This forms the basis of the Lagrangian relaxation algorithm for solving problem (4.40) – (4.49). We summarize the Lagrangian relaxation algorithm as follows:
Lagrangian Relaxation Algorithm

Step 0: (Initialization)
    v ← 0, λ^(v) ← 1, µ^(v) ← 0

Step 1: (Dual Decomposition)
    For k ∈ K, solve the corresponding sector sub-problem L(x, y, z, λ^(v), µ^(v))_k
    UBD ← ∑_{k=1}^{|K|} L(x, y, z, λ^(v), µ^(v))_k + λ^(v) q + µ^(v) γ/α
    If (x^(v), y^(v), z^(v)) is feasible to model (4.40) - (4.49), LBD ← UBD, STOP.
    Else find a feasible solution (and associated LBD)
        by Heuristic I (v = 0) or II (v > 0); gap^(v) = (UBD − LBD)/|LBD|
Step 2: (Lagrangian Multiplier Update)
    Build the Lagrangian dual problem min_{µ≥0} L(x^(v), y^(v), z^(v), λ, µ)
    Update the step size t^(v) by the Golden Section Search (GSS) and Bi-section methods respectively.
    λ^(v+1) = λ^(v) + t^(v) (∑_{k=1}^{|K|} q_k − q)
    µ^(v+1) = max(0, µ^(v) + t^(v) (∑_{k=1}^{|K|} p_k − γ/α))
    Solve (L) with the new multipliers (λ^(v+1), µ^(v+1))

Step 3: (Move to next iteration)
    If gap^(v) > ε, v < V
        v = v + 1. GO TO Step 1.
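The structure of this loop can be sketched on a toy problem. The following minimal example, with hypothetical data and a fixed step-size rule, relaxes a single cardinality constraint ∑_j y_j = q in the same spirit as the λ update above: the relaxed sub-problem separates by asset, the multiplier moves along the sub-gradient, and the best dual value gives an upper bound.

```python
# Minimal sketch of a Lagrangian relaxation loop, on a toy problem:
# maximize sum_j c_j y_j subject to sum_j y_j = q, y binary. Relaxing the
# cardinality constraint with multiplier lam gives the separable sub-problem
# y_j = 1 iff c_j - lam > 0; lam is updated along the sub-gradient sum(y) - q.
# The data and the fixed step size are hypothetical.
def lagrangian_relaxation(c, q, lam=1.0, step=0.5, iters=100):
    best_ub = float("inf")
    for _ in range(iters):
        y = [1 if cj - lam > 0 else 0 for cj in c]            # solve relaxation
        ub = sum(cj - lam for cj in c if cj - lam > 0) + lam * q
        best_ub = min(best_ub, ub)                            # dual (upper) bound
        g = sum(y) - q                                        # sub-gradient
        if g == 0:
            break
        lam += step * g                                       # multiplier update
    # feasible lower bound: keep the q largest coefficients
    lb = sum(sorted(c, reverse=True)[:q])
    return best_ub, lb

ub, lb = lagrangian_relaxation([5.0, 4.0, 3.0, 1.0], q=2)
print(ub, lb)
```

On this instance the dual bound meets the feasible bound, so the relaxation closes the gap; in general a positive gap can remain, which is what Heuristics I and II below address.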
Here are some remarks on the implementation of the LR algorithm:

(1) In Step 1, Heuristic I is applied to obtain an initial solution that triggers the iterations. A vector q_k is then returned by solving (L) at each iteration; if the solution is infeasible, a more sophisticated heuristic (Heuristic II) is applied to satisfy the global constraints, i.e. ∑_{k=1}^{|K|} q_k = q and ∑_{k=1}^{|K|} p_k ≤ γ/α, and the associated lower bound is updated. Let m(k) denote the size of sector k, let Q = {q_k, k = 1,…,|K|} be a vector satisfying the cardinality constraint in model (4.40) – (4.49), let Q′ = {q′_k, k = 1,…,|K|} be another vector that also satisfies the cardinality constraint in model (4.40) – (4.49) but differs from Q, and let Q^LR = {q^LR_k, k = 1,…,|K|} be a vector satisfying the cardinality constraint in model (L). Let I, I′ and I^LR be the associated index sets of Q, Q′ and Q^LR, respectively. We first describe Heuristic I as follows:
Heuristic 1: Heuristic I for initial lower bound

(0) Sort the market capitalizations of the assets in descending order and put them in a vector V.
    Choose the first q assets in V that satisfy the sector cardinality bounds and weight bounds;
    obtain a Q vector.

(1) Divide the index set of Q into 3 groups:
    I1 = {h | Q_h = 0}; sort I1 in descending order according to {m(h) | h ∈ I1}
    I2 = {i | Q_i ≠ 0, i ∈ {index set of the first l largest Q_i}}
    I3 = {j | Q_j ≠ 0, j ∈ I \ I1 \ I2}; sort I3 according to {(Q_j, m(j)) | j ∈ I3}
    Switch a portion of the indices between I1, I2 and I3;
    generate N neighborhood points Q′ around Q.

(2) Solve (L) without the constraint ∑_{k=1}^{|K|} p_k ≤ γ/α under Q′.

(3) Test the transaction cost constraint (TC);
    choose a solution Q′ better than Q that satisfies (TC) if one exists, STOP; else GO TO (1).
Step (0) in Heuristic I guarantees that the starting solution satisfies the transaction cost constraint by emphasizing the selection of assets with larger market capitalization. For example, suppose V = (10000, 100, 10)^T with associated w⁰_j = (0.9891, 0.0099, 0.0010)^T. If the first asset is not selected into the tracking portfolio, the turnover weight is 98.91%, which is much larger than the maximal turnover weights of the second and third assets, so the turnover constraint is easily violated. We then generate the neighborhood of points around Q in Step (1) by choosing pairs of sectors as indexed by the subsets I1, I2 and I3 and swapping pair-wise.

The philosophy behind the swap rules is to generate only a small neighborhood of points such that the swaps attempt to distribute the assets over more sectors so that the objective value improves. In Step (1), we sort I3 in increasing order according to {Q_j | j ∈ I3}. If the elements in {Q_j | j ∈ I3} are equal, we then sort I3 in descending order according to {m(j) | j ∈ I3}. We always select sectors at the front positions of the index sets I1, I2 and I3, and switch 2 assets between pairs of these three groups in Step (1). If no improvement occurs at the current iteration, sectors at different positions in the index sets are selected in the next iteration. For example, parallel swapping steps include: ① move 2 assets from the a-th sector in I2 to the b-th sector in I3 to obtain a Q′; ② or move 2 assets from the a-th sector in I2 to the b-th sector in I1 to obtain a Q′; ③ or take 2 assets from the a-th sector in I2, add 1 asset to the b-th sector in I3 and 1 asset to the c-th sector in I2, to obtain a Q′; ④ or take 1 asset from the a-th sector in I2 and 1 asset from the b-th sector in I3 and add both to the c-th sector in I1, to obtain a Q′. The indices a, b and c are generally set to small values, since switching at other indices may be inefficient in improving the objective; e.g., a, b and c are set to no more than 2 in our computation. We leave a detailed numerical example that illustrates Heuristic I to Section A.1 of Appendix A for interested readers.
Heuristic 2: Heuristic II for updated lower bound

(1) Adjust a given Q^LR vector as follows:
    Pick {k | min{Q^LR_k}}; if Q^LR_k ≤ m(k), set q′_k = q^LR_k, else q′_k = m(k).
    Repeat the above steps until q − ∑_{k=1}^{|K|} q′_k = 0; if q − ∑_{k=1}^{|K|} q′_k > 0, add the
    difference to the sector with the maximal number of assets. Solve (L) with the Q′ vector.

(2) Test the transaction cost constraint (TC);
    if the solution satisfies TC, STOP; else GO TO Step (3).

(3) Within each sector k, do:
    I1 = {w⁰_{jk} | j ∈ {Q′_k}}; sort I1 in increasing order.
    I2 = {w⁰_{jk} | j ∈ {m(k)} \ {Q′_k}}; sort I2 in decreasing order.
    Switch the first 4 assets between I1 and I2. Solve (L) with the new Q′ vectors.
    Test TC; if TC is satisfied, STOP, else GO TO Step (4).

(4) Pick two sectors with a large (k1) and a small (k2) number of assets in Q′, and do:
    I1 = {w⁰_{jk} | j ∈ {m(k1)}}; sort I1 in increasing order.
    I2 = {w⁰_{jk} | j ∈ {m(k2)}}; sort I2 in decreasing order.
    Swap the first 4 assets between I1 and I2. Obtain new Q^LR vectors, GO TO (1).
If TC cannot be satisfied in Step (3) of Heuristic II, we adjust the portfolio by capital weights within the same sector, and then adjust the portfolio between sectors in Step (4) if necessary. We select the sectors with large and small numbers of stocks so as not to lose too much objective value. As in Heuristic I, we always go back to the assets with larger capital weights to adjust the constructed portfolio; this is a trade-off between the cardinality and transaction cost constraints. How should assets be exchanged between sectors in Step (4)? One approach is Variable Neighborhood Search (VNS) [75]. The steps we implemented are: (1) shaking: randomly perturb some assets between max(Q^LR) and min(Q^LR) from the current solution; (2) local search: search the selected neighborhood region, i.e. the new Q′ vectors; (3) move or not: move if an improved solution is obtained. Our computational observation is that in most instances Steps (3) and (4) are needed to achieve a feasible solution for the transaction cost constraints, which indicates that the cardinality and TC constraints are computationally challenging to satisfy, as they pull in opposing directions. We also leave a numerical example that illustrates Heuristic II to Section A.2 of Appendix A.
(2) In Step 2 of the LR algorithm, the step size t^(v) is updated by the Golden Section Search (GSS) and Bi-section methods, respectively.
Algorithm 3: Golden Section Search (GSS) for the step size in Step 2 of the LR algorithm

Set scalars A and B, A ≤ B
① t^(v) ∈ {t1^(v), t2^(v)}, where t1^(v) = A + 0.382(B − A) and t2^(v) = A + 0.618(B − A)
② λ^(v+1) = λ^(v) + t^(v) (∑_{k=1}^{|K|} q_k − q)
③ µ^(v+1) = max(0, µ^(v) + t^(v) (∑_{k=1}^{|K|} p_k − γ/α))
④ Solve (L) with the new multipliers (λ^(v+1), µ^(v+1))
while (B − A) ≥ ε:
    B = t2^(v) if L_{t1^(v)} < L_{t2^(v)}, or A = t1^(v) if L_{t1^(v)} ≥ L_{t2^(v)}
    repeat ① – ④
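A compact implementation of the search itself, independent of the dual updates, may help. The sketch below minimizes a toy quadratic over a hypothetical bracket [A, B] = [0, 8] using the same 0.382/0.618 interval-reduction rule as Algorithm 3.

```python
# Minimal golden-section search, as used above to pick the step size that
# optimizes the one-dimensional dual function. Here it minimizes a toy
# quadratic; the bracket and tolerance are hypothetical.
def golden_section_search(f, a, b, eps=1e-6):
    """Minimize a unimodal f on [a, b] to within eps; return the midpoint."""
    while (b - a) >= eps:
        t1 = a + 0.382 * (b - a)
        t2 = a + 0.618 * (b - a)
        if f(t1) < f(t2):
            b = t2   # minimum lies in [a, t2]
        else:
            a = t1   # minimum lies in [t1, b]
    return 0.5 * (a + b)

x = golden_section_search(lambda t: (t - 2.0) ** 2, 0.0, 8.0)
print(x)
```

Each iteration shrinks the bracket by the factor 0.618, which is the linear convergence rate discussed next.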
GSS has been proved to achieve a linear convergence rate with ratio τ = (√5 − 1)/2 ≈ 0.618 for one-dimensional search problems [9]; this feature initially attracted us to apply it for dual updating in our algorithm. However, GSS slows down the whole LR algorithm since it tries to obtain the best dual objective at each iteration. On the other hand, the bi-section method for one-dimensional search within the sub-gradient method has been widely used in LR algorithms [61, 56]. The details of the bi-section method are presented below.
Algorithm 4: Bi-section search for the step size in Step 2 of the LR algorithm

Set initial σ
① λ^(v+1) = λ^(v) + σ t^(v) (∑_{k=1}^{|K|} q_k − q)
② µ^(v+1) = max(0, µ^(v) + σ t^(v) (∑_{k=1}^{|K|} p_k − γ/α)),
   where t^(v) = (UBD − LBD) / ‖(∑_{k=1}^{|K|} q_k − q; ∑_{k=1}^{|K|} p_k − γ/α)‖
③ Solve (L) with the new multipliers (λ^(v+1), µ^(v+1))
while L(x, y, z, λ^(v+1), µ^(v+1)) ≥ UBD and σ ≥ ε:
    σ = 0.5σ, repeat ① – ③
How do we determine the step size t^v? To illustrate, simplify the Lagrangian to max_ω min_x LR(x^v, ω^v) = c^T x^v + (ω^v)^T (Bx^v − b). We know that d^v = Bx^v − b is the gradient of the Lagrangian function at x^v. Suppose ω^{v+1} = ω^v + t^v d^v; then

LR(x^v, ω^{v+1}) = c^T x^v + (ω^{v+1})^T (Bx^v − b)
    = c^T x^v + (ω^v)^T (Bx^v − b) + t^v (d^v)^T (Bx^v − b)
    = c^T x^v + (ω^v)^T (Bx^v − b) + t^v (Bx^v − b)^T (Bx^v − b)
    = LR(x^v, ω^v) + t^v ‖Bx^v − b‖²

⟹ t^v = (LR(x^v, ω^{v+1}) − LR(x^v, ω^v)) / ‖Bx^v − b‖² = (BestUB − CurrentLB) / ‖Bx^v − b‖²
In the bi-section method we initialize t^v = σ(BestUB − CurrentLB)/‖Bx^v − b‖² where σ > 1. If the objective LR(x^v, ω^{v+1}) ≤ LR(x^v, ω^v), the step size is halved at each iteration; the main advantage is that we can quickly update the dual variables. The step size after k iterations is t^v/2^k, so dual variable updating may require exactly ⌈log₂(t^v/ε)⌉ iterations in the worst case. We present a numerical comparison of the LR method with GSS and with Bi-section search in Table 4.2:
Table 4.2: Time comparison for updating the dual in the LR method

          LR method with Bi-section search            LR method with Golden Section Search
   q    Feas. LB    LR UB     Gap     Time (s)    Feas. LB    LR UB     Gap     Time (s)   Tgss/Tbi
  10    199.9710   199.9710   0.00%    187.90     199.0133   200.0027   0.49%   1041.91     5.5449
  50    256.0996   258.8329   1.06%    402.66     256.0693   258.8563   1.08%   1360.95     3.3799
 100    287.6047   308.8329   6.87%    415.94     287.5744   308.8563   6.89%   1135.85     2.7308
 150    315.4958   358.8329  12.08%    389.36     315.4301   358.8563  12.10%   2097.89     5.3881
 200    338.0883   408.8329  17.30%    334.02     338.2455   408.8563  17.27%   1228.03     3.6766
Aver.      /          /       7.46%    345.98        /          /       7.57%   1372.93     4.1441
In our computation we set initial A = 0, B = 8 for GSS and σ = 20 for the Bi-section search. All other parameters of the LR method are kept the same. From Table 4.2, we see that the lower and upper bounds for all instances are close to each other. However, the time spent searching for the step size by Bi-section search is much less than that by GSS; the average search time of GSS is about 4 times that of Bi-section search. The reason is that the Bi-section search does not require the steps to generate the best dual objective value; it terminates as soon as a better dual objective is found and starts a new outer loop of the LR method, which speeds up the whole algorithm. Therefore, we mainly use Bi-section search for dual variable updating in our computation.
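The worst-case halving count ⌈log₂(t^v/ε)⌉ quoted above for the Bi-section step-size rule can be checked numerically; the starting step and tolerance below are hypothetical.

```python
import math

# Illustrative check of the worst-case count of step-size halvings: starting
# from step t and halving until the step drops below eps takes
# ceil(log2(t / eps)) halvings. The values of t0 and eps are hypothetical.
def halvings_until(t, eps):
    count = 0
    while t >= eps:
        t *= 0.5
        count += 1
    return count

t0, eps = 8.0, 1e-3
print(halvings_until(t0, eps), math.ceil(math.log2(t0 / eps)))
```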
It is easy to show that solving model (4.40) - (4.49) by the LR algorithm takes a running time of approximately V · K · T_sub ≈ O(n²), where T_sub is the average time for solving a sub-problem, K is the number of sectors, and V is the number of iterations. Since T_sub depends on the capacity of the solver, T_sub is usually constant on average. If V is fixed, then as K increases the problem can still be solved within a predictable time.
Note that in Table 4.2 there still exist large gaps between the bounds for q = 150, 200; we would like a tighter LR upper bound so that the gap can be shrunk. One possible extension of the LR algorithm is Semi-Lagrangian Relaxation (SLR), an LR approach with a more restricted feasible region and therefore a tighter bound. Due to the decomposition requirement in the main algorithm structure, the global constraints cannot be returned to the constraint set. However, other types of constraints can be relaxed and then returned to the constraint set, partially satisfying the SLR framework. This procedure is called partial SLR [11] and is suitable for our problem. In particular, after relaxing the assignment constraint, we put the assignment constraint back and formulate the partial SLR as follows:
L(x, y, z, λ, µ, θ)
= max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} ∑_{k=1}^{|K|} ρ_{ijk} x_{ijk} − λ (∑_{k=1}^{|K|} q_k − q) − µ (∑_{k=1}^{|K|} p_k − γ/α) − ∑_{i=1}^{n_k} ∑_{k=1}^{|K|} θ_{ik} (∑_{j=1}^{n_k} x_{ijk} − 1)

= ∑_{k=1}^{|K|} [ max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} (ρ_{ijk} − θ_{ik}) x_{ijk} − λ q_k − µ p_k ] + λq + µγ/α + ∑_{i=1}^{n_k} ∑_{k=1}^{|K|} θ_{ik}

= ∑_{k=1}^{|K|} [ max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} P_{ijk} x_{ijk} − λ q_k − µ p_k ] + λq + µγ/α + ∑_{i=1}^{n_k} ∑_{k=1}^{|K|} θ_{ik}

where P_{ijk} = ρ_{ijk} − θ_{ik}.
Then the kth SLR sub-problem can be formulated as follows:

max_{(x,y,z)} ∑_{i=1}^{n_k} ∑_{j=1}^{n_k} P_{ijk} x_{ijk} − λ q_k − µ p_k   (4.67)

s.t. (4.57) − (4.58), (4.60) − (4.66)

∑_{j=1}^{n_k} x_{ijk} ≤ 1, ∀ i = 1,…,n_k, ∀ k = 1,…,|K|   (relaxed assignment) (4.68)

and the dual problem becomes max_{(λ, µ>0, θ>0)} L(x, y, z, λ, µ, θ). The LR framework can then be applied to the partial SLR construct. We present the Semi-Lagrangian-based algorithm as follows:
Algorithm 5: Semi-Lagrangian Relaxation Algorithm

Step 0: (Initialization)
    v ← 0, λ^(v) ← 1, µ^(v) ← 0
    θ^(v)_{ik} ← 0, ∀ i ∈ n_k, k ∈ K

Step 1: (Dual Decomposition)
    For k ∈ K, do P_{ijk} ← ρ_{ijk} − θ^(v)_{ik}, ∀ i ∈ n_k, j ∈ n_k
    Solve the corresponding sector sub-problem L(x, y, z, λ^(v), µ^(v), θ^(v)_{ik})_k
    UBD ← ∑_{k=1}^{|K|} L(x, y, z, λ^(v), µ^(v), θ^(v)_{ik})_k + λ^(v) q + µ^(v) γ/α + ∑_{i=1}^{n_k} ∑_{k=1}^{|K|} θ^(v)_{ik}
    If (x^(v), y^(v), z^(v)) is feasible to model (4.40) - (4.49), LBD ← UBD, STOP
    Else find a feasible solution (and associated LBD)
        by Heuristic I (v = 0) or II (v > 0); gap^(v) = (UBD − LBD)/|LBD|

Step 2: (Lagrangian Multiplier Update)
    Build the Lagrangian dual problem min_{µ≥0} L(x^(v), y^(v), z^(v), λ, µ, θ)
    Update the step size t^(v) by the Bi-section method.
    λ^(v+1) = λ^(v) + t^(v) (∑_{k=1}^{|K|} q_k − q)
    µ^(v+1) = max(0, µ^(v) + t^(v) (∑_{k=1}^{|K|} p_k − γ/α))
    θ^(v+1)_{ik} = max(0, θ^(v)_{ik} + t^(v) (∑_{j=1}^{n_k} x_{ijk} − 1))
    Solve (partial SLR) with the new multipliers (λ^(v+1), µ^(v+1), θ^(v+1)_{ik})

Step 3: (Move to next iteration)
    If gap^(v) > ε, v < V
        v = v + 1. GO TO Step 1.
The feasible lower bound is generated by the same heuristics as in the LR algorithm. In Step 2, the sub-gradient method with Bi-section search [61, 56] is applied to calculate the dual variables (λ^(v+1), µ^(v+1), θ^(v+1)_{ik}) for the SLR algorithm. The computation terminates if an optimal solution is obtained in Step 1 or if the gap tolerance or the iteration limit is reached in Step 3. As mentioned in [11], partial SLR cannot guarantee a tighter bound; however, it returned a better bound than LR in some instances in our computation. We compare the results from LR and SLR in the next section.
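The coefficient adjustment P_ijk = ρ_ijk − θ_ik and the relaxed assignment ∑_j x_ijk ≤ 1 are the essential differences from plain LR. The following minimal sketch, with hypothetical data and the selection y taken as given, shows how the relaxed assignment lets an asset remain unassigned whenever all of its adjusted profits are negative.

```python
# Sketch of the partial-SLR sub-problem objective adjustment: profits are
# shifted to P_ij = rho_ij - theta_i, and the relaxed assignment
# sum_j x_ij <= 1 lets an asset stay unassigned when every adjusted profit
# is non-positive. Data are hypothetical; the selection y is taken as given.
def relaxed_assignment(rho, theta, selected):
    """For each asset i, assign to the best selected j if P_ij > 0, else skip."""
    assign, value = {}, 0.0
    for i, row in enumerate(rho):
        p_best, j_best = 0.0, None
        for j in selected:
            p = row[j] - theta[i]           # adjusted profit P_ij
            if p > p_best:
                p_best, j_best = p, j
        if j_best is not None:              # sum_j x_ij <= 1 allows skipping
            assign[i] = j_best
            value += p_best
    return assign, value

rho = [[1.0, 0.4], [0.4, 1.0], [0.6, 0.5]]
assign, value = relaxed_assignment(rho, theta=[0.2, 0.2, 0.7], selected=[0, 1])
print(assign, value)
```

Here asset 2 is left unassigned because its adjusted profits are negative, which is exactly the slack that the relaxed constraint (4.68) introduces.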
4.5 Computational Results: Tracking the S&P500
In this section we give the computational results from using the LR and SLR methods to solve
model (4.40) – (4.49). The S&P 500 index is used as the target benchmark.
4.5.1 Parameter Estimation
To generate the correlation matrix ρ_{ij} for the S&P 500, we collected the historical price information of all components of the S&P 500 and calculated the daily returns as r_{it} = (P_{it} − P_{i,t−1}) / P_{i,t−1}, where P_{it} and P_{i,t−1} are the adjusted closing prices at times t and t − 1. The daily returns were then used to calculate the mean returns of the assets and the covariance matrix between different assets:

µ_i = (1/T) ∑_{t=1}^{T} r_{it},    cov_{ij} = (1/T) ∑_{t=1}^{T} (r_{it} − µ_i)(r_{jt} − µ_j)
Here we use one year of daily returns (T = 252) to generate the correlation matrix, i.e. ρ_{ij} = cov_{ij} / √(cov_{ii} · cov_{jj}), for all models, and we calculate the correlation matrices using data from 4 time intervals: [2006, 2007], [2007, 2008], [2008, 2009] and [2010, 2011]. Some stocks in the S&P 500 index may be replaced by stocks from outside the index if they no longer satisfy the selection criteria of the S&P 500 during the designed time period; we retrieved the stocks that were moved out during the designed intervals, along with the associated price information. For example, ABK was replaced by LO on June 10, 2008, so we used the price information of ABK rather than the data of LO for the period before 2008. Such replacement is usually rare, and the components of the S&P 500 are stable.
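The estimation pipeline above can be sketched as follows; the price series are hypothetical, and production code would use MATLAB or a numerical library rather than this plain-Python version.

```python
# Sketch of the parameter estimation above: daily returns from adjusted
# closing prices, then mean, covariance and correlation. Prices are
# hypothetical two-asset series.
import math

def daily_returns(prices):
    return [(prices[t] - prices[t - 1]) / prices[t - 1] for t in range(1, len(prices))]

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

p1 = [100.0, 101.0, 103.0, 102.0, 104.0]
p2 = [50.0, 50.4, 51.5, 51.0, 52.1]
r1, r2 = daily_returns(p1), daily_returns(p2)
print(round(corr(r1, r2), 4), corr(r1, r1))
```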
According to the Global Industry Classification Standard (GICS) sector criterion [3], the components of the S&P 500 index are drawn from 10 main sectors of the US market; in this research, sectors 1 – 10 represent Consumer Discretionary, Consumer Staples, Energy, Financials, Health Care, Industrials, Information Technology, Materials, Telecommunication Services, and Utilities. The sector size vector was m(k) = [82 41 41 81 51 62 70 29 8 35]^T at the time of this research. We adjusted the number of stocks in each sector for the designed intervals when necessary and computed the associated correlation matrix for the models that include the sector limit constraint. The tickers across sectors in the S&P 500 are displayed in Table A.1 of Appendix A.
We normalized the market value of each component to calculate the component weight, and used these weights as the previous proportions, i.e. w⁰_j, in the transaction cost constraint of model (4.14) - (4.24) and model (4.40) - (4.49). All necessary data were obtained from the Financial Research and Trading Lab at the University of Toronto. All models were computed with Gurobi 4.5.1 through the MATLAB interface Gurobi Mex [150]. We set the initial (λ⁰, µ⁰) = (1, 0) for LR and (λ⁰, µ⁰, θ⁰) = (1, 0, 0) for SLR, and Table 4.3 gives the parameter settings used when implementing the algorithms.
Table 4.3: Parameter Setting

  α       0.001
  γ       0.05
  Δ_k     0
  ∇_k     maximum stock number of sector k
  l_jk    0.001
  u_jk    1
  w⁰_j    normalized market capitalization of the S&P 500 components
4.5.2 LR versus SLR
We computed solutions for model (4.40) - (4.49) over portfolio sizes ranging from 10 to 350 in increments of 10 assets. The upper bound (UB) decreased and the lower bound (LB) increased iteratively in the LR algorithm, and ideally a globally optimal solution is achieved when the UB equals the LB. Although the LR method cannot guarantee the optimal solution for every instance, it returns a minimal UB when the computation is terminated, and the bound associated with the heuristic can be used to approximate the optimal solution.
Figure 4.1 depicts the computational comparison between LR and partial SLR, where the maximal gaps between the lower and upper bounds were 2.37% and 4.59% respectively. Most of the gaps were under 0.5%, especially in the practical interval [50, 200] (see Table A.2 in Appendix A). In some cases SLR returned a better bound and a smaller gap than LR (see q = 20, 80), and in other cases SLR was worse than LR (see q = 250). However, the running time of SLR (1.83 hrs on average) was generally smaller than that of LR (2.87 hrs on average). We mainly used the partial SLR algorithm to approximate the optimal solution in the next section, since it returned better solutions relative to LR in the practical region q ∈ [10, 200].
Figure 4.1: Gap Comparison between LR and SLR
4.5.3 Comparison between 4 models
Differences in portfolio efficiency and allocation
In the rest of the chapter we denote model (4.1) - (4.5) as model (1), model (4.14) - (4.24) as model (2), model (4.33) - (4.39) as model (3), and model (4.40) - (4.49) as model (4). Four different portfolios were constructed by models (1) - (4). We illustrate the models with portfolio sizes q equal to 10, 30 and 100, which represent low, medium and high density respectively. Figure A.1 in Appendix A shows the details of the sector differences between the 4 portfolios; the S&P 500 sectors are numbered 1 - 10 as in Section 4.5.1.
We varied the portfolio size q and compared the portfolios obtained from the different models. Interesting results include: (1) A tendency toward sector diversification. The computational results for all period intervals demonstrate that without the sector limit constraint, the portfolio allocation is concentrated in fewer sectors (see q = 10, 30). This sector diversification explains why the portfolio with sector limits has a lower variance in the next section. (2) The model with sector limits has constant sector weights with respect to changes in the size q. Although Bertsimas and Shioda [22] pointed out that the investment in different sectors must be limited, they did not explore how to decide the best sector investment fractions in their model. In this chapter, our numerical results show that the optimal sector weights were consistent with the sector weights of the target index.
Figure 4.2 shows the norm of the difference in sector weights between the tracking portfolios and the target S&P 500. "TC" in all figures represents the transaction cost and turnover constraints in model (2), and "sector" in all figures refers to the sector limit constraints in model (3). It is clear that when the sector limit constraint is considered, i.e. in models (3) and (4), the sector weights of the constructed portfolios are closer to those of the S&P 500 than in the models without the sector limit constraint, i.e. models (1) and (2). Figure 4.2 is based on all computational results for q from 10 to 100. Because of space limitations, we list the numerical results on sector weights (q = 10, 30, 100) in Figure A.1 in Appendix A.
Figure 4.2: Norm of sector differences between constructed portfolio and S&P500
Figure 4.3 illustrates the sector diversification process. For a small portfolio size (q = 10), the stocks are distributed over only 5 sectors when the sector limit constraint is not incorporated (models (1) and (2)), while the stocks are distributed over all 10 sectors when the sector limit is considered (models (3) and (4)). The same situation exists when the portfolio size increases to q = 30, 100. One major advantage of the sector limit constraint is that diversification across sectors can reduce the portfolio risk. The sector limit constraint makes the investment allocation even across the 10 sectors; i.e., the maximal sector fraction without the sector limit is consistently larger than the fractions with the sector limit. For instance, when only the transaction cost and buy-in threshold constraints are incorporated, investors would allocate 59.27% of their budget under model (1), and 47.23% under model (2), to the Financials sector, the largest sector weight. In contrast, the maximum sector weight is only 17.47% (for Information Technology) under model (3) and 18.39% under model (4). More comparison results relating to the diversification process are discussed in the next section.
Figure 4.3: Sector diversification
Comparison of Performance Metrics
In this section we compare the performance of the portfolios constructed by models (1)-(4). The
performance metrics are the optimal objective value, portfolio return, portfolio variance, portfolio
Sharpe ratio, and tracking ratio. Intuitively, the objective value is the first point of comparison
between models, since it measures the similarity between the constructed portfolio and the
tracked index. The portfolio return captures an important aspect of the generated tracking
portfolios, and the portfolio variance is a standard risk measure for the constructed portfolios.
The Sharpe ratio [135], or information ratio, which measures the risk/return efficiency of the
excess return, is the third point of comparison, as it describes the trade-off between the excess
return over the market and the associated portfolio risk. Finally, the tracking ratio is used
to compare the tracking quality of the portfolios over different out-of-sample periods under
different restrictions. Figures 4.4 to 4.9 show the numerical results for portfolio sizes q from 10
to 100 in steps of 10 over the different time periods.
As shown in Figure 4.4, the optimal objective value increases with the portfolio size q. Model (1)
gives the largest objective value and model (4) the smallest, which is reasonable since model (4)
includes all types of constraint. The objective value of a model with the sector limit is smaller
than that of its counterpart without it: for any given q, the value of model (1) is larger than
that of model (3), and the value of model (2) is larger than that of model (4). This is expected,
as stricter constraints are added to the underlying model. Comparing models (2) and (3), we
see that the sector limit constraint affects the objective value more significantly than the
transaction cost and buy-in threshold constraints, i.e. the value of model (3) decreases faster
than that of model (2). One explanation is that the sector limit constraint is a global restriction
that dominates local constraints such as transaction costs. When the local constraints are
incorporated, the objective value changes gradually (see the lines for models (1) and (2)); in
contrast, the objective value can change dramatically under the impact of the global constraint
(see the lines for models (1) and (3)).
Figure 4.4: Comparison of Performance – optimal objective value
Figure 4.5 presents the trend of the portfolio return under the different models as the portfolio
size changes. The straight line in each plot of Figure 4.5 indicates the yearly return of the
market index, the S&P 500. The main goal of a tracking portfolio is to match the return of the
market index, and as Figure 4.5 shows, the portfolio returns under the different models move
closer to the target return as the portfolio size grows. For example, in 2008 the returns with
q = 10 deviate further from the straight line than the returns with q = 100. The reason is that
when more stocks may be held, the portfolio comes closer to full replication. The portfolio
returns under the sector limit constraint (the lines for models (3) and (4)) are close to each
other; likewise, the portfolio returns without the sector limit constraint (the lines for models
(1) and (2)) approach each other. An interesting observation is that the path of model (2)
matches the path of model (1) in every sub-figure, while the lines of model (3) do not follow
those of model (1). As mentioned, local constraints such as transaction costs affect the solution
structure only slowly as the portfolio size changes, so the path of model (2) stays close to that
of the underlying model (1). On the other hand, a global constraint such as the sector limit
can lead to a completely different solution, which produces portfolio returns that differ from
those of model (1). Overall, the portfolio returns change with the different restrictions but
remain close to the return line of the target index.
Figure 4.5: Comparison of Performance – portfolio return
The trend of the portfolio variance under the different models with respect to the portfolio
size is plotted in Figure 4.6. The straight line indicates the yearly variance of the market index,
the S&P 500. The smaller the variance, the better the portfolio performs. Models (1) and (4)
produce the upper and lower bounds of the variance. As the data in Figure 4.6 show, the
variance under model (1) is 3 to 7 times higher than that under model (3). It is apparent from
the figure that the portfolio variance with the sector limit constraint (the lines for models (3)
and (4)) is lower than the variance without it (the lines for models (1) and (2)). The reason is
that the sector diversification process distributes the limited number of stocks across different,
largely independent sectors and hedges against the potential risk.
Figure 4.6: Comparison of Performance – portfolio variance
The portfolios from model (2) tend to perform worse than those from model (3) in terms of
portfolio risk. This indicates that, compared with limiting the total transaction cost, sector
diversification is a more efficient strategy for controlling the risk of a tracking portfolio.
Accordingly, the portfolio variance decreases when the sector limit constraint is incorporated
into model (2), i.e. the line of model (2) moves down to the line of model (4) in 2007 and 2008.
Interestingly, the portfolio variance increases when the transaction cost and buy-in threshold
constraints are added to model (3). The reason is that the solution structure of each sector
sub-problem deteriorates as the local constraints are incorporated, which results in higher
portfolio variance, i.e. the line of model (3) moves up to the line of model (4) in every sub-figure.
The Sharpe ratio is calculated as the difference in returns between a tracking portfolio and the
market, divided by the standard deviation of that return difference. The higher the Sharpe
ratio, the better the portfolio performs. The straight line indicates the yearly Sharpe ratio of
the market index, the S&P 500.
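As a sketch of this definition (the return series below are synthetic, not the thesis data), the ratio of mean excess return to tracking error can be computed as:

```python
import numpy as np

def information_ratio(port_returns, market_returns):
    """Mean excess return of the tracking portfolio over the market, divided
    by the sample standard deviation of the return differences (the tracking
    error)."""
    diff = np.asarray(port_returns, dtype=float) - np.asarray(market_returns, dtype=float)
    return float(diff.mean() / diff.std(ddof=1))

# Synthetic two-period illustration: portfolio beats the market in both periods.
ratio = information_ratio([0.02, 0.04], [0.01, 0.01])
```

Any annualization convention applied to the daily figures in the thesis is not reproduced here; the sketch only shows the per-sample ratio.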
From Figure 4.7, we can see that the difference in Sharpe ratios between models (3) and (1)
is larger than the difference between models (2) and (1), which indicates that the sector limit
constraint improves the Sharpe ratio more than the transaction cost and buy-in threshold
constraints do for the same underlying model. For example, for q = 20 in 2007, the Sharpe
ratio difference between models (3) and (1) was 0.8 while the difference between models (2)
and (1) was -0.5; that is, the sector limit constraint increased the Sharpe ratio of model (1)
whereas the transaction cost and buy-in threshold constraints decreased it. All Sharpe ratio
values were negative in 2009, when the financial market dropped sharply. The Sharpe ratio of
model (1) was close to that of the target, and model (3) returned the most negative values.
However, as more local constraints were incorporated, the Sharpe ratio values increased, i.e.
the line of model (3) moves up to the line of model (4) in the 2009 sub-figure. Overall, from
the Sharpe ratio perspective, the portfolios with the sector limit constraint outperformed those
without it. The reason is that the sector limit constraint improved the denominator of the
Sharpe ratio while the excess returns remained close to each other.
Figure 4.7: Comparison of Performance – portfolio Sharpe ratio
Next we calculate the Sharpe ratio for the out-of-sample periods and compare the in-sample
and out-of-sample Sharpe ratios. The in-sample periods are the intervals [2006, 2007], [2007,
2008], [2008, 2009] and [2010, 2011], and the associated out-of-sample data are the daily returns
over 2007.01 - 2008.01, 2008.01 - 2009.01, 2009.01 - 2010.01 and 2011.01 - 2011.06, respectively.
The numerical results are shown in Tables 4.4 to 4.7 with respect to the portfolio size. Each
'diff' column is the out-of-sample Sharpe ratio (the preceding column) minus the corresponding
in-sample value (Figure 4.7); a positive number indicates that a portfolio maintains good
performance out of sample, while a negative value means the portfolio constructed by that
model underperforms, relative to its in-sample result, over the associated period without any
rebalancing.
Table 4.4: Sharpe ratio for out-of-samples (2007.01 - 2008.01)

  q     Model (1)   diff      Model (2)   diff      Model (3)   diff      Model (4)   diff
 10     -0.0250    -0.7426    0.3918      0.0661    0.5313     -0.7192    0.6134     -0.0018
 20      0.2774    -0.5298    0.0885     -0.3244    0.2845     -1.1435    0.2849     -0.2021
 30      0.4229    -0.3665    0.3495      0.0298    0.5334     -1.0802    0.632       0.1325
 40      0.4448    -0.3566    0.2209     -0.0858    0.4309     -1.1776    0.5143      0.1034
 50      0.4548    -0.3350    0.0952     -0.1789    0.5605     -1.0033    0.5156      0.0960
 60      0.1035    -0.6733    0.1558     -0.0871    0.5134     -0.8991    0.6108      0.3383
 70      0.0656    -0.6649    0.1160     -0.1012    0.5741     -0.7875    0.4562      0.2166
 80      0.0504    -0.6702    0.1373     -0.1493    0.4573     -0.9863    0.5284      0.1433
 90      0.0612    -0.6650    0.2083     -0.0841    0.4384     -0.8436    0.6103      0.2578
100      0.1295    -0.6611    0.1722     -0.1163    0.4708     -0.7944    0.6494      0.3797
Aver.    0.1985    -0.5665    0.1935     -0.1031    0.4795     -0.9435    0.5415      0.1464
From Table 4.4, we see that the out-of-sample Sharpe ratios from models (3) and (4) are
generally larger than those from models (1) and (2), which indicates that the models with the
sector limit perform better out of sample. In terms of the robustness of the Sharpe ratio,
model (3) decreased the most (-0.9435 on average), while model (4) increased by 0.1464 on
average. This result shows that the transaction cost constraints in model (4) can improve the
solution quality. Overall, the portfolio from model (4) has the best performance in both
in-sample and out-of-sample testing.
Table 4.5: Sharpe ratio for out-of-samples (2008.01 - 2009.01)

  q     Model (1)   diff      Model (2)   diff      Model (3)   diff      Model (4)   diff
 10     -0.1729    -0.2133   -1.3775     -1.3791   -0.256      -0.7463   -0.1849     -0.6173
 20     -0.2583    -0.5662   -1.3775     -1.8248   -0.3024     -0.9519   -0.2799     -0.5751
 30     -0.2556    -0.6153   -0.2192     -0.7568   -0.2352     -1.0393   -0.2015     -0.5914
 40     -0.2548    -0.6415   -0.2573     -0.8099   -0.2617     -1.1515   -0.2031     -1.0385
 50     -0.2585    -0.6616   -0.3269     -0.9038   -0.3142     -1.1285   -0.1878     -0.7909
 60     -0.3387    -0.7742   -0.3415     -0.8627   -0.3534     -0.9385   -0.3452     -1.1104
 70     -0.3396    -0.709    -0.2909     -0.9479   -0.3604     -1.2188   -0.3467     -1.1567
 80     -0.3114    -0.6844   -0.4219     -1.2016   -0.4345     -1.3457   -0.235      -0.992
 90     -0.3104    -0.7162   -0.4018     -1.1504   -0.4153     -1.2943   -0.2452     -1.0522
100     -0.3172    -0.715    -0.4035     -1.1524   -0.3157     -1.2375   -0.2013     -1.0297
Aver.   -0.2817    -0.6297   -0.5418     -1.0989   -0.3249     -1.1052   -0.2431     -0.8954
Table 4.5 lists the Sharpe ratios for the out-of-sample period 2008.01 - 2009.01, the main period
of the financial crisis. All instances have negative values, which indicates that the portfolios
underperform out of sample. Model (2) has the most negative out-of-sample Sharpe ratios,
while model (4) has the least negative, which shows that the portfolio from model (4) performs
relatively better. The difference values for model (1) look better than those of the other models
because model (1) generates lower in-sample Sharpe ratios (see the corresponding sub-figure
in Figure 4.7), so the differences after subtraction can be small.
Table 4.6: Sharpe ratio for out-of-samples (2009.01 - 2010.01)

  q     Model (1)   diff      Model (2)   diff      Model (3)   diff      Model (4)   diff
 10      1.005      1.4944    0.931       1.6317    1.0096      1.7649    0.9782      1.4829
 20      0.9952     1.4708    0.9963      1.6766    1.0302      2.1275    0.9912      1.6357
 30      0.9317     1.4424    0.931       1.6527    1.0369      2.0057    1.0234      1.6296
 40      0.9087     1.4153    0.8945      1.6096    0.9784      2.0338    0.9532      1.6775
 50      0.893      1.3843    0.947       1.6813    0.9579      2.0423    0.9731      1.7004
 60      0.9443     1.4546    0.8839      1.623     0.9669      2.0233    0.9925      1.7246
 70      0.9628     1.4775    0.8567      1.5229    0.9563      2.0442    1.0132      1.7228
 80      0.9373     1.4588    0.9039      1.5869    0.942       1.9765    0.9972      1.7237
 90      0.932      1.4478    0.8714      1.537     0.9425      1.9457    1.0258      1.7581
100      0.9029     1.3771    0.919       1.5531    0.9406      1.9783    1.0385      1.7357
Aver.    0.9413     1.4423    0.9135      1.6075    0.9761      1.9942    0.9987      1.6791
Table 4.6 shows the out-of-sample Sharpe ratios for the period 2009.01 - 2010.01. Again,
model (4) generates the largest average ratio and model (3) has the largest average difference.
These instances demonstrate the benefit of the sector limit constraint in out-of-sample testing,
i.e. the sector limit constraint can improve the portfolio's Sharpe ratio.
Table 4.7: Sharpe ratio for out-of-samples (2011.01 - 2011.06)

  q     Model (1)   diff      Model (2)   diff      Model (3)   diff      Model (4)   diff
 10      0.7713     0.0471    1.1298      0.6921    0.4465     -1.5184    0.9845      0.2717
 20      0.8116     0.0914    1.1298      0.811     0.5405     -1.3674    0.9125      0.3467
 30      0.7792     0.1156    0.6981      0.3241    0.603      -1.2403    0.6983      0.2933
 40      0.5514    -0.1996    0.8016      0.3052    0.4781     -1.7064    0.7322      0.2561
 50      0.8077     0.0491    0.7605      0.3598    0.465      -1.562     0.7193      0.2033
 60      0.6495    -0.0487    0.6087      0.0183    0.7241     -1.1084    0.6824      0.1689
 70      0.6209    -0.1324    0.574       0.0173    0.9563     -0.9827    0.9863      0.5382
 80      0.6763    -0.0793    0.674       0.1751    0.6803     -1.1866    0.6926      0.2706
 90      0.6638    -0.1852    0.6674      0.1392    0.6685     -1.3072    0.6651      0.1645
100      0.6988    -0.1445    0.6934      0.1597    0.6958     -1.3081    0.7062      0.2426
Aver.    0.703     -0.0487    0.7737      0.3002    0.6258     -1.3288    0.7779      0.2756
Finally, we test the out-of-sample Sharpe ratios for the period 2011.01 - 2011.06 in Table 4.7.
The average Sharpe ratios of models (4) and (2) are close to each other and better than the
average value of model (3). Meanwhile, the difference values of models (1) and (3) are negative,
and model (3) has the largest negative average difference. We point out that for some data
structures, the transaction cost and turnover constraints determine a portfolio with good
average performance (see model (2) in Table 4.7), while for other data structures the sector
limit constraint generates a better portfolio, e.g. model (3) in Table 4.4. Model (4) generally
performs well in out-of-sample testing, as seen in Tables 4.4 to 4.7, since it incorporates both
the transaction cost and sector limit constraint sets.
Similar to the definition in Cornuejols and Tutuncu [40], we calculate the tracking ratio by the
following formula:

R_{0t} = \frac{\sum_{i=1}^{n} V_{it} \,/\, \sum_{i=1}^{n} V_{i0}}
              {\sum_{j=1}^{q} w_j V_{jt} \,/\, \sum_{j=1}^{q} w_j V_{j0}}

where the numerator \sum_{i=1}^{n} V_{it} / \sum_{i=1}^{n} V_{i0} indicates the target index's
movement after investment and the denominator \sum_{j=1}^{q} w_j V_{jt} / \sum_{j=1}^{q} w_j V_{j0}
denotes the portfolio's performance during the out-of-sample period. The ideal tracking ratio
R_{0t} is 1: a value greater than 1 means underperformance with respect to the target index,
and a value less than 1 indicates excess return. The straight line indicates that the portfolio
perfectly tracks the market index, the S&P 500. Out-of-sample periods with durations of 6
months and 12 months were tested, and there was no rebalancing during the tracking period
after investment.
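A minimal sketch of this formula (the asset values V, weights w, and held positions below are hypothetical, not the thesis data):

```python
import numpy as np

def tracking_ratio(V0, Vt, w, held):
    """Tracking ratio R_0t: index growth divided by tracked-portfolio growth.

    V0, Vt : values of all n index constituents at the investment date and at t
    w, held: weights and index positions of the q assets held in the portfolio
    """
    V0 = np.asarray(V0, dtype=float)
    Vt = np.asarray(Vt, dtype=float)
    w = np.asarray(w, dtype=float)
    index_growth = Vt.sum() / V0.sum()
    portfolio_growth = (w * Vt[held]).sum() / (w * V0[held]).sum()
    return float(index_growth / portfolio_growth)
```

If every constituent grows by the same factor, both growth terms coincide and R_0t = 1; a portfolio that grows more slowly than the index yields R_0t > 1 (underperformance), matching the interpretation above.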
Figure 4.8: Comparison of Performance – Tracking Ratio of out-of-sample period (2007, 2008)
Figure 4.9: Comparison of Performance – Tracking Ratio of out-of-sample period (2009, 2011)
Figures 4.8 and 4.9 display the out-of-sample tracking ratios for the four periods. As shown in
Figure 4.8, the tracking portfolios tend to track better over the near future (6 months) than
over the longer future (12 months). For example, all portfolios were superior to the market
index during 2007.1-2007.6, i.e. all R_{0,6} < 1, while some portfolios underperformed the
market during 2007.1-2007.12, i.e. some R_{0,12} > 1. Another observation is that the models
with the sector limit constraint (the lines of models (3) and (4)) exhibit more stable performance
than the models without it (the lines of models (1) and (2)).
Taken together, the analysis of Figures 4.4 to 4.9 provides important insights into portfolio
performance under different restrictions. The sector limit constraint, as a global restriction,
can change the solution structure so that the objective value changes significantly. As a result,
the portfolio with the sector limit outperforms the corresponding portfolio without it in terms
of both portfolio variance and Sharpe ratio.
4.6 Conclusions and Discussion
In this chapter we have investigated portfolio tracking models, formulated as mixed integer
linear optimization problems, that represent a constrained clustering approach for tracking a
benchmark index, in particular the S&P 500. Motivated by real investment cases, transaction
cost and sector limit constraints were added to a base clustering model. We then developed
both a Lagrangian Relaxation (LR) algorithm and a partial Semi-Lagrangian Relaxation (SLR)
algorithm to solve the constrained tracking problem. Numerical results show that both methods
achieve high-quality solutions. From the computational results we observe that: (1) the sector
limit constraint diversifies the stocks into different sectors and thereby reduces the portfolio
variance efficiently; (2) the optimal sector weights are consistent with the sector weights of the
target index when the sector limit constraint is incorporated. In general, the constrained
clustering approach tracked the S&P 500 effectively, and the models and methods in this
chapter can be used to track any market index.
Chapter 5
Progressive Hedging for Cardinality
Constrained Financial Planning
Problem
In Chapter 4, we explored Lagrangian relaxation algorithms for the index tracking problem.
Index tracking is a prevalent passive investing strategy for emulating the movement of a market
index; it provides an optimization tool for choosing a limited number of assets to represent a
target index. We concluded that the cardinality constraint is important for portfolio selection
because, practically, it allows the model to accommodate more requirements and, theoretically,
it may change the properties of the problem. In this chapter we study the Financial Planning
problem with a cardinality constraint.
5.1 Introduction to Financial Planning Problem
The Financial Planning (FP) problem is a portfolio selection process that achieves specific
goals with limited resources. For example, pension funds involve revising portfolio investments
to maximize profit while meeting liabilities over time. As another example, arbitrage trading
in the currency market can be formulated as a network in the form of a loop, where an arbitrage
opportunity exists if the product of the multipliers on the arcs of the loop exceeds unity. Other
examples include asset allocation for portfolio selection and international
Chapter 5. Progressive Hedging for Cardi. Constrained FP 67
cash management in [115]. Cash (the decision variable) flows on the arcs between nodes, and
transaction costs accumulate as funds are moved between nodes. The benefit of formulating
the FP problem on a network structure is that the decision process is straightforward and
easy to visualize.
Taking uncertainty into consideration is critical in the FP problem. Stochastic programming
(SP) is a popular tool for handling uncertainty in parameters such as the expected return of
an asset in the FP model. A decision is fixed in the first stage before the uncertainty is realized,
and recourse action is allowed, at some cost, to restore feasibility after a realization of the
uncertainty is observed in the next stage. Compared with other strategies for quantifying
parameter uncertainty, such as robust optimization, SP returns a solution that trades off
between the scenarios. However, the problem size increases exponentially with the number of
time periods, assets, and scenarios. In [115] the FP problem is formulated as a Linear Program
(LP), and the authors solve the model by the progressive hedging algorithm, which has a linear
convergence rate.
When we tested the FP problem of [115], we found that in some instances the optimal portfolio
allocation is concentrated in a few assets, which may result in high portfolio variance, while in
other instances the optimal portfolio is spread across a large number of assets, which results
in high transaction costs. These potential disadvantages motivated us to incorporate a
cardinality constraint into the model. The main contributions of this chapter are:

• We extended the FP problem (an LP) to a Stochastic Mixed Integer Program (SMIP) and
decomposed the associated SMIP across scenarios.

• Lagrangian relaxation and Tabu search methods were used to solve the scenario sub-problems;
the numerical results show that our sub-solver reduces the solving time efficiently compared
with the times reported by Gurobi.

• The Progressive Hedging Algorithm was applied to the SMIP, and instances with a large
number of scenarios (S = 75) were tested. Moreover, a Lagrangian lower bound was embedded
into the PH method, yielding better gap information than the gap reported by Gurobi.
The rest of the chapter is organized as follows. In Section 5.2, we formulate a series of equivalent
Financial Planning problems. In Section 5.3, we decompose the FP problem by scenario and
design Lagrangian relaxation and Tabu search methods for the scenario sub-problems. Section
5.4 describes the details of the Progressive Hedging Algorithm; we also derive, in that section,
a lower bound for the problem. In Section 5.5, we extend the FP framework to the index
tracking problem and present additional numerical results. The final section summarizes the
current work and proposes directions for further research.
5.2 Model Development
5.2.1 Equivalent Cardinality Constrained FP Models
We extend the network-structured Financial Planning problem of [115] by adding a cardinality
constraint. Suppose that K assets are selected from the asset set N, where Card(N) = N.
Figure 5.1 shows the network structure of financial planning as a 0-1 stochastic program. At
the first stage we pump the initial budget b_i into each node and choose the initial cardinality
of the asset set. At the second stage, we rebalance the portfolio under the different scenarios.
We apply progressive hedging to force the stage-0 arcs across all scenarios into a unique arc.
Figure 5.1: Network structure with cardinality at stage 0 and 1
We first describe the parameters and decision variables related to the above figure and our
model as follows:
• Parameters:
– b_i > 0 denotes the initial investment at node i ∈ N.
– c_ij > 0 denotes the transaction cost ratio on arc (i, j), where i ≠ j ∈ N, at stage 0.
– R^s_i denotes the total return of asset i under scenario s.
– c^s_ij > 0 denotes the transaction cost ratio on arc (i, j), where i ≠ j ∈ N, at stage 1 under
scenario s.
– p^s denotes the probability that scenario s occurs at stage 1.
• Decision variables:
– x_ij = the amount of cash flow on arc (i, j) at stage 0. If x_ii > 0, asset i is selected and
directed into the portfolio; if x_ii = 0, then y^s_ij = 0 for all (i, j) ∈ A^1_s and all s ∈ S.
Note that x_ij is the initial wealth leaving node i; it is scaled by c_ij as it enters node j.
– g_i = 1 if asset i is chosen at stage 0, and 0 otherwise. If g_i = 1, the value of node i is
bounded below and above. If g_i = 0, then x_ii = 0, so no value at node i can be carried into
the next stage, and no arc leaves node i at stage 1 (see Figure 5.1).
– y^s_ij = the amount of investment flow on arc (i, j) under scenario s.
Then we formulate the whole problem as follows:
\min \quad \sum_{(i,j) \in A^0} c_{ij} x_{ij} \;-\; \sum_{s \in S} p^s \sum_{(i,j) \in A^1_s} (1 - c^s_{ij})\, y^s_{ij}   (5.1)

\text{s.t.} \quad b_i + \sum_{(j,i) \in A^0,\, j \neq i} x_{ji} \;\geq\; \sum_{(i,j) \in A^0} x_{ij}, \quad \forall i \in N   (5.2)

l^x_i g_i \;\leq\; x_i \;\leq\; u^x_i g_i, \quad \forall i \in N   (5.3)

\sum_{i \in N} g_i = K   (5.4)

g_i \in \{0, 1\}, \quad \forall i \in N   (5.5)

R^s_i x_i + \sum_{(j,i) \in A^1_s,\, j \neq i} (1 - c^s_{ji})\, y^s_{ji} \;\geq\; \sum_{(i,j) \in A^1_s} y^s_{ij}, \quad \forall i \in N, \forall s \in S   (5.6)

l^{y,s}_{ij} g_i \;\leq\; y^s_{ij} \;\leq\; u^{y,s}_{ij} g_i, \quad \forall i \in N, \forall (i,j) \in A^1_s, \forall s \in S   (5.7)
where the arc sets A^0 and A^1_s denote the network at stage 0 and the network at stage 1
under scenario s, respectively. Both A^0 and A^1_s include all arcs between nodes i and j, as
well as an arc between i_0 and i_1 connecting the two stages. l^x_i and u^x_i are the lower
and upper bounds on the flow x_i at node i ∈ N; l^{y,s}_{ij} and u^{y,s}_{ij} are the lower and
upper bounds on the flow y^s_ij on arc (i, j) ∈ A^1_s for each scenario s.

The objective of model (5.1) - (5.7) is to minimize the total transaction cost at stage 0 and
maximize the total expected net wealth of the network at stage 1. Constraint (5.2) requires
that, for every node i, the total cash flow in is no less than the total cash flow out at stage 0.
Constraints (5.3) - (5.5) impose the cardinality of the portfolio: if some g_i = 0, the node value
is forced to 0 by constraint (5.3). Constraint (5.6) states that the total cash flow out of the
fixed network cannot exceed the total cash flow in, since if a node is unselected at stage 0
there are no transaction arcs from it to other nodes at stage 1, which is enforced by constraint
(5.7).
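As an illustration of the stage-0 constraints (5.2) - (5.5), the sketch below checks feasibility of a candidate (x, g); the two-node data in the usage note are hypothetical, and the diagonal entries x[i][i] play the role of the node values x_i:

```python
import numpy as np

def stage0_feasible(b, x, g, l, u, K, tol=1e-9):
    """Check the stage-0 constraints (5.2)-(5.5) for a candidate (x, g).

    b : initial budgets b_i
    x : n x n flow matrix, x[i][j] on arc (i, j); x[i][i] is the node value x_i
    g : 0/1 selection vector
    l, u : lower/upper bounds l^x_i, u^x_i on the node values
    K : required cardinality
    """
    x = np.asarray(x, dtype=float)
    g = np.asarray(g, dtype=int)
    # (5.2) flow balance: b_i + inflow from other nodes >= total outflow
    inflow = x.sum(axis=0) - np.diag(x)     # sum over j != i of x_ji
    outflow = x.sum(axis=1)                 # includes the holding arc x_ii
    balance = bool(np.all(np.asarray(b, float) + inflow >= outflow - tol))
    # (5.3) node value forced into [l_i g_i, u_i g_i]
    xi = np.diag(x)
    bounds = bool(np.all(np.asarray(l, float) * g <= xi + tol)
                  and np.all(xi <= np.asarray(u, float) * g + tol))
    # (5.4)-(5.5) exactly K assets selected
    return balance and bounds and int(g.sum()) == K
```

For example, with b = (10, 10), a single holding arc x_11 = 10, g = (1, 0), bounds [1, 20] on each node, and K = 1, the candidate passes; selecting both assets with K = 1 fails the cardinality and bound checks.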
There is a notable difference between the objective function of the proposed model (5.1) - (5.7)
and that of the FP model in [115]: we move the transaction cost ratios c_ij and c^s_ij into the
objective function and change the equality in constraint (5.2) into an inequality. The main
advantage of this treatment of the transaction ratio is that c_ij can be used to adjust the
penalty coefficient across iterations in the Progressive Hedging framework. However, we need
to test whether this change significantly affects the objective value. Table 5.1 shows how close
the two models are: we ran 26 instances, all solved to optimality with the Gurobi mixed integer
solver, and 22 pairs of instances had the same objective value and the same optimal positions
(see the last column). The average relative difference is 0.82%, which indicates that adding
the transaction cost to the objective function does not change the model significantly.
We decompose model (5.1) - (5.7) across scenarios by studying the non-anticipativity
constraints. The left side of Figure 5.2 shows a simple scenario tree with 3 stages and 2 time
periods; the total number of scenarios is therefore |S| = 3^2 = 9. If we split the scenario tree
as on the right side of Figure 5.2 and force the variables in brackets to be equal, meaning that
those variables follow the same historical path, then the two scenario trees are exactly
equivalent. This type of constraint is called a non-anticipativity constraint in [130].
Table 5.1: Model Comparison - with and without transaction cost term

                  No trans. term             Model (5.1) - (5.7)        obj diff          relative   norm
(N, S)      K     Best LB      Feas. UB      Best LB      Feas. UB      (col.6 - col.4)   obj diff   diff
(10, 3)     1     -943.52      -943.52       -924.52      -924.52        19                2.01%     0
            2     -990.59      -990.59       -991.86      -991.86       -1.27             -0.13%     0
            3     -995.18      -995.18       -996.29      -996.29       -1.11             -0.11%     0
            4     -998.33      -998.33       -999.28      -999.28       -0.95             -0.10%     0
            5     -1001.37     -1001.37      -1002.17     -1002.17      -0.8              -0.08%     0
(10, 15)    1     -940.73      -940.73       -919.83      -919.83        20.91             2.22%     0
            2     -960.71      -960.71       -960.74      -960.74       -0.03              0.00%     0
            3     -965.28      -965.28       -965.31      -965.31       -0.03              0.00%     0
            4     -968.34      -968.34       -968.36      -968.36       -0.02              0.00%     0
            5     -970.58      -970.58       -970.6       -970.6        -0.02              0.00%     0
(50, 3)     5     -4719.76     -4719.76      -4630.41     -4630.36       89.41             1.89%     2
            10    -4935.13     -4935.13      -4940.27     -4940.27      -5.14             -0.10%     0
            15    -4957.48     -4957.48      -4962.05     -4962.05      -4.57             -0.09%     0
            20    -4977.83     -4977.83      -4981.75     -4981.75      -3.92             -0.08%     0
(50, 15)    5     -4704.1      -4704.1       -4599.19     -4599.19       104.91            2.23%     1.4142
            10    -4814        -4814         -4814.14     -4814.14      -0.14              0.00%     0
            15    -4837.5      -4837.5       -4837.65     -4837.65      -0.15              0.00%     0
            20    -4853.56     -4853.56      -4853.78     -4853.78      -0.22              0.00%     0
(100, 3)    5     -4862.81     -4862.81      -4637.81     -4637.81       225               4.63%     0
            10    -9472.7      -9472.7       -9257.13     -9257.13       215.57            2.28%     2.8284
            15    -9864.41     -9864.41      -9873.07     -9872.21      -7.79             -0.08%     2
            20    -9889.64     -9889.64      -9899.73     -9899.73      -10.09            -0.10%     0
(100, 15)   5     -4837.44     -4837.44      -4612.44     -4612.44       225               4.65%     0
            10    -9428.81     -9428.81      -9214.66     -9214.66       214.15            2.27%     0
            15    -9618.17     -9618.17      -9619.13     -9619.13      -0.96             -0.01%     0
            20    -9641.84     -9641.84      -9643.04     -9643.04      -1.21             -0.01%     0
Average           -4467.3      -4467.3       -4425.97     -4425.93       41.37             0.82%     0.317
Figure 5.2: Equivalent scenario trees
For our two-stage FP problem, the first-stage decision variables (x_ij, g_i)^T can be split into
copies (x^s_ij, g^s_i)^T across the scenarios s; model (5.1) - (5.7) can then be reformulated as
the following equivalent problem:
\min \quad \sum_{s \in S} p^s \Big[ \sum_{(i,j) \in A^0_s} c_{ij} x^s_{ij} \;-\; \sum_{(i,j) \in A^1_s} (1 - c^s_{ij})\, y^s_{ij} \Big]   (5.8)

\text{s.t.} \quad b_i + \sum_{(j,i) \in A^0_s,\, j \neq i} x^s_{ji} \;\geq\; \sum_{(i,j) \in A^0_s} x^s_{ij}, \quad \forall i \in N, \forall s \in S   (5.9)

l^x_i g^s_i \;\leq\; x^s_i \;\leq\; u^x_i g^s_i, \quad \forall i \in N, \forall s \in S   (5.10)

\sum_{i \in N} g^s_i = K, \quad \forall s \in S   (5.11)

g^s_i \in \{0, 1\}, \quad \forall i \in N, \forall s \in S   (5.12)

x^s_{ij} = x_{ij}, \quad \forall (i,j) \in A^0_s, \forall s \in S   (5.13)

g^s_i = g_i, \quad \forall i \in N, \forall s \in S   (5.14)

R^s_i x^s_i + \sum_{(j,i) \in A^1_s,\, j \neq i} (1 - c^s_{ji})\, y^s_{ji} \;\geq\; \sum_{(i,j) \in A^1_s} y^s_{ij}, \quad \forall i \in N, \forall s \in S   (5.15)

l^{y,s}_{ij} g^s_i \;\leq\; y^s_{ij} \;\leq\; u^{y,s}_{ij} g^s_i, \quad \forall i \in N, \forall (i,j) \in A^1_s, \forall s \in S   (5.16)
where x_ij = \sum_{s \in S} p^s x^s_{ij} for all (i, j) ∈ A^0_s and g_i = \sum_{s \in S} p^s g^s_i
for all i ∈ N. The two networks A^0_s and A^1_s now run separately for each scenario, and the
non-anticipativity constraints (5.13) and (5.14) force every A^0_s into A^0, which makes model
(5.8) - (5.16) equivalent to model (5.1) - (5.7). The non-anticipativity constraints (5.13) and
(5.14) are crucial for solving problem (5.8) - (5.16), since they connect the split variables at
stage 0.
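A small sketch of how the implementable first-stage decision is recovered from the scenario copies, using the probability-weighted averages defined above (the scenario data in the usage note are hypothetical):

```python
import numpy as np

def implementable_first_stage(p, xs, gs):
    """Probability-weighted aggregation of the scenario copies of the
    first-stage variables: x_ij = sum_s p^s x^s_ij and g_i = sum_s p^s g^s_i.
    Non-anticipativity ((5.13)-(5.14)) holds exactly when every scenario
    copy coincides with this average.

    p  : scenario probabilities, shape (S,)
    xs : stage-0 flow copies, shape (S, n_arcs)
    gs : selection copies, shape (S, n_assets)
    """
    p = np.asarray(p, dtype=float)[:, None]
    xs = np.asarray(xs, dtype=float)
    gs = np.asarray(gs, dtype=float)
    x_bar = (p * xs).sum(axis=0)
    g_bar = (p * gs).sum(axis=0)
    satisfied = bool(np.allclose(xs, x_bar) and np.allclose(gs, g_bar))
    return x_bar, g_bar, satisfied
```

This averaging is the same operation the progressive hedging iterations use to compute the consensus point toward which the scenario copies are driven.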
5.2.2 Scenario Generation
We generate two types of scenarios for the different parameters of the developed models, i.e.
the transaction cost ratio c^s_ij on arc (i, j) and the node expected return R^s_i. We reasonably
assume that the transaction cost ratio is deterministic and can be predicted in the near future,
so a value can be assigned to it for every postulated economic scenario. For example, suppose
the current transaction cost ratio is 5%. If the market rises too quickly, the market regulator
may increase the transaction cost ratio to 8% to cool the market; if the market plunges rapidly,
the government may decrease the transaction cost ratio to 2% to stimulate trading activity;
otherwise, the transaction cost ratio stays at 5%. In our model, we assign different transaction
cost ratios to the different market stages in Table 5.7.
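This three-state rule can be written directly as a lookup; the state labels below are illustrative names, and the 2%, 5%, and 8% ratios are the example values from the text rather than calibrated parameters:

```python
# Hypothetical labels for the three market states described above.
COST_RATIO = {"rising": 0.08, "normal": 0.05, "falling": 0.02}

def scenario_cost_ratio(market_state):
    """Deterministic transaction cost ratio c^s_ij assigned to a postulated
    economic scenario, following the regulator behaviour described above."""
    return COST_RATIO[market_state]
```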
Obtaining the discrete outcomes R^s_i for node i in the future is more difficult, and various
techniques have evolved for generating scenarios for stochastic programs [82, 70, 122, 76]. Hoyland
and Wallace [82] presented a moment matching method to obtain discrete outcomes whose
statistical properties are as close as possible to a specified distribution. Define K to be the
set of all specified statistical properties and SVAL_k to be the value of the specified property
k ∈ K. For example, a statistical property can be moment information such as the mean,
variance/covariance, skewness (third central moment), or kurtosis (fourth central moment)
of the observations. Let f_k(x, p) denote the mathematical expression of statistical property
k in terms of x and p. The model is then given by

min_{x,p} ∑_{k∈K} w_k (f_k(x, p) − SVAL_k)^2   (5.17)

s.t. ∑_{l∈L_t} p_l = 1,  ∀t = 1, · · · , T   (5.18)

p_l ≥ 0,  ∀l ∈ L_t, t = 1, · · · , T   (5.19)

where w_k is the weight of statistical property k. That is, an optimization problem is formulated
to minimize the distance between the statistical properties of the constructed tree and
those specified by the decision maker. The main advantage of this method is that it can capture
any moment of the new data series, which consists of the historical prices and the aggregation of
all possible movements of the node.
The moment matching method has been developed through extensive research [82, 95, 105, 70].
In many cases, the moment matching method requires the distributions or functional descriptions
of the marginals, which are not easy to obtain. In this section, we present a revised moment matching
method, formulated as one whole optimization program, that does not require the properties of the marginals.
Our model captures the mean, variance, and covariance between the assets, since these moments
are the most important statistical specifications. Assume that the historical return vector h_{i,[0,s]},
which represents the returns of security i over the past s periods, is observed at the current time,
and a scenario tree for T future time periods needs to be built. At any time point t ∈ [1, T] for
any asset i, the first two central moments can be calculated as follows:

E(u^t_i) = ∑_{l=1}^{L_t} p_l x^t_{il},  ∀i, ∀t   (5.20)

u_{i,[0,s+T]} = ( s u_{i,[0,s]} + ∑_{t=1}^{T} E(u^t_i) ) / (s + T − 1),  ∀i   (5.21)

(s + T − 1) var_{i,[0,s+T]} = ∑_{m=1}^{s} (h_{i,m} − u_{i,[0,s+T]})^2 + ∑_{t=1}^{T} (E(u^t_i) − u_{i,[0,s+T]})^2,  ∀i   (5.22)

(s + T − 1) covar_{ij,[0,s+T]} = ∑_{m=1}^{s} (h_{i,m} − u_{i,[0,s+T]})(h_{j,m} − u_{j,[0,s+T]})
    + ∑_{t=1}^{T} (E(u^t_i) − u_{i,[0,s+T]})(E(u^t_j) − u_{j,[0,s+T]}),  ∀{(i, j) | i ≠ j}   (5.23)

lb^t_{il} ≤ x^t_{il} ≤ ub^t_{il},  ∀i, ∀l ∈ L_t, t = 1, · · · , T   (5.24)

where constraint (5.20) denotes the expected return of asset i in time period t, constraint (5.21)
gives the first central moment of asset i after the addition of the new T periods, constraints
(5.22) and (5.23) calculate the second central moments, i.e., the covariance matrix, for asset i in
the new time series, and constraint (5.24) gives the bounds on variable x^t_{il} for asset i in
scenario l in time period t.
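A simplified, single-asset illustration of the idea behind this matching, assuming equally probable outcomes and matching only the first two moments exactly (the full model instead minimizes weighted squared deviations over x and p jointly; the function below is a hypothetical helper, not the thesis model):

```python
import numpy as np

def match_first_two_moments(mu, var, L):
    """Construct L equally probable outcomes whose mean is exactly mu and
    whose (population) variance is exactly var, by shifting and scaling a
    standardized base pattern with zero mean and unit variance."""
    z = np.linspace(-1.0, 1.0, L)   # symmetric pattern, mean exactly 0
    z = z / z.std()                 # rescale to unit (population) variance
    return mu + np.sqrt(var) * z

outcomes = match_first_two_moments(mu=0.05, var=0.0004, L=5)
print(round(outcomes.mean(), 6), round(outcomes.var(), 6))  # 0.05 0.0004
```

Matching higher moments (skewness, kurtosis) no longer has such a closed form, which is why the general case is posed as the optimization problem (5.17) - (5.19).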
One primary issue in scenario generation is the existence of arbitrage opportunities, which
may lead to unrealistic decisions. Klaassen proposed an approach to detect and exclude
arbitrage opportunities through a duality argument, with numerical examples shown in
his work [89]. Since arbitrage opportunities may also exist in the scenario set generated by
(5.20) - (5.24), following the same argument as [89], we preclude the arbitrage scenarios
by adding the following dual constraints:

π^t_0 − ∑_{l=1}^{L_t} π^t_l x^t_{il} = ∑_{l=1}^{L_t} x^t_{il},  ∀i, ∀t   (5.25)

π^t_l ≥ 0,  ∀l ∈ L_t, t = 1, · · · , T   (5.26)

∑_{l=1}^{L_t} θ^t_l (1 + x^t_{il}) = 1,  ∀i, ∀t   (5.27)

θ^t_l ≥ 0,  ∀l ∈ L_t, t = 1, · · · , T   (5.28)

where π, θ are the dual variable vectors for the two types of individual arbitrage opportunities
described in [89]: constraints (5.25) - (5.26) deal with the case of a possible non-negative
payoff with zero investment, while constraints (5.27) - (5.28) handle the case where a reward
can be obtained immediately without any future risk.
We assign a weight vector (w_{i1}, w_{i2}) to the first and second central moments, respectively,
and minimize the distance between the statistical properties of the constructed distribution and
the specification. We then formulate the overall optimization problem:

min_{x,p,π,θ} ∑_{i=1}^{N} [ w_{i1} (u_{i,[0,s+T]} − u_{i,[0,s]})^2 + w_{i2} (var_{i,[0,s+T]} − var_{i,[0,s]})^2 ]
    + ∑_{i=1}^{N} ∑_{j=1, j≠i}^{N} √(w_{i2} w_{j2}) (covar_{ij,[0,s+T]} − covar_{ij,[0,s]})^2   (5.29)

s.t. (5.20) - (5.28)

We do not need to describe the properties of the marginals, since they are dependent variables
in the system (5.29). The parameters (w_{i1}, w_{i2}) express the decision maker's attitude
about the future. For example, setting the ratio w_{i2}/w_{i1} = 1/10 means the first moment is
considered 10 times more important than the second moment for asset i, and vice versa. We apply
the proposed model (5.29) to generate scenarios for R^s_i from historical market data in Section 5.4.3.
5.3 Lagrangian Decomposition Scheme
Solving model (5.8) - (5.16) is difficult because (I) the problem size increases quickly with
the size of the network and the number of scenarios, and (II) the model includes binary variables,
which destroy the convexity of the problem. However, high-quality solutions for large-scale
instances can be obtained through Progressive Hedging, a specific Lagrangian technique
that handles the non-anticipativity constraints.

We relax the non-anticipativity constraints (5.13) and (5.14) by assigning Lagrangian multipliers
λ^s_{ij} and π^s_i, respectively, add the proximal term with penalty ρ, and then formulate the
augmented Lagrangian:
ALAG(x, g, y, λ, π) = ∑_{s∈S} p^s [ ∑_{(i,j)∈A^{0s}} c_{ij} x^s_{ij} − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} ]
    + ∑_{s∈S} p^s [ ∑_{(i,j)∈A^{0s}} λ^s_{ij} (x^s_{ij} − x_{ij}) + (ρ/2) ∑_{(i,j)∈A^{0s}} (x^s_{ij} − x_{ij})^2 ]
    + ∑_{s∈S} p^s [ ∑_{i∈N} π^s_i (g^s_i − g_i) + (ρ/2) ∑_{i∈N} (g^s_i − g_i)^2 ]

= ∑_{s∈S} p^s ∑_{(i,j)∈A^{0s}} [ (c_{ij} + λ^s_{ij} − ρ x_{ij}) x^s_{ij} + (ρ/2) (x^s_{ij})^2 ]
    + ∑_{s∈S} p^s ∑_{i∈N} (0 + π^s_i − ρ g_i + ρ/2) g^s_i
    − ∑_{s∈S} p^s ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} + Δ

where Δ = ∑_{(i,j)∈A^{0s}} [ (ρ/2) (x_{ij})^2 − λ^s_{ij} x_{ij} ] + ∑_{i∈N} (ρ/2 − π^s_i) g_i collects the terms that
are constant in the scenario variables. We simplify the second quadratic term as
(g^s_i − g_i)^2 = (g^s_i)^2 − 2 g^s_i g_i + (g_i)^2 = g^s_i − 2 g^s_i g_i + g_i for the binary variable g^s_i ∈ {0, 1}.

ALAG(x, g, y, λ, π) is non-separable across scenarios because of the cross products x_{ij} x^s_{ij} and
g_i g^s_i in the quadratic terms. However, if we fix x_{ij} and g_i at the previous iterates, i.e.,
x^{v−1}_{ij} and g^{v−1}_i, the problem becomes fully decomposable within iteration v:
ALAG_PH(x, g, y, λ, π, x^{v−1}, g^{v−1}) = ∑_{s∈S} p^s ∑_{(i,j)∈A^{0s}} [ (c_{ij} + λ^s_{ij} − ρ x^{v−1}_{ij}) x^s_{ij} + (ρ/2) (x^s_{ij})^2 ]
    + ∑_{s∈S} p^s ∑_{i∈N} (0 + π^s_i − ρ g^{v−1}_i + ρ/2) g^s_i
    − ∑_{s∈S} p^s ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} + Δ

For each scenario s at iteration v, setting C^s_{ij} = c_{ij} + λ^s_{ij} − ρ x^{v−1}_{ij} and F^s_i = 0 + π^s_i − ρ g^{v−1}_i + ρ/2,
the problem can be decomposed into the following scenario sub-problems, which maintain the
FP structure:
min ∑_{(i,j)∈A^{0s}} [ C^s_{ij} x^s_{ij} + (ρ/2) (x^s_{ij})^2 ] + ∑_{i∈N} F^s_i g^s_i − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij}   (5.30)

s.t. b_i + ∑_{(j,i)∈A^{0s}, j≠i} x^s_{ji} ≥ ∑_{(i,j)∈A^{0s}} x^s_{ij},  ∀i ∈ N, ∀s ∈ S   (5.31)

l^x_i g^s_i ≤ x^s_i ≤ u^x_i g^s_i,  ∀i ∈ N, ∀s ∈ S   (5.32)

∑_{i∈N} g^s_i = K,  ∀s ∈ S   (5.33)

g^s_i ∈ {0, 1},  ∀i ∈ N, ∀s ∈ S   (5.34)

R^s_i x^s_i + ∑_{(j,i)∈A^{1s}, j≠i} (1 − c^s_{ji}) y^s_{ji} ≥ ∑_{(i,j)∈A^{1s}} y^s_{ij},  ∀i ∈ N, ∀s ∈ S   (5.35)

l^y_{ij} g^s_i ≤ y^s_{ij} ≤ u^y_{ij} g^s_i,  ∀i ∈ N, ∀(i, j) ∈ A^{1s}, ∀s ∈ S   (5.36)
The time for solving model (5.30) - (5.36) is non-trivial for the Progressive Hedging
method proposed in Section 5.4: even an advanced commercial solver consumes a long time
on a scenario sub-problem when the asset number N is large (see Table 5.2). We therefore
design two methods, Lagrangian Relaxation and Tabu Search, to speed up the solution of the
scenario sub-problems in Sections 5.3.1 and 5.3.2.
5.3.1 LR method for scenario sub-problem
Observe that the binary variable g_i connects the variables x_{ij} and y_{ij} in constraints (5.32) and (5.36),
so one strategy is to relax constraints (5.32) and (5.36). The relaxed problem is much
easier than model (5.30) - (5.36), since it can be separated into a continuous
part (only x_{ij} and y_{ij}) and an integer part (only g_i); we also reduce the number of constraints
dramatically, which saves much of the time spent on coefficient matrix construction.

We assign ω^{s−}_i ≥ 0 and ω^{s+}_i ≥ 0 to the constraints l^x_i g^s_i − x^s_i ≤ 0 and −u^x_i g^s_i + x^s_i ≤ 0, respectively,
and θ^{s−}_{ij} ≥ 0 and θ^{s+}_{ij} ≥ 0 to the constraints l^y_{ij} g^s_i − y^s_{ij} ≤ 0 and −u^y_{ij} g^s_i + y^s_{ij} ≤ 0, respectively.
Then we construct the following Lagrangian objective for the sub-problem:
sub_LR(x, g, y, ω, θ)
= ∑_{(i,j)∈A^{0s}} [ C^s_{ij} x^s_{ij} + (ρ/2) (x^s_{ij})^2 ] + ∑_{i∈N} F^s_i g^s_i − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij}
    + ∑_{i∈N} [ ω^{s−}_i (l^x_i g^s_i − x^s_i) + ω^{s+}_i (−u^x_i g^s_i + x^s_i) ]
    + ∑_{(i,j)∈A^{1s}} [ θ^{s−}_{ij} (l^y_{ij} g^s_i − y^s_{ij}) + θ^{s+}_{ij} (−u^y_{ij} g^s_i + y^s_{ij}) ]

= ∑_{(i,j)∈A^{0s}} [ (C^s_{ij} + diag(ω^{s+}_i − ω^{s−}_i)) x^s_{ij} + (ρ/2) (x^s_{ij})^2 ]
    + ∑_{i∈N} [ F^s_i + l^x_i ω^{s−}_i − u^x_i ω^{s+}_i + ∑_{j∈N} (l^y_{ij} θ^{s−}_{ij} − u^y_{ij} θ^{s+}_{ij}) ] g^s_i
    + ∑_{(i,j)∈A^{1s}} (c^s_{ij} + θ^{s+}_{ij} − θ^{s−}_{ij} − 1) y^s_{ij}
As can be seen, sub_LR(x, g, y, ω, θ) can be decomposed into two separate parts.
The first part is a smaller linear integer program, sub_LR_IP(g, ω, θ):

min ∑_{i∈N} [ F^s_i + l^x_i ω^{s−}_i − u^x_i ω^{s+}_i + ∑_{j∈N} (l^y_{ij} θ^{s−}_{ij} − u^y_{ij} θ^{s+}_{ij}) ] g^s_i   (5.37)

s.t. ∑_{i∈N} g^s_i = K,  ∀s ∈ S   (5.38)

g^s_i ∈ {0, 1},  ∀i ∈ N, ∀s ∈ S   (5.39)
Solving model (5.37) - (5.39) is not hard: we simply sort the objective coefficients and choose
the K nodes with the smallest coefficients. The second part can be seen as a quadratic
program, sub_LR_QP(x, y, ω, θ):

min ∑_{(i,j)∈A^{0s}} [ (C^s_{ij} + diag(ω^{s+}_i − ω^{s−}_i)) x^s_{ij} + (ρ/2) (x^s_{ij})^2 ]
    + ∑_{(i,j)∈A^{1s}} (c^s_{ij} + θ^{s+}_{ij} − θ^{s−}_{ij} − 1) y^s_{ij}   (5.40)

s.t. b_i + ∑_{(j,i)∈A^{0s}, j≠i} x^s_{ji} ≥ ∑_{(i,j)∈A^{0s}} x^s_{ij},  ∀i ∈ N, ∀s ∈ S   (5.41)

R^s_i x^s_i + ∑_{(j,i)∈A^{1s}, j≠i} (1 − c^s_{ji}) y^s_{ji} ≥ ∑_{(i,j)∈A^{1s}} y^s_{ij},  ∀i ∈ N, ∀s ∈ S   (5.42)

The solution of model (5.40) - (5.42) under the g^s_i obtained from model (5.37) - (5.39) is a feasible
solution to model (5.30) - (5.36), so a high-quality upper bound is constructed at each iteration
from the solution information of (5.37) - (5.42). The dual problem of sub_LR(x, g, y, ω, θ),
max_{ω≥0, θ≥0} min_{x,g,y} sub_LR(x, g, y, ω, θ), is a convex problem and can be solved efficiently.
The dual problem returns either an optimal solution or a maximal lower bound for model
(5.30) - (5.36). The pseudocode of the LR algorithm is displayed in Appendix B.1. We present
the numerical comparison between the LR method and Gurobi in Table 5.2. From Table 5.2 we
see that the solution of the LR method is close to the solution from Gurobi: the worst gap is
2.68% (s = 9) and the average difference is 1.38%. However, the average solving time of the LR
method (92.97 seconds) is only about half of the solving time of Gurobi (178.94 seconds). This is
a tradeoff between solving time and solution accuracy; saving time on the sub-problems lets us
speed up the Progressive Hedging strategy later. More numerical instances, listed in Tables B.1 - B.3
in Appendix B.1, show that our LR method can significantly reduce the solving time while keeping
the high quality of the solutions.
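The closed-form solution of sub_LR_IP (5.37) - (5.39) described above, i.e., keeping the K nodes with the smallest objective coefficients, can be sketched as follows (the coefficient values are hypothetical):

```python
import numpy as np

def solve_sub_lr_ip(coeffs, K):
    """Solve min sum_i coeffs[i] * g[i] s.t. sum_i g[i] = K, g binary,
    by opening the K nodes with the smallest coefficients."""
    g = np.zeros(len(coeffs), dtype=int)
    g[np.argsort(coeffs)[:K]] = 1
    return g

coeffs = np.array([0.7, -1.2, 0.1, -0.4, 2.0])
print(solve_sub_lr_ip(coeffs, K=2))  # [0 1 0 1 0]
```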
Table 5.2: LR method and Gurobi comparison - instance 1 (N=50, K=5, S=15)

Scenario |              Gurobi                |             LR Method              | Gap to Gurobi
subcase  | Best LB   Feasi. UB   Gap   Time(S)| Best LB   Feasi. UB   Gap   Time(S)| (col 7 - col 3)/col 3
s=1      | -23383    -11146.8  109.77%  174.7 | -31029.9  -11054.2  180.71% 103.58 | 0.83%
s=2      | -22770.9  -10494.5  116.98% 169.69 | -30853.6  -10368    197.59%  84.66 | 1.21%
s=3      | -22853.7  -9833.58  132.40% 166.67 | -30679.6  -9710.75  215.93%  97.07 | 1.25%
s=4      | -23400.4  -9220.24  153.79% 179.62 | -30454.3  -9143.6   233.07%  82.48 | 0.83%
s=5      | -21833.9  -8583.29  154.38% 178.42 | -30218.3  -8465.56  256.96% 104.04 | 1.37%
s=6      | -22249.1  -7963.5   179.39%  178.3 | -31253.2  -7893.39  295.94% 107.24 | 0.88%
s=7      | -21698.3  -7337.11  195.73% 192.13 | -30991.5  -7242.4   327.92%  95.6  | 1.29%
s=8      | -21757.2  -6926.54  214.11%  182.6 | -30791.6  -6809.67  352.18%  89.96 | 1.69%
s=9      | -20627.5  -6750.48  205.57% 166.37 | -30705.7  -6569.29  367.41% 105.49 | 2.68%
s=10     | -20283.3  -6654.9   204.79%  153.9 | -30777.8  -6563.69  368.91%  81.77 | 1.37%
s=11     | -20217.8  -6668.93  203.16% 170.43 | -32889.3  -6587.47  399.27%  84.29 | 1.22%
s=12     | -19783.3  -6604.64  199.54% 193.03 | -32816.1  -6478.4   406.55%  90.18 | 1.91%
s=13     | -20523.5  -6472.14  217.10% 177.28 | -32802    -6392.12  413.16%  90.1  | 1.24%
s=14     | -19784.2  -6380.22  210.09% 179.83 | -32791.1  -6239.24  425.56%  88.02 | 2.21%
s=15     | -18326.4  -6302.64  190.77% 221.11 | -32739.2  -6254.81  423.43%  90.05 | 0.76%
Aver.    | -         -         179.17% 178.94 | -         -         324.30%  92.97 | 1.38%
5.3.2 Tabu search for scenario sub-problem
The objective value of model (5.30) - (5.36) can be divided into three parts:

- Z(x) = ∑_{(i,j)∈A^{0s}} C^s_{ij} x^s_{ij} + (ρ/2) (x^s_{ij})^2, where C^s_{ij} is the transaction cost ratio on arc (i, j) at
  stage 0 under scenario s. The formula tells us that we should open the arcs with negative
  C^s_{ij} and reduce the flow on the arcs with positive C^s_{ij} as much as possible. For a specific
  arc (i, j) with negative C^s_{ij}, we know that

      Z(x^s_{ij}) = C^s_{ij} b_i + (ρ/2) (b_i)^2,   if C^s_{ij} < −ρ b_i
      Z(x^s_{ij}) = −(C^s_{ij})^2 / (2ρ),           if −ρ b_i ≤ C^s_{ij} < 0

  Z(x^s_{ij}) is used in the swap process, because opening an arc (i, j) with a negative C^s_{ij}
  indicates that node j will be selected as a search candidate.

- Z(g) = ∑_{i∈N} F^s_i g^s_i, where F^s_i is the transaction cost on the arc (i^0, i^1) from stage 0 to stage
  1 under scenario s; we compare the ratio F^s_i / R^s_i to determine a better node.

- Z(y) = −∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij}, where c^s_{ij} is the transaction cost ratio on arc (i, j) at
  stage 1 under scenario s.
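The piecewise expression for Z(x^s_ij) in the first part above is just the minimum of C x + (ρ/2) x^2 over the feasible flow 0 ≤ x ≤ b_i; a small numerical check, with hypothetical values of C, ρ and b_i:

```python
def z_min(C, rho, b):
    """Minimum of C*x + (rho/2)*x**2 over 0 <= x <= b, for C < 0.

    The unconstrained minimizer is x = -C/rho; when it exceeds b, the
    optimum sits at the bound x = b, reproducing the two cases in the text."""
    if C < -rho * b:
        return C * b + 0.5 * rho * b ** 2   # boundary case, x* = b
    return -(C ** 2) / (2.0 * rho)          # interior case, x* = -C/rho

# brute-force check on a grid, covering both cases
for C in (-3.0, -1.0):
    rho, b = 1.0, 2.0
    grid = min(C * (k * b / 10**4) + 0.5 * rho * (k * b / 10**4) ** 2
               for k in range(10**4 + 1))
    assert abs(grid - z_min(C, rho, b)) < 1e-6
print("piecewise formula verified")
```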
For a given g satisfying (5.33) and (5.34), an optimal flow x*(g) is determined by minimizing
Z(x) subject to (5.31) and (5.32), and then a corresponding optimal flow y*(x*, g) can be
determined by maximizing Z(y) subject to (5.35) and (5.36). A current feasible point
(x*, g, y*) is thus fixed. Intuitively, we could select the first K assets with the highest
returns to get the initial g. However, this may not be globally optimal, because the transaction costs
∑_{(i,j)∈A^{0s}} C^s_{ij} x^s_{ij} or F^s_i may be so high that the current selection is inefficient. Therefore
we need to swap some assets and move to a better neighbouring point (x′, g, y′). We assume that we
swap one asset at a time, hoping to obtain a better objective value. The details of the Tabu
heuristic framework are described in Appendix B.2.
Rather than swapping randomly between K and S, we use three cases in Step 1 to include nodes
that can improve the network structure. In case 1, we close the arcs with high transaction ratios
and open the arcs with low transaction ratios so that the structure at stage 1 improves,
and hopefully the objective is reduced globally. In case 2, we move to a node with a
higher return, because this movement may increase the profit at stage 2 dramatically and can
offset the increased cost at stage 1. Case 3 is the opposite of case 2: we move to a
node that may decrease the cost at stage 1 and offset the decreased profit at stage 2.
Compared with the sub-problems in [43], the main difference between the Tabu methods is
that they built cycle-based neighbourhoods and searched the associated γ-residual networks
with a Tabu heuristic, while we construct and search path-based neighbourhoods. This is
determined by the nature of the problems. For their problem, adding or deleting one arc of the
network does not affect the iterative decisions much, since many alternative arcs can be
chosen; so they build a cycle consisting of many arcs and evaluate the flow
perturbation along the cycle. For our problem, any arc between different stages has significant
effects on the objective value, so we must evaluate the path-based neighbourhood rather than
the cycle-based neighbourhood.
We first solve the small sub-problems (N=50, K=5, S=15) and list the results in
Table 5.3. We then increase the problem size N to 100 and list two different
random instances with different scenario numbers in Tables 5.4 and 5.5.
Table 5.3: Computational result (N=50, K=5, S=15) - instance 1

N=50, |              Gurobi                 |        LR method (iter# = 200)                   |  Tabu method (L=5, iter#=10)
K=5,S | Best LB    Feasi. UB  Gap(%) Time(s)| Best LB     Feasi. UB  Gap(%) Time(s) Gap to Gur.| Feasi. UB  Time(s) Gap to Gur.
1     | -846381.6  -846381.6  0      30.5   | -1327032.6  -846381.6  56.79  97.1    0.00%      | -846381.6   997.8  0.00%
2     | -846323.3  -846323.3  0      35.6   | -1327032.5  -846323.3  56.80  97.1    0.00%      | -846323.3   969.4  0.00%
3     | -846242.7  -846242.7  0      34.6   | -1327032.5  -846242.7  56.81  97.3    0.00%      | -846242.7  1039.8  0.00%
4     | -846172.6  -846172.6  0      39.5   | -1327032.5  -846172.6  56.83  96.8    0.00%      | -846172.6  1000.8  0.00%
5     | -846077.6  -846077.6  0      37.8   | -1327032.4  -846077.6  56.85  97.3    0.00%      | -846077.6  1066.1  0.00%
6     | -846137.4  -846137.4  0      42.0   | -1327032.4  -846137.4  56.83  96.3    0.00%      | -846137.4  1024.8  0.00%
7     | -846067.7  -846067.7  0      46.0   | -1327032.4  -846067.7  56.85  96.5    0.00%      | -846067.7  1107.7  0.00%
8     | -845994.5  -845994.5  0      44.0   | -1327032.2  -845994.5  56.86  97.0    0.00%      | -845994.5   811.4  0.00%
9     | -845934.9  -845934.9  0      45.5   | -1327032.1  -845934.9  56.87  96.7    0.00%      | -845934.9   504.6  0.00%
10    | -845862.5  -845862.5  0      39.7   | -1327031.9  -845862.5  56.89  96.4    0.00%      | -845862.5   562.5  0.00%
11    | -845896.3  -845896.3  0      44.7   | -1327031.2  -845896.3  56.88  96.3    0.00%      | -845896.3   498.2  0.00%
12    | -845824.9  -845824.9  0      36.7   | -1327031.0  -845824.9  56.89  96.5    0.00%      | -845824.9   525.8  0.00%
13    | -845737.0  -845737.0  0      44.9   | -1327030.6  -845737.0  56.91  96.5    0.00%      | -845737.0   518.9  0.00%
14    | -845686.8  -845686.8  0      54.1   | -1327030.2  -845686.8  56.92  96.6    0.00%      | -845686.8   485.2  0.00%
15    | -845606.0  -845606.0  0      60.7   | -1327029.9  -845606.0  56.93  96.3    0.00%      | -845606.0   549.5  0.00%
Aver. | -845996.4  -845996.4  0      42.4   | -1327031.8  -845996.4  56.86  96.7    0.00%      | -845996.4   777.5  0.00%
From Table 5.3 we see that all methods obtained the optimal solution for each scenario
sub-problem; the running time of the Tabu method is generally larger than that of the LR method.
Table 5.4: Computational result (N=100, K=10, S=3 - instance 2)

N=100, |              Gurobi                  |        LR method (iter# = 200)                    |  Tabu method (L=5, iter#=10)
K=10,S | Best LB     Feasi. UB   Gap(%) Time(s)| Best LB     Feasi. UB   Gap(%) Time(s) Gap to Gur.| Feasi. UB   Time(s) Gap to Gur.
1      | -2710068.6  -1157181.1  134.00 6527.1 | -3054149.6  -1157387.7  163.88 457.5   -0.02%     | -1157390.5   562.1  -0.02%
2      | -2708069.7  -1155062.0  134.00 6295.7 | -3054148.0  -1155085.5  164.41 441.9    0.00%     | -1155088.4  1157.7   0.00%
3      | -2718771.5  -1154562.3  135.00 4910.4 | -3054140.9  -1155205.0  164.38 482.3   -0.06%     | -1155205.0   742.7  -0.06%
Aver.  | -2712303.3  -1155601.8  135.00 5911.1 | -3054146.2  -1155892.8  164.22 460.6   -0.03%     | -1155894.6   820.9  -0.03%
Table 5.5: Computational result (N=100, K=10, S=10 - instance 3)

N=100, |              Gurobi                  |        LR method (iter# = 200)                    |  Tabu method (L=5, iter#=10)
K=10,S | Best LB     Feasi. UB   Gap(%) Time(s)| Best LB     Feasi. UB  Gap(%) Time(s) Gap to Gur.| Feasi. UB  Time(s) Gap to Gur.
1      | -1900135.4  -1900135.4    0.00 1603.8 | -1947611.6  -748046.4  160.36 412.9   60.63%     | -748066.4  2429.1  60.63%
2      | -1626176.0   -747934.5  117.00 5461.7 | -1947611.3  -747921.5  160.40 412.6    0.00%     | -747934.5  2855.8   0.00%
3      | -1647390.9   -730632.4  125.00 5370.6 | -1947611.2  -747768.6  160.46 410.1   -2.35%     | -747792.3  2943.2  -2.35%
4      | -1588360.3   -730315.8  117.00 5220.9 | -1947611.1  -747475.5  160.56 409.2   -2.35%     | -747485.6  3717.8  -2.35%
5      | -1676354.3   -730567.6  129.00 4942.1 | -1947610.5  -747726.7  160.47 401.6   -2.35%     | -747734.5  2641.2  -2.35%
6      | -1605558.5   -725776.0  121.00 5022.4 | -1947610.4  -747583.2  160.52 399.4   -3.01%     | -747601.2  1945.0  -3.01%
7      | -1617124.3   -708514.8  128.00 5482.3 | -1947610.2  -747439.5  160.57 404.4   -5.49%     | -747451.0  1486.2  -5.50%
8      | -1623408.6   -715362.9  127.00 5065.9 | -1947610.3  -747281.2  160.63 402.9   -4.46%     | -747297.6  1449.2  -4.46%
9      | -1673298.3   -732978.2  128.00 4596.0 | -1947610.0  -747690.9  160.48 396.0   -2.01%     | -747699.1  1403.6  -2.01%
10     | -1636228.5   -730383.8  124.00 5073.0 | -1947609.5  -747551.3  160.53 395.1   -2.35%     | -747561.8  1261.5  -2.35%
Aver.  | -1659403.5   -845260.1  112.00 4783.8 | -1947610.6  -747648.5  160.50 404.4    3.63%     | -747662.4  2213.3   3.63%
From Tables 5.4 and 5.5, we see that the Tabu method finds better solutions than the LR method;
however, its running time is longer. Different parameter settings for both methods
are tested to speed up the solving process in Appendix B.3. We embed the LR and Tabu
methods into the Progressive Hedging algorithm for the whole problem in the next section.
5.4 Progressive Hedging for FP problem
5.4.1 Design a lower bound
A Lagrangian relaxation technique is used in this section to obtain a quality lower bound for
model (5.8) - (5.16). First we convert the equality non-anticipativity constraints (5.13) and (5.14) into
the following equivalent inequality constraints (5.43) - (5.46):

−x^s_{ij} + ∑_{s∈S} p^s x^s_{ij} ≤ 0,  ∀(i, j) ∈ A^{0s}, ∀s ∈ S   (5.43)

x^s_{ij} − ∑_{s∈S} p^s x^s_{ij} ≤ 0,  ∀(i, j) ∈ A^{0s}, ∀s ∈ S   (5.44)

−g^s_i + ∑_{s∈S} p^s g^s_i ≤ 0,  ∀i ∈ N, ∀s ∈ S   (5.45)

g^s_i − ∑_{s∈S} p^s g^s_i ≤ 0,  ∀i ∈ N, ∀s ∈ S   (5.46)
Then we assign μ^{s−}_{ij} ≥ 0 and μ^{s+}_{ij} ≥ 0 to constraints (5.43) and (5.44), respectively, and φ^{s−}_i ≥ 0
and φ^{s+}_i ≥ 0 to constraints (5.45) and (5.46), respectively. The Lagrangian objective for
the whole program is then:

LR_LB(x, g, y, μ, φ)
= ∑_{s∈S} p^s [ ∑_{(i,j)∈A^{0s}} c_{ij} x^s_{ij} − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} ]
    + ∑_{s∈S} p^s ∑_{(i,j)∈A^{0s}} [ μ^{s−}_{ij} (−x^s_{ij} + ∑_{s∈S} p^s x^s_{ij}) + μ^{s+}_{ij} (x^s_{ij} − ∑_{s∈S} p^s x^s_{ij}) ]
    + ∑_{s∈S} p^s ∑_{i∈N} [ φ^{s−}_i (−g^s_i + ∑_{s∈S} p^s g^s_i) + φ^{s+}_i (g^s_i − ∑_{s∈S} p^s g^s_i) ]

= ∑_{s∈S} p^s [ ∑_{(i,j)∈A^{0s}} ( c_{ij} + μ^{s+}_{ij} − μ^{s−}_{ij} − p^s ∑_{s∈S} (μ^{s+}_{ij} − μ^{s−}_{ij}) ) x^s_{ij} − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} ]
    + ∑_{s∈S} p^s ∑_{i∈N} ( φ^{s+}_i − φ^{s−}_i − p^s ∑_{s∈S} (φ^{s+}_i − φ^{s−}_i) ) g^s_i

Setting U^s_{ij} = c_{ij} + μ^{s+}_{ij} − μ^{s−}_{ij} − p^s ∑_{s∈S} (μ^{s+}_{ij} − μ^{s−}_{ij}) and
Φ^s_i = φ^{s+}_i − φ^{s−}_i − p^s ∑_{s∈S} (φ^{s+}_i − φ^{s−}_i),
the primal problem becomes the minimization of LR_LB(x, g, y, μ, φ), i.e.:

min_{x,g,y} ∑_{s∈S} p^s [ ∑_{(i,j)∈A^{0s}} U^s_{ij} x^s_{ij} + ∑_{i∈N} Φ^s_i g^s_i − ∑_{(i,j)∈A^{1s}} (1 − c^s_{ij}) y^s_{ij} ]   (5.47)

s.t. (5.31) − (5.36)   (5.48)
(5.47) can be decomposed across scenarios, and the associated dual problem is
max_{μ≥0, φ≥0} min_{x,g,y} LR_LB(x, g, y, μ, φ). The dual variables can be updated by the sub-gradient
method, as we did for the scenario sub-problem in Section 5.3.1. At each iteration, a Lagrangian
lower bound and a feasible upper bound are generated at the same time in the proposed
Progressive Hedging algorithm in the next section.
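The projected sub-gradient update used for the non-negative multipliers (μ, φ) takes the form sketched below (the step size and sub-gradient values are illustrative assumptions):

```python
import numpy as np

def subgradient_step(mult, subgrad, step):
    """One projected sub-gradient ascent step for non-negative multipliers:
    mult <- max(0, mult + step * subgrad)."""
    return np.maximum(0.0, mult + step * subgrad)

mu = np.array([0.0, 0.5, 1.0])
d = np.array([1.0, -4.0, 0.5])   # sub-gradients = relaxed-constraint violations
print(subgradient_step(mu, d, step=0.2))  # [0.2 0.  1.1]
```

The projection onto the non-negative orthant keeps the dual iterates feasible for the sign constraints μ ≥ 0 and φ ≥ 0.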
5.4.2 Progressive Hedging method
Note that maximizing the final wealth implies minimizing the transaction cost in the constraints.
We propose a Progressive Hedging algorithm that adjusts the cost ratios on the arcs so that the
first-stage decision variables converge as much as possible. We adjust the linear coefficients
of the first-stage decision variables iteratively, as follows. At the first stage we
choose the portfolio components arbitrarily, and the corresponding transaction cost occurs, i.e.,
coefficient C^s_{ij} = c_{ij} on arc (i, j). As the scenarios are revealed, we adjust C^s_{ij} on arc (i, j)
and F^s_i on arc (i^0, i^1) at the same time in the objective function. If g^s_i is 0 in some scenarios
while it is 1 in most other scenarios, we reward the arcs coming into node i and penalize the arcs
going out of node i in those scenarios, so that more value remains at node i; meanwhile F^s_i
is decreased so that flow can pass along arc (i^0, i^1). Conversely, if g^s_i is 1 in some scenarios while
it is 0 in most other scenarios, we penalize the arcs coming into node i and reward the arcs going
out of node i in those scenarios, so that more value leaves node i; meanwhile F^s_i is increased
so that the flow leaves arc (i^0, i^1). We implement this strategy in the following algorithm.
Progressive Hedging Algorithm

Step 0 (Initialization):
    v ← 0
    λ^{s,v}_{ij} ← 0, ∀(i, j) ∈ A^{0s}, ∀s ∈ S;  π^{s,v}_i ← 0, ∀i ∈ N, ∀s ∈ S   (for constraints (5.13) and (5.14))
    ρ^v ← ρ^0
    μ^{s−,v}_{ij}, μ^{s+,v}_{ij} ← 0, ∀(i, j) ∈ A^{0s}, ∀s ∈ S;  φ^{s−,v}_i, φ^{s+,v}_i ← 0, ∀i ∈ N, ∀s ∈ S   (for constraints (5.43) - (5.46))
    For all s ∈ S:
        C^{s,v}_{ij} ← c_{ij}, ∀(i, j) ∈ A^{0s};  F^{s,v}_i ← 0, ∀i ∈ N
        Solve the corresponding FP sub-problem (5.30) - (5.36) by the LR method and the Tabu search method.
    x^v_{ij} ← ∑_{s∈S} p^s x^{s,v}_{ij}, ∀(i, j) ∈ A^{0s}, ∀s ∈ S
    g^v_i ← ∑_{s∈S} p^s g^{s,v}_i, ∀i ∈ N, ∀s ∈ S
    Calculate and evaluate g^{M,v} (the first K nodes with the largest probability in g^v_i).
    Calculate and evaluate x^{M,v} (the aggregated value of x^{s,v}_{ij} under g^{M,v}).
    feasible upper bound ← (x^{M,v}, g^{M,v})

Step 1 (Coefficient adjustment):
    v ← v + 1. For all s ∈ S:
        C^{s,v}_{ij} ← c_{ij} + λ^{s,v−1}_{ij} − ρ^{v−1} x^{v−1}_{ij}, ∀(i, j) ∈ A^{0s}
        F^{s,v}_i ← 0 + π^{s,v−1}_i − ρ^{v−1} g^{v−1}_i + ρ^{v−1}/2, ∀i ∈ N
        Solve the corresponding FP sub-problem (5.30) - (5.36) by the LR method and the Tabu search method.
    x^v_{ij} ← ∑_{s∈S} p^s x^{s,v}_{ij}, ∀(i, j) ∈ A^{0s}, ∀s ∈ S
    g^v_i ← ∑_{s∈S} p^s g^{s,v}_i, ∀i ∈ N, ∀s ∈ S
    Calculate and evaluate g^{M,v} (the first K nodes with the largest probability in g^v_i).
    Calculate and evaluate x^{M,v} (the aggregated value of x^{s,v}_{ij} under g^{M,v}).
    Update the minimal upper bound if (x^{M,v}, g^{M,v}) gives the current best.
    For all s ∈ S:
        U^{s,v}_{ij} ← c_{ij} + μ^{s+,v}_{ij} − μ^{s−,v}_{ij} − p^s ∑_{s∈S} (μ^{s+,v}_{ij} − μ^{s−,v}_{ij}), ∀(i, j) ∈ A^{0s}
        Φ^{s,v}_i ← 0 + φ^{s+,v}_i − φ^{s−,v}_i − p^s ∑_{s∈S} (φ^{s+,v}_i − φ^{s−,v}_i), ∀i ∈ N
    Generate a lower bound by solving the corresponding FP sub-problem (5.47).
    Calculate gap^v = (UB − LB) / |UB|.

Step 2 (Lagrangian multiplier and penalty update):
    λ^{s,v}_{ij} ← λ^{s,v−1}_{ij} + ρ^{v−1} (x^{s,v}_{ij} − x^{v−1}_{ij}), ∀(i, j) ∈ A^{0s}, ∀s ∈ S
    π^{s,v}_i ← π^{s,v−1}_i + ρ^{v−1} (g^{s,v}_i − g^v_i), ∀i ∈ N, ∀s ∈ S
    ρ^v ← α ρ^{v−1}
    μ^v_{ij} ← max(0, μ^{v−1}_{ij} + t^v_μ d^v_μ)   (gradient method)
    φ^v_i ← max(0, φ^{v−1}_i + t^v_φ d^v_φ)   (gradient method)

Step 3 (Move to next iteration):
    Calculate δ^v = ∑_{s∈S} p^s ‖ (x_{ij}, g_i)^{s,v} − (x_{ij}, g_i)^v ‖ and gap^v = (UB − LB) / |UB|.
    If δ^v ≥ ε or gap^v ≥ η, GO TO Step 1.
The aggregation operator g^v_i defines the opening or closing probability of arc (i^0, i^1);
we open the first K nodes with the largest probabilities in g^v_i and close the others. The
aggregation of x^{s,v}_{ij} under g^{M,v} is then a feasible solution: for any s ∈ S, x^{s,v}_{ij} is a feasible
point satisfying constraint (5.9), so x^{M,v} = ∑_{s∈S} p^s x^{s,v}_{ij} is also a feasible point for constraint
(5.9). Therefore the objective under (x^{M,v}, g^{M,v}) is a feasible upper bound.
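The aggregation and top-K rounding used to build (x^{M,v}, g^{M,v}) can be sketched as follows (the scenario data are hypothetical):

```python
import numpy as np

def aggregate_and_round(p, g_scen, x_scen, K):
    """Probability-weighted aggregation of scenario solutions, followed by
    top-K rounding of the node-opening probabilities to restore the
    cardinality constraint."""
    g_bar = p @ g_scen                  # g_bar[i] = sum_s p[s] * g[s][i]
    x_bar = p @ x_scen                  # implementable first-stage flows
    gM = np.zeros(g_bar.shape, dtype=int)
    gM[np.argsort(-g_bar)[:K]] = 1      # open the K most likely nodes
    return g_bar, x_bar, gM

p = np.array([0.5, 0.3, 0.2])                # scenario probabilities
g_scen = np.array([[1, 0, 1, 0],             # binary node selections per scenario
                   [1, 1, 0, 0],
                   [0, 1, 1, 0]])
x_scen = np.array([[10.0, 0.0, 5.0, 0.0],    # first-stage flows per scenario
                   [ 8.0, 4.0, 0.0, 0.0],
                   [ 0.0, 6.0, 3.0, 0.0]])
g_bar, x_bar, gM = aggregate_and_round(p, g_scen, x_scen, K=2)
print(gM)  # [1 0 1 0]: nodes 0 and 2 are opened
```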
5.4.3 Numerical experiment
We test large instances in this section. First we list the computational results for different
types of problem from the literature in Table 5.6.
Table 5.6: Computational result in literature

Reference                             | # of integer variables | Total # of variables | # of constraints | # of scenarios | Maximal iteration | Time (min)
Gade et al. (2013) [58]               | 1,200 (binary)         | 16,194               | 24,092           | 10             | 958               | /
Crainic et al. (2011) [43]            | 10,800 (binary)        | 874,800              | 225,800          | 90             | 50                | 312.68
Watson and Woodruff (2011) [147]      | 300 (binary)           | 405                  | 1,140            | 10             | 321.8 (Aver.)     | 321.8 (Aver.)
Veliz et al. (2011) [145]             | 50,544 (binary)        | 77,760               | 110,856          | 324            | /                 | 71.5
Lokketangen and Woodruff (1996) [103] | N (binary)             | 2N + N^2             | 2N               | 10             | /                 | 67.8
Takriti et al. (1996) [142]           | 2,400 (binary)         | 4,800                | /                | 22             | 924               | 220
From Table 5.6, we see that the largest scenario number equals 324 in [145]; however, the
total number of variables there is only 77,760. Crainic et al. [43] solved their problem with 90 scenarios
and 874,800 variables, which is impressive. For our FP problem, we have (2N^2 + N)S variables
and (N^2 + 3N + 1)S constraints. We set N = 100 and test the performance of the PH method
for S = 15, 30, 50, and 75. Our largest instance then includes 1,507,500 variables and 772,575
constraints. Table 5.7 lists the parameter settings of our computations for the models and the
PH algorithm; the parameters R^s_i are generated by the moment matching method of
Section 5.2.2. All instances were run on a 2.66 GHz computer with 3 GB of memory
available.
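The instance sizes quoted above follow directly from these counts; a quick arithmetic check for N = 100 and S = 75:

```python
def problem_size(N, S):
    """Variable and constraint counts of the scenario-split FP model:
    (2N^2 + N)S variables and (N^2 + 3N + 1)S constraints."""
    return (2 * N ** 2 + N) * S, (N ** 2 + 3 * N + 1) * S

n_vars, n_cons = problem_size(N=100, S=75)
print(n_vars, n_cons)  # 1507500 772575
```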
Table 5.7: Parameter setting for the model and PH algorithm

For the model                   | For the PH algorithm
b_i      | 100                  | Outer loop iteration #              | 7
l^x_i    | 10                   | Iteration # for sub-gradient method | 60
u^x_i    | sum(b_i)             | Iteration # for Tabu search method  | 10
c^0_ij   | .05                  | ρ^0                                 | 1 + log(arc#)(1 + D^0)
c^s_ij   | rand(.08 .05 .02)    | D^0                                 | The inconsistency level
l^y_ij   | 0                    |                                     |
u^y_ij   | sum(b_i)             |                                     |
The initial ρ^0 is determined by the inconsistency level D^0, i.e., the number of arcs and nodes
for which there is no consensus among the scenario solutions [43].
Table 5.8: Bound details under different methods for S=15

             |          Solve by Gurobi           |       PH with LR sub-solver        | PH with Tabu sub-solver | 1 - UB_LR/UB_Tabu | 1 - T_LR/T_Tabu
(N,K,S)      | Best LB   Feasi. UB Gap(%)  Time(s)| Best LB   Feasi. UB Gap(%) Time(s) | Feasi. UB   Time(s)     |                   |
(100,5,15)   | -49266.3   -6569.5   649.9  35966.3|  -7269.9   -6044.3   20.3  13143.8 |   -6334.7    8905.2     | -4.80%            | -32.25%
(100,10,15)  | -98532.6  -17686.8   457.1  35996.7| -19640.3  -17301.3   13.5  17311.9 |  -16689.9   11452.9     |  3.53%            | -33.84%
(100,15,15)  | -143613.3 -28574.8   402.6  35995.6| -30861.8  -28492.7    8.3  24840.5 |  -28438.6   26276.9     |  0.19%            |   5.78%
(100,20,15)  | -182081.9 -43744.7   316.2  35995.7| -46105.3  -43757.9    5.4  20682.3 |  -43227.7   27904.9     |  1.21%            |  34.92%
(100,25,15)  | -205748.2   -429.8 47772.4  35990.1| -66188.9  -63499.2    4.2  26392.5 |  -63141.5   37910.9     |  0.56%            |  43.64%
(100,30,15)  | -225297.2   -517.5 43432.8  36000.6| -90249.4  -88036.1    2.5  28434.8 |  -87500.7   49680.5     |  0.61%            |  74.72%
(100,35,15)  | -244009.9   -605.2 40221.7  36000.6| -118826.1 -116432.3   2.1  31894.4 | -116176.2   61983.4     |  0.22%            |  94.34%
(100,40,15)  | -262026.6   -694.5 37626.9  36000.6| -151985.4 -149770.7   1.5  46146.0 | -148147.4   53219.7     |  1.08%            |  15.33%
Aver.        | -176322.0 -12352.9 21360.0  35993.3| -66390.9  -64166.8    7.2  26105.8 |  -63707.1   34666.8     |  0.33%            |  25.33%
The running time for Gurobi is limited to 10 hrs in Table 5.8, and the scenario sub-problems
are solved by the LR method and the Tabu search method separately. It is clear that the gaps
decreased as K increased, but the gap via PH shrank more quickly than that of Gurobi. The PH methods
returned a better solution than Gurobi when K = 20. From K = 25 to 40, Gurobi could not return
a quality solution, while the solutions from the PH methods remain acceptable from the practical
point of view.

Comparing the two PH methods with different sub-solvers, the Tabu search returns
a better solution in a shorter time when K = 5. When K is greater than 5, the solutions are close
to each other but the running time of the Tabu search is larger than that of the LR method.
Table 5.9: Bound details under different methods for S=30

             |      Solve by Gurobi           |      PH with LR sub-solver         | PH with Tabu sub-solver | 1 - UB_LR/UB_Tabu | 1 - T_LR/T_Tabu
(N,K,S)      | Best LB Feasi. UB Gap(%) Time(s)| Best LB  Feasi. UB Gap(%) Time(s) | Feasi. UB   Time(s)     |                   |
(100,10,30)  | NaN       -93.6   NaN   36079.8 | -19997.3  -17398.0  14.9  58935.7 | -17143.3    84902.2     |  1.46%            | 44.06%
(100,15,30)  | NaN      -140.1   NaN   36003.2 | -30835.0  -28254.9   9.1  60523.4 | -28082.4    90442.1     |  0.61%            | 49.43%
(100,20,30)  | NaN      -186.6   NaN   86378.8 | -46307.8  -43433.3   6.6  73628.4 | -43440.0   105044.2     | -0.02%            | 42.67%
Aver.        | NaN      -140.1   NaN   52820.6 | -32380.0  -29695.4  10.2  64362.5 | -29555.2    93462.8     |  0.47%            | 45.21%
From Table 5.9, we can see that Gurobi cannot return a lower bound for S = 30, and
the quality of its feasible solutions is poor. On the other hand, both PH methods return
reasonable bounds within a limited time. The solutions of LR and Tabu search are close,
but the Tabu search consumed on average 45% more time than the LR method.
Table 5.10: Bound details under different methods for S=50

             |   Solve by Gurobi              |      PH with LR sub-solver         | PH with Tabu sub-solver | 1 - UB_LR/UB_Tabu | 1 - T_LR/T_Tabu
(N,K,S)      | Best LB Feasi. UB Gap(%) Time(s)| Best LB  Feasi. UB Gap(%) Time(s) | Feasi. UB   Time(s)     |                   |
(100,20,50)  | /       /         /     16 hrs  | -45982.5  -43751.4   5.1 182530.5 | -43020.7   239713.6     | 1.67%             | 31.33%
(100,25,50)  |         |         |             | -66156.7  -63434.7   4.3 185483.1 | -62840.3   269307.1     | 0.94%             | 45.19%
(100,30,50)  |         |         |             | -90247.3  -87794.6   2.8 177666.3 | -85930.3   307313.8     | 2.12%             | 72.97%
Aver.        |         |         |             | -67462.2  -64993.6   4.1 181893.3 | -63930.4   272111.5     | 1.64%             | 49.60%
For the instances with S = 50, Gurobi failed to solve the problem within 16 hrs (Table 5.10).
Meanwhile, the LR-based PH method returns reasonable bounds consistently. The average gap between
the lower and upper bounds is 4.96% and the average running time is around 50 hrs. From the last two
columns, we can see that the Tabu search method is more time-consuming than the LR method.
The same tendency occurred for the larger scenario number S = 75 in Table 5.11.
Table 5.11: Bound details under different methods for S=75

             Solve by Gurobi              PH with LR sub-solver                 PH with Tabu sub-solver
(N,K,S)      BestLB  Feasi.UB  Gap(%)  Time(s)  BestLB    Feasi.UB  Gap(%)  Time(s)    Feasi.UB  Time(s)    (UB_LR-UB_Tabu)/UB_LR  (T_Tabu-T_LR)/T_LR
(100,20,75)  /       /         /       /        -45801.2  -43804.1  4.6     345649.0   -43394.5  496517.2   0.93%                  43.65%
(100,25,75)  /       /         /       /        -66016.7  -63580.6  3.8     365281.0   -63578.5  512597.3   0.00%                  40.33%
(100,30,75)  /       /         /       /        -90143.4  -87995.3  2.4     370418.0   -88045.4  516481.3   -0.06%                 39.43%
Aver.        /       /         /       /        -67320.4  -65126.7  3.6     360449.3   -65006.1  508531.9   0.19%                  41.08%
In a nutshell, our numerical results show that the proposed PH method performs consistently
with different sub-solvers. Further numerical tests are presented next for the index tracking
problem with network structure.
Chapter 5. Progressive Hedging for Cardi. Constrained FP 88
5.5 Progressive Hedging for Index Tracking problem
We extend the FP framework to the index tracking problem in this section. The objective function
of the FP problem, (5.8), is modified as follows:
min | Σ_{i∈N} x_i − I_0 | + Σ_{s∈S} p_s | Σ_{i∈N} y_i^s − I_s^1 | + Σ_{(j,i)∈A_0} c_{ij} x_{ij} − Σ_{s∈S} p_s ( Σ_{(i,j)∈A_s^1} (1 − c_{ij}^s) y_{ij}^s )
where I_0 is the market value of the target index at stage 0 and I_s^1 is the market value of the
target index at stage 1 under scenario s.
The objective can be seen as a trade-off between two goals: minimizing the tracking errors
for both stages and maximizing the final wealth at the last stage. Different types of decision
makers may emphasize different aspects, so we assign a weight vector (α, β) to the goals. For
example, setting the ratio β/α = 1/10 means that the goal of minimizing the tracking error is 10
times more important than the goal of maximizing the final wealth, and vice versa. We first set
the ratio β/α = 1/1 and test more ratios later in this section.
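As a minimal illustration of the weighting (the helper function and the numbers are hypothetical, not the thesis model):

```python
# Hedged sketch of the two-goal trade-off: the tracking-error goal is weighted
# by alpha and the final-wealth goal by beta; beta/alpha = 1/10 makes tracking
# 10 times more important than wealth. All values here are illustrative.
def weighted_objective(tracking_error, final_wealth, alpha, beta):
    # tracking error is minimized, wealth is maximized (hence the minus sign)
    return alpha * tracking_error - beta * final_wealth

print(weighted_objective(5.0, 120.0, alpha=10.0, beta=1.0))  # 10*5 - 1*120 = -70.0
```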
The objective can be linearized by introducing nonnegative deviation variables X^+, X^- and Y_s^+, Y_s^-:

Stage 0:  | Σ_{i∈N} x_i − I_0 | = X^+ + X^-,   Σ_{i∈N} x_i − I_0 = X^+ − X^-,   X^+, X^- ≥ 0

Stage 1:  | Σ_{i∈N} y_i^s − I_s^1 | = Y_s^+ + Y_s^-,   Σ_{i∈N} y_i^s − I_s^1 = Y_s^+ − Y_s^-,   Y_s^+, Y_s^- ≥ 0
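The effect of the linearization can be shown with a small sketch (plain Python, illustrative numbers): each absolute deviation is split into two nonnegative parts whose difference is the signed deviation and whose sum, at a minimum, equals its absolute value:

```python
# Sketch of the absolute-value linearization: |total - I0| is replaced by
# Xp + Xm with total - I0 = Xp - Xm and Xp, Xm >= 0. In a minimization at
# most one of the pair is positive, so Xp + Xm recovers the absolute value.
def split_deviation(portfolio_total, index_value):
    dev = portfolio_total - index_value
    Xp, Xm = max(dev, 0.0), max(-dev, 0.0)
    return Xp, Xm

print(split_deviation(9800.0, 10000.0))   # (0.0, 200.0): Xp - Xm = -200, Xp + Xm = 200
print(split_deviation(10150.0, 10000.0))  # (150.0, 0.0)
```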
Then a network structure index tracking model can be formulated as follows:

min Σ_{s∈S} p_s [ X_s^+ + X_s^- + Y_s^+ + Y_s^- + Σ_{(i,j)∈A_s^0} c_{ij} x_{ij}^s − Σ_{(i,j)∈A_s^1} (1 − c_{ij}^s) y_{ij}^s ]   (5.49)

s.t.  (5.9), (5.10), (5.11), (5.12), (5.15), (5.16)
      x_{ij}^s = x_{ij},  ∀(i,j) ∈ A_s^0, ∀s ∈ S   (5.50)
      g_i^s = g_i,  ∀i ∈ N, ∀s ∈ S   (5.51)
      X_s^+ = E(X_s^+),  ∀s ∈ S   (5.52)
      X_s^- = E(X_s^-),  ∀s ∈ S   (5.53)
      Σ_{i∈N} x_i^s − I_0 = X_s^+ − X_s^-,  ∀s ∈ S   (5.54)
      Σ_{i∈N} y_i^s − I_s^1 = Y_s^+ − Y_s^-,  ∀s ∈ S   (5.55)
      X_s^+, X_s^- ≥ 0,  Y_s^+, Y_s^- ≥ 0,  ∀s ∈ S   (5.56)
Constraints (5.50) - (5.53) denote the non-anticipativity constraints for the different variables.
After removing the absolute values, the whole program becomes an SMIP, and the same Progressive
Hedging procedure can be applied to the above index tracking model. Similar to the FP problem, we
test the index tracking model under different scenario counts. The initial investment b_i at
node i is scaled by the market value weights. For example, suppose that the total market value
of the index SP100 at stage 0 is 10,000, i.e. 100 * N = 10,000, and the weight of the first asset
is 0.01978352 according to the real data; then the initial cash on the first node is 197.8352.
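The scaling rule can be written out directly (the weight value is the one quoted from the data above):

```python
# Initial cash on a node = total index market value * asset's market-value
# weight, following the SP100 example in the text.
total_value = 100 * 100           # 100 * N = 10,000 at stage 0
weight_first_asset = 0.01978352   # weight of the first asset from the data
initial_cash = total_value * weight_first_asset
print(initial_cash)               # ≈ 197.8352
```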
Numerical results are listed in Tables 5.12 - 5.15.
Table 5.12: Numerical result (N=100, K, S=15)

        Solve by Gurobi                              PH with LR sub-solver
K       BestLB      Feasi.UB   Gap(%)   Time(s)            BestLB     Feasi.UB   Gap(%)  Time(s)
10      -88067.91   -17633.58  399.43   64797.51           -19367.23  -17366.64  11.52   21927.27
15      -127373.39  -28020.35  354.57   64793.63           -30865.99  -27835.44  10.89   27490.91
20      -154769.14  19050.05   912.43   64796.69           -45358.33  -42198.70  7.49    25854.55
25      -175150.75  19051.59   1019.35  64798.02           -63277.16  -61404.64  3.05    29290.91
Aver.   -136340.29  -1888.07   671.45   64796.46 (18 hrs)  -39717.18  -37201.36  8.24    26140.91 (7.3 hrs)
The running time limit for Gurobi was set to 18 hrs, while the average running time of the PH
method in Table 5.12 is 7.3 hrs. Meanwhile, the objective values obtained by the PH method are
close to Gurobi's when K = 10, 15, and the PH method returns better solutions for the instances
K = 20, 25. The PH method also returns better lower bounds for all instances, which makes its
average gap (8.24%) much smaller than the average gap by Gurobi (671.45%).
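The Gap(%) column is consistent with the relative duality gap (UBD − LBD)/|UBD|; a quick sketch checking Gurobi's K = 10 row of Table 5.12:

```python
# Gap(%) recomputed from the K = 10 Gurobi bounds of Table 5.12.
best_lb, feasi_ub = -88067.91, -17633.58
gap_pct = 100 * (feasi_ub - best_lb) / abs(feasi_ub)
print(round(gap_pct, 2))  # 399.43
```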
Table 5.13: Numerical result (N=100, K, S=30)

        Solve by Gurobi                            PH with LR sub-solver
K       BestLB  Feasi.UB    Gap(%)  Time(s)            BestLB       Feasi.UB     Gap(%)  Time(s)
10      NaN     20190.1678  NaN     64794.73           -19604.6046  -17609.1491  11.33   202203.9
15      NaN     20191.3594  NaN     64824.99           -30743.2959  -28319.4004  8.56    208380.8
20      NaN     20192.5114  NaN     64801.39           -46115.7606  -42890.0485  7.52    208380.7
Aver.   NaN     20191.3462  NaN     64807.04 (18 hrs)  -32154.5537  -29606.1993  9.14    206321.8 (57 hrs)
From Table 5.13, we observe the same tendency: our PH method returns higher quality
solutions than Gurobi. As the number of scenarios increases, Gurobi can only return heuristic
solutions within the time limit, and such solutions are not practical for the FP problem.
Table 5.14: Numerical result (N=100, K, S=50)

        Solve by Gurobi                   PH with LR sub-solver
K       BestLB  Feasi.UB  Gap(%)  Time(s)  BestLB       Feasi.UB     Gap(%)  Time(s)
10      NaN     NaN       NaN     NaN      -18923.4467  -17393.4559  8.80    285362.35
15      NaN     NaN       NaN     NaN      -30234.0302  -28785.9878  5.03    283961.49
20      NaN     NaN       NaN     NaN      -45664.6593  -43492.7358  4.99    278422.73
Aver.   NaN     NaN       NaN     NaN      -31607.3787  -29890.7265  6.27    282582.19 (78 hrs)
Table 5.15: Numerical result (N=100, K, S=75)

        Solve by Gurobi                   PH with LR sub-solver
K       BestLB  Feasi.UB  Gap(%)  Time(s)  BestLB       Feasi.UB     Gap(%)  Time(s)
10      NaN     NaN       NaN     NaN      -19198.9195  -17593.8204  9.12    653910.86
15      NaN     NaN       NaN     NaN      -29386.032   -28979.2587  1.40    600369.88
20      NaN     NaN       NaN     NaN      -44864.5413  -43872.3357  2.26    626700.67
Aver.   NaN     NaN       NaN     NaN      -31149.8309  -30148.4716  4.26    626993.8 (174 hrs)
Note: "NaN" denotes out of memory.
From Tables 5.14 and 5.15, Gurobi cannot even start the solving process because of memory
issues when loading the large coefficient matrix, while the PH method consistently returns
practical solutions, and the gap between the lower and upper bounds is relatively small (around 5%).
Comparing the PH running times for the FP and index tracking problems in Figure 5.3, we can
see that the running time increases nearly linearly with the number of scenarios. The running
times for index tracking are larger than those for the FP problem, which is reasonable since
the index tracking problem includes more variables and constraints, and different goals
need to be satisfied.
Figure 5.3: Running time of PH method for different problems
Next we test more instances for different β/α ratios. We set β/α equal to 1/10, 1/1, and 10/1
respectively to represent different weightings of the goals. Table 5.16 lists the gap between
bounds and the solving time for different K and S.
Table 5.16: Test different ratios (N=100, K, S)

          β/α = 1/10                              β/α = 1/1                               β/α = 10/1
(K,S)     BestLB     Feasi.UB   Gap(%)  Time(hrs)  BestLB     Feasi.UB   Gap(%)  Time(hrs)  BestLB     Feasi.UB   Gap(%)  Time(hrs)
(10,15)   -18885.94  -18818.75  0.36    20.92      -17661.27  -17450.92  1.21    18.90      -7271.14   -7270.49   0.01    1.88
(15,15)   -31261.15  -31182.83  0.25    21.34      -28335.42  -28114.99  0.78    17.96      -9929.25   -9821.07   1.10    13.17
(20,15)   -48158.45  -48068.43  0.19    21.05      -42910.59  -42606.68  0.71    20.00      -12584.26  -12435.49  1.20    11.15
(25,15)   -69994.61  -69828.20  0.24    20.89      -61664.94  -61530.60  0.22    20.58      -15134.02  -15043.46  0.60    16.02
(10,30)   -18938.14  -18752.38  0.99    47.96      -17681.84  -17468.28  1.22    46.67      -7305.30   -7301.91   0.05    32.11
(15,30)   -31298.84  -31246.70  0.17    51.14      -28407.84  -28314.45  0.33    47.75      -9997.73   -9858.53   1.41    35.09
(20,30)   -48197.47  -48063.48  0.28    48.13      -42967.14  -42682.82  0.67    46.72      -12650.53  -12493.02  1.26    30.61
(10,50)   -18998.09  -18671.31  1.75    106.17     -17730.03  -17393.46  1.94    103.27     -7028.82   -6960.84   0.98    90.19
(15,50)   -31780.48  -31777.70  0.01    102.09     -28964.65  -28785.99  0.62    100.44     -10118.86  -9847.70   2.75    92.95
(20,50)   -48772.07  -48685.96  0.18    102.15     -43016.55  -42741.15  0.64    101.47     -12694.55  -12387.12  2.48    90.48
Aver.     /          /          0.44    54.18      /          /          0.83    52.38      /          /          1.18    41.36
From Table 5.16 we see that the average gaps are 0.44%, 0.83% and 1.18% for increasing β/α
ratios, while the average solving times are 54.18 hrs, 52.38 hrs and 41.36 hrs. This can
be explained as follows: for index tracking the PH algorithm mainly adjusts the node values on
the network, while the FP problem requires adjusting the values on both the nodes and the arcs
during the iterations, which makes the gap for β/α = 10/1 larger than the gap for β/α = 1/10.
Again, the running time increases linearly with the number of scenarios S. Overall, the PH
method returns high-quality solutions for the index tracking problem.
5.6 Conclusions and Discussion
We incorporated a cardinality restriction into the Financial Planning problem (an LP) and
developed it into an SMIP. Inspired by real applications, we decomposed the SMIP by scenarios
and effectively solved large instances of the FP and index tracking problems with the proposed
Progressive Hedging Algorithm. Subgradient and Tabu search methods were applied within the
Progressive Hedging framework to speed up the solving process. Numerical experiments showed
that our method can efficiently solve SMIPs with a large number of scenarios.
Chapter 6
Lagrangian Relaxation for
Cardinality Constrained Conic
Programming
In Chapter 5, we applied stochastic mixed integer programming to protect against the un-
certainty of asset returns in the Financial Planning problem. In this chapter we study a robust
optimization approach that is also immune to parameter uncertainties in both return and variance
for the index tracking problem, which can be captured by the presented Cardinality Constrained
Conic Programming.
6.1 Introduction to CCCP
Given a variable vector (x, t, y) ∈ Rn+L+n, cardinality constrained conic programming (CCCP)
can be written as:
min cTx+ dT y (6.1)
s.t. ‖Aix+ bi‖ ≤ ti, ∀i = 1, · · · , L (6.2)
Ex+Gt ≤ f (6.3)
eT y = q (6.4)
ljyj ≤ xj ≤ ujyj , ∀j = 1, · · · , n (6.5)
y ∈ {0, 1}^n (6.6)
where c, d ∈ R^n, E ∈ R^{m×n}, G ∈ R^{m×L}, f ∈ R^m, e ∈ R^n is the vector of all ones, and l_j,
u_j denote the lower and upper bounds for variable x_j. ‖•‖ denotes the standard Euclidean norm,
i.e. ‖z‖ = √(z^T z). Constraint (6.2) indicates that (A_i x + b_i, t_i) lies in the i-th Lorentz
cone of dimension (p_i + 1), with parameters A_i ∈ R^{p_i×n}, b_i ∈ R^{p_i}, t_i ∈ R. Without the
cardinality restriction on variable x, i.e. constraints (6.4) - (6.6), the CCCP reduces to a
Second-Order Cone Program (SOCP), which has been well studied in the literature [102, 6].
Model (6.1) - (6.6) is a primary class of Mixed-Integer Second-Order Cone Programming
(MISOCP), has significant influence on both theory and applications [15], and is a generalization
of mixed 0-1 linear and quadratic programs. This problem is particularly interesting to
us from both methodology and application points of view. First, the proposed CCCP is non-
convex, and therefore NP-hard, because of the binary requirement (6.6), and finding
an optimal or near-optimal solution of a large-scale CCCP within a reasonable time has long
challenged researchers in optimization. Numerical approaches, which can be broadly
categorized as exact or inexact methods, have emerged to shrink the bound gap
and obtain good solutions. Exact methods typically explore only part of the variable space,
using pruning rules and ordering heuristics to avoid visiting all of it while
maintaining feasibility.
maintaining the feasibility. Inexact strategies mainly utilize the local search techniques to e-
valuate a small neighborhood of current solutions, and quickly move to a better solution by
following a promising direction. There are substantial similarities and fundamental distinction-
s between the exact and inexact methods. Both method groups try to convert the NP-hard
problem to tractable sub-cases so that associated relaxed bounds and feasible solution can be
iteratively improved. The difference between them is that the exact method can theoretically
guarantee the optimal solution but requires exponential running time, while the inexact method
can quickly produce a reasonable solution but cannot guarantee the optimal solution.
Secondly, the proposed CCCP is an important mathematical tool for handling various real-life
problems. For example, in portfolio selection the cardinality constraints (6.4) - (6.6) control
the portfolio size while the conic constraint (6.2) is usually used to restrict or minimize the
portfolio variance. Investors struggle to decide a trade-off between the size and risk of a
portfolio. On one hand, the risk constraint (6.2) may be easily violated if the portfolio
concentrates on a few assets, i.e. q is small. On the other hand, fully replicating the market
(large q) is inefficient due to transaction costs, management fees, and other concerns, which
are captured by constraint (6.3). Moreover, Model (6.1) - (6.6) can be seen as a natural
extension of robust optimization, since parameter uncertainty sets can be formulated in conic
form and the cardinality constraint restricts the number of non-zero components in x.
Motivated by these applications, we design a Lagrangian-based inexact approach that can
quickly approximate the optimal solution of the CCCP, and then use the CCCP framework to
deal with a type of index tracking problem under an uncertain environment. More specifically, we
decompose the variable space into continuous and integer parts by relaxing the coupling constraint
(6.5); as a result, the associated Lagrangian subproblems, i.e. one SOCP and one 0-1
knapsack problem, can be solved efficiently. Moreover, a sub-gradient cut and fully regular cuts
are generated at each iteration to shrink the feasible set of the {0, 1}^n structure. Computational
observations show that the sub-gradient cut can significantly speed up the solving process. The
proposed Lagrangian relaxation scheme enriches the solution methodology for CCCP, and we
show the effectiveness of the LR method through a comparison with Gurobi's mixed integer
SOCP solver. To the best of our knowledge, ours is the first work in the current literature to
focus on relaxing the boundary constraint (6.5) of the CCCP.
We organize the rest of the chapter as follows: we present a literature review of CCCP and its
associated applications and methodologies in Section 6.2, and then propose a Lagrangian de-
composition scheme in Section 6.3. In Section 6.5, we compare our computational results with
those from Gurobi's mixed integer SOCP solver to illustrate the effectiveness of the LR method.
Finally, Section 6.6 concludes our work.
6.2 Literature Review
Second-Order Cone Programming (SOCP), i.e. Model (6.1) - (6.3), includes linear programming
(LP) and convex quadratic programming as special cases, and is itself a special case of
semidefinite programming (SDP). SOCP therefore strengthens the ties between linear
programs and non-linear convex programs, and solving SOCPs efficiently has attracted many
researchers over the last two decades. The extension of existing primal-dual based methods,
i.e. the interior point method and the active set method, from LP to SOCP is a natural
transition, and such transitions have proven successful for solving large optimization
problems. Lobo
et al. [102] showed that many engineering problems can be generalized as SOCP and pre-
sented a primal-dual based interior point method which generally requires 5 - 50 iterations in
their work. Alizadeh and Goldfarb [6] studied the algebraic properties of Jordan frames for
the second-order cone, and adopted a norm-2 centrality measure to obtain a polynomial time
interior point method. They pointed out that the method is numerically stable in tests on
both real applications and randomly generated problems. Also, using the concepts
of Jordan algebra, Tsuchiya [143] analyzed the complexity of variants of primal-dual path fol-
lowing methods for SOCP via extension of Nesterov and Todd (NT) direction [117, 118] and
HRVW/KSH/M direction [91, 110, 78] from that of SDP. His work proved that both types
of algorithm retain polynomial iteration-complexity, which depends on the number of
second-order cones. Kuo and Mittelmann [97] extended and developed the interior point method
in [7] to SOCP, and displayed the robustness of their method through the comparison with dif-
ferent solvers on testing extreme instances for many Operations Research (OR) problems. As a
matter of fact, there is a large amount of research showing that interior-point based algorithms
have polynomial time complexity for SOCP, LP and SDP [111, 112, 133]. Meanwhile, software
packages based on interior-point methods are currently available to efficiently handle SOCPs and
convex programs, e.g. SeDuMi [140], MOSEK [113], CPLEX [42], CVX [68], and GUROBI
[71].
Active-set based extensions for SOCP have also drawn much attention. Erdoğan
and Iyengar [52] studied a single-cone SOCP by dualizing its nonnegativity constraint to obtain
Lagrangian subproblems where the nullspace of coefficient matrices are projected onto associ-
ated orthonormal basis. They compared their active set method with SeDuMi for the randomly
generated network flow problems. Goldberg and Leyffer [65] recently designed a two-phase ac-
tive set method which firstly identified the active cones and then applied Newton-like method
to quickly obtain the solutions of sub-SOCPs. Numerical comparison with interior-point based
solvers was displayed in their work. Although a polynomial complexity analysis is not available,
active set methods exhibit a nice property that interior point methods lack: they can obtain
vertices for the MISOCP relaxation at the nodes of the branch-and-bound search tree. Aside from
the interior point method and active set method, simplex based approaches for SOCPs were studied
in [148, 67], and the method using polyhedral approximations of the second-order cone was
investigated in [14].
While both theories and methodologies for SOCPs have been well established, MISOCP
is relatively new but more attractive since it has broader applications. A comprehensive
survey about MISOCPs was compiled by Benson and Saglam [15]. They formulated numerous
examples in fields of operations management, engineering, and machine learning as MISOCPs,
and reviewed different approaches that solve MISOCP in the literature. Another interesting
application was introduced by Miyashiro and Takano in [109]. The authors improved the fitting
ability of a multiple linear regression model via MISOCP formulations to select a limited number
of factors. The benefit of extending strong duality from LP to convex programs does not carry
over to integer programming, and thus solution methods for MISOCP are more challenging and
rely mainly on methods such as branch-and-cut algorithms that solve SOCP or SDP relaxations
to reduce the number of nodes visited in the search tree.
Different advanced cuts for MISOCP were explored in the present literature. Cezik and Iyengar
[30] generated the linear cuts based on Chvatal-Gomory (C-G) procedure and convex quadratic
cuts (e.g. lift-and-project cut) through tighter relaxations to approach the convex hull of the
solution set. However, updating the dual vector in self-dual cone for C-G linear cut generation
is not clear in their work. Atamturk and Narayanan [8] showed that their conic mixed-integer
rounding cuts can efficiently reduce the root nodes of the branch-and-bound tree for solving
MISOCP problems. Drewes and Pokutta [49] derived a strong binary symmetric cut for a special
class of MISOCP where binary variables only occur within the conic constraints by extending
the Sherali–Adams framework [137]. Besides the cut generation based algorithms, there are also
other inexact methods such as outer approximation approach using subgradient linearizations
[50], non-linear reformulation to original MISOCP by smoothing and regularization [16] for
solving MISOCP.
The proposed CCCP can be seen as a special case of general MISOCP in which the binary vector
y affects the continuous variable x only through linear polyhedra, so the methods reviewed
above for general MISOCP could also be applied to our CCCP. Moreover, the special structure
of the constraint set allows us to exploit the decomposition techniques widely used in LP
and mixed-integer LP, and to benefit from SDP relaxations of the binary variables. Cardinality restriction
(6.4) and binary requirement (6.6) are crucial for the presented CCCP. Numerous studies seek
to deal with these two hard constraints, and the methodologies can be classified into two
categories. The first group either reformulates the binary variable y_j ∈ {0, 1} as the
constraints y_j^2 − y_j = 0, y_j ∈ [0, 1], or reconstructs the cardinality constraint in a
non-convex SDP form, and applies semidefinite relaxation to the resulting non-convex programs.
Poljak et al. [123] explored
the equivalence of quadratic and semidefinite relaxations for 0-1 quadratic programming, and
applied their technique to different practical problems. Galli and Letchford [60] derived an
equivalent SDP relaxation that may generate tighter Lagrangian bound for the 0-1 QP in their
work. d'Aspremont et al. [47] employed an l2-norm to approximate the cardinality and explain
the robustness and sparsity of the solution. Chen et al. [33] suggested that l_p (p ∈ [0, 1])
norm regularization can achieve better performance for sparse portfolio management. However,
norm regularization cannot control the portfolio size exactly, and the associated solution
cannot guarantee the dualized rank-one constraint.
In contrast with the SDP reformulations, the other main category of methodologies focuses
on cut generation in the branch-and-bound tree or on heuristics designed to satisfy the
cardinality constraint (6.4). Bienstock [23] replaced constraint (6.4) with the tighter
constraint Σ_j (x_j/u_j) ≤ q and generated a valid cut for a branch-and-bound algorithm,
demonstrating the advantage of the method via simple numerical examples. Bertsimas and
Shioda [22] further investigated this alternative for quadratic programs by applying Lemke's
method to the sub-problems in the branch-and-bound tree, and compared its performance with
that of the CPLEX solver. In their work, the "=" sign can be relaxed to a "≤" sign because the
optimal solution is always attained on the surface of the convex hull of the feasible set.
However, such a sign relaxation is prohibited for our methodology in Section 6.3. Chang et al.
[31] presented three types of
heuristic algorithms to handle the cardinality constraint set (6.4) - (6.6), but no comparison
was made with the optimal solution. Cui et al. [46] applied a factor model to simplify the
parameters of the MIQCQP, and obtained a better SOCP relaxation bound for the Lagrangian
subproblems. However, the subproblems may still be hard to solve since they remain MIQCQPs.
Also, the accuracy of parameter generation via the factor model for the MIQCQP needs to be
examined through historical backtesting.
We adopt the ideas of decomposition and cut generation to develop our strategy for solving
the proposed CCCP. To the best of our knowledge, this strategy for the CCCP has not been
studied in the formal literature. The method is described as follows. The CCCP is first divided
into two independent but easier parts by dualizing the coupling constraint (6.5), and the parts
are then unified by adjusting the dual variables of the relaxed constraint in the dual space.
Both sub-problems can be efficiently solved: the first part remains an SOCP, while the second
part is a linear 0-1 knapsack problem. Meanwhile, a sub-gradient cut and fully regular cuts are
used to exclude the sub-optimal points that have been explored during previous iterations.
Tadonki and Vial [141] pointed out that the boundary constraint (6.5) makes the problem hard,
and handled an MIQP by relaxing this hard constraint to generate C-G cuts for 0/1 sub-problems.
We also focus on the boundary constraint, but our strategy is fundamentally different from
theirs. They fixed the variable x and then obtained the associated variable y, i.e. (x, y),
while we use the reverse solving direction, i.e. (y, x); also, they did not show the dual
updating process, whereas we do. Another main difference is that they used traditional C-G cuts
to speed up the solution of the sub-problem, while our cut, based on weak duality, is entirely
new for the sub-problem. We give the details of the proposed method in the next section.
6.3 Lagrangian Relaxation Scheme
Observe that constraint (6.5) connects the continuous variable x and the binary variable y, so
we relax constraint (6.5) and decompose the model (6.1) - (6.6) into a continuous part (an SOCP
in the variables (x, t)) and an integer part (in y only). Both parts are easy to solve, since
the SOCP is a convex problem and the integer part is a knapsack problem with a single easy
constraint.
We assign multipliers π_j^- ≥ 0 and π_j^+ ≥ 0 to the constraints lb_j y_j − x_j ≤ 0 and −ub_j y_j + x_j ≤ 0, respectively.
Then we construct the following Lagrangian term:

L(x, y, π^-, π^+) = Σ_{j=1}^n c_j x_j + Σ_{j=1}^n d_j y_j + Σ_{j=1}^n π_j^- (lb_j y_j − x_j) + Σ_{j=1}^n π_j^+ (−ub_j y_j + x_j)
                  = Σ_{j=1}^n (c_j + π_j^+ − π_j^-) x_j + Σ_{j=1}^n (d_j + π_j^- lb_j − π_j^+ ub_j) y_j
                  = (c + π^+ − π^-)^T x + (d + π^- lb − π^+ ub)^T y
                  = C_x^T x + C_y^T y
where C_x and C_y are the adjusted coefficient vectors associated with variables x and y. The
term L(x, y, π^-, π^+) can then be decomposed into two separate Lagrangian sub-problems. The
first part is a Second-Order Cone Program, LR_SOCP(x, π^-, π^+):

min (c + π^+ − π^-)^T x
s.t. (6.2), (6.3)        (6.7)
Model (6.7) can be efficiently solved by convex optimization methods since it is an SOCP; for
example, an interior point method is used to solve LR_SOCP(x, π^-, π^+) in our LR algorithm.
The second part is a linear integer program, LR_IP(y, π^-, π^+):

min (d + π^- lb − π^+ ub)^T y
s.t. (6.4), (6.6)        (6.8)
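Without the added cuts, sub-problem (6.8) with the single cardinality constraint e^T y = q is solved by selecting the q smallest adjusted coefficients; a minimal sketch with toy coefficients:

```python
# min C_y^T y  s.t.  e^T y = q, y in {0,1}^n:
# without extra cuts, simply pick the q smallest coefficients.
def solve_lr_ip(C_y, q):
    chosen = sorted(range(len(C_y)), key=lambda j: C_y[j])[:q]
    return [1 if j in chosen else 0 for j in range(len(C_y))]

C_y = [3.0, -1.5, 0.2, -2.0, 1.1]
y = solve_lr_ip(C_y, 2)
print(y, sum(c * v for c, v in zip(C_y, y)))  # [0, 1, 0, 1, 0] -3.5
```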
Model (6.8) is a 0-1 knapsack problem of relatively small size and thus can also be efficiently
solved by a commercial solver because of its linear structure. Moreover, different types of
inequalities are generated and added to sub-problem (6.8) to exclude infeasible and inefficient
feasible points of the original problem. The first one is a sub-gradient cut that derives from
weak duality:

L(x, y, π^-, π^+) = C_x^T x + C_y^T y ≥ L^{(ν−1)}

where L^{(ν−1)} is the Lagrangian objective value from the last iteration ν − 1. We partition
the index set I of y from iteration ν − 1 into I_0^{(ν−1)} = {j ∈ I | y_j^{(ν−1)} = 0} and
I_1^{(ν−1)} = {j ∈ I | y_j^{(ν−1)} = 1}. Since the right-hand side L^{(ν−1)} may not be
enforceable in practice, we replace it by L^{(ν−1)} + Δ, where Δ > 0 is used to strengthen the
lower bound. Then, after solving the LR_SOCP(x, π^-, π^+) part, a sub-gradient cut can be
generated at the current iteration:

C_y^T y ≥ L^{(ν−1)} + Δ − C_x^T x        (6.9)
Δ can be set as a small positive constant or varied iteratively. We observed in our computations
that a constant value improves the convergence time only slightly, if at all. In practice, we
set Δ as follows:

Δ = max( 0, ε [ min{ C_j^{(x)} + C_j^{(y)} | j ∈ I_0^{(ν−1)} } − max{ C_j^{(x)} + C_j^{(y)} | j ∈ I_1^{(ν−1)} } ] )

where ε is a scale that adjusts the lower bound enhancement Δ. The empirical value of ε
decreases as the iteration number increases, since strengthening becomes harder and harder as
the bounds converge to each other. For instance, ε = 1 when ν ≤ V/3 and ε = 10^{-4} when
ν > V/3, where V is the designed iteration limit.
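A minimal sketch of the Δ computation (toy coefficients; C stands for the componentwise sum C_j^{(x)} + C_j^{(y)} and eps for the scale ε):

```python
# Bound-strengthening offset: Delta = max(0, eps * (min over I0 - max over I1)).
def delta(C, I0, I1, eps):
    return max(0.0, eps * (min(C[j] for j in I0) - max(C[j] for j in I1)))

C = [1.2, -0.4, 0.9, -1.1]
print(delta(C, I0=[0, 2], I1=[1, 3], eps=1.0))  # min{1.2, 0.9} - max{-0.4, -1.1} ≈ 1.3
```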
While inequality (6.9) is used to strengthen the Lagrangian lower bound, the following
inequalities are applied to swap nodes and to test the infeasibility of the original model.
First, at the current iteration, the following inequality must be satisfied if a better solution
exists:

Σ_{j ∈ I_0^{(ν−1)}} y_j ≥ 1        (6.10)

Inequality (6.10) indicates that at least one node in I_0^{(ν−1)} and one node in I_1^{(ν−1)}
are exchanged with each other. If the "=" sign in (6.4) were relaxed to a "≤" sign, we could
not guarantee that pairs of nodes are always switched between I_0^{(ν−1)} and I_1^{(ν−1)};
therefore, we prohibit this sign relaxation in our method. Inequalities (6.9) and (6.10) exclude
the sub-optimal points that have been explored during the whole iteration process.
Second, given the y fixed by model (6.8) at the current iteration ν, we solve the remaining
part of the original model (6.1) - (6.6) to produce a feasible solution:

min c^T x        (6.11)
s.t. ‖A_i x + b_i‖ ≤ t_i,  ∀i = 1, ..., L        (6.12)
     Ex + Gt ≤ f        (6.13)
     x_j ∈ [l_j y_j, u_j y_j],  ∀j = 1, ..., n        (6.14)

If the resulting SOCP (6.11) - (6.14) is infeasible, we add the following inequality to
sub-problem (6.8) and re-solve the above SOCP to obtain a feasible solution of the original
problem:

Σ_{j ∈ I_1^{(ν)}} y_j ≤ q − 1        (6.15)
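A small sketch of how cut (6.15) is formed from the current support (the y vector here is hypothetical):

```python
# If the restricted SOCP (6.11)-(6.14) is infeasible for the current support
# I1 = {j : y_j = 1}, cut (6.15) forbids reselecting that full support by
# requiring sum_{j in I1} y_j <= q - 1.
def infeasibility_cut(y):
    I1 = [j for j, v in enumerate(y) if v == 1]
    return I1, len(I1) - 1   # index set and right-hand side q - 1

print(infeasibility_cut([1, 0, 1, 1, 0]))  # ([0, 2, 3], 2)
```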
Observations show that these cuts can speed up the whole LR method significantly. A lower
bound on the original problem (6.1) - (6.6) can be obtained by solving the associated dual
problem of L(x, y, π^-, π^+) at the current iteration:

max_{π^-, π^+ ≥ 0} min_{x, y} L(x, y, π^-, π^+)

The dual problem returns either an optimal solution or a maximal lower bound for model (6.1) -
(6.6). We then design the following algorithm for solving the original CCCP:
Lagrangian Relaxation algorithm

Step 0 (Initialization):
    ν ← 0, LBD ← −∞, UBD ← +∞
    π_i^{−,ν} ← 0, π_i^{+,ν} ← 1, ∀i ∈ N

Step 1 (Dual Decomposition):
    C_x^{(ν)} ← c + π^+ − π^−
    Solve LR_SOCP(x, π^−, π^+), i.e. model (6.7), for the given C_x^{(ν)}
    C_y^{(ν)} ← d + π^− lb − π^+ ub
    Add inequalities (6.9) and (6.10) to model (6.8); add inequality (6.15) if necessary
    Solve LR_IP(y, π^−, π^+) with the added inequalities for the given C_y^{(ν)}
    Update LBD ← max(LBD, L(x^{(ν)}, y^{(ν)}, π^{−,ν}, π^{+,ν}))
    If (x^{(ν)}, y^{(ν)}) is feasible for constraint (6.5):
        Update UBD ← min(UBD, L(x^{(ν)}, y^{(ν)}, π^{−,ν}, π^{+,ν})). STOP.
    Else:
        Find a feasible solution x_adj^{(ν)} of model (6.7) under the y_adj^{(ν)} from
        model (6.8), and calculate UBD_adj^{(ν)} for model (6.1) - (6.6).
        Update UBD ← min(UBD, UBD_adj^{(ν)}). GO TO Step 2.

Step 2 (Lagrangian multiplier update):
    Build the Lagrangian dual problem max_{π^−, π^+ ≥ 0} L(x^{(ν)}, y^{(ν)}, π^−, π^+)
    (1) π_i^{−,ν+1} ← max(0, π_i^{−,ν} + α t^{(ν)} (lb_i y_i^{(ν)} − x_i^{(ν)})), ∀i ∈ N
    (2) π_i^{+,ν+1} ← max(0, π_i^{+,ν} + α t^{(ν)} (−ub_i y_i^{(ν)} + x_i^{(ν)})), ∀i ∈ N
        where t^{(ν)} = (UBD − LBD) / ‖(lb y^{(ν)} − x^{(ν)}; −ub y^{(ν)} + x^{(ν)})‖^2
    (3) Solve LR_IP(y, π^{−,ν+1}, π^{+,ν+1}) and LR_SOCP(x, π^{−,ν+1}, π^{+,ν+1})
        While L(x^{(ν)}, y^{(ν)}, π^{−,ν+1}, π^{+,ν+1}) < L(x^{(ν)}, y^{(ν)}, π^{−,ν}, π^{+,ν}):
            α ← 0.5α, repeat (1) - (3)

Step 3 (Stopping criterion):
    Calculate Gap^{(ν)} = (UBD − LBD) / |UBD|
    If Gap^{(ν)} > ε or ν < V: ν ← ν + 1, GO TO Step 1.
In practice we set t^{(ν)} = (UBD − LBD) / ‖(lb y^{(ν)} − x^{(ν)}; −ub y^{(ν)} + x^{(ν)})‖^2;
if LR(x^{(ν)}, ω^{(ν+1)}) ≤ LR(x^{(ν)}, ω^{(ν)}), where ω collects the multipliers (π^-, π^+),
then α ← 0.5α and the lower bound is recalculated until LR(x^{(ν)}, ω^{(ν+1)}) > LR(x^{(ν)}, ω^{(ν)}).
Numerical experiments for the models and the designed method are shown in the next section.
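Step 2 of the algorithm can be sketched as follows (illustrative iterate values; the projection max(0, ·) keeps the multipliers nonnegative, and the step size follows the t^{(ν)} formula above):

```python
# Projected-subgradient multiplier update of Step 2 (plain-Python sketch).
def update_multipliers(pi_m, pi_p, x, y, lb, ub, UBD, LBD, alpha):
    g_m = [l * yi - xi for l, yi, xi in zip(lb, y, x)]    # lb_i y_i - x_i
    g_p = [-u * yi + xi for u, yi, xi in zip(ub, y, x)]   # -ub_i y_i + x_i
    t = (UBD - LBD) / sum(g * g for g in g_m + g_p)       # step size t^(v)
    pi_m = [max(0.0, p + alpha * t * g) for p, g in zip(pi_m, g_m)]
    pi_p = [max(0.0, p + alpha * t * g) for p, g in zip(pi_p, g_p)]
    return pi_m, pi_p

pi_m, pi_p = update_multipliers(
    pi_m=[0.0, 0.0, 0.0], pi_p=[1.0, 1.0, 1.0],
    x=[0.6, 0.0, 0.4], y=[1, 0, 1],
    lb=[0.05, 0.05, 0.05], ub=[0.5, 0.5, 0.5],
    UBD=-10.0, LBD=-12.0, alpha=1.0)
print(pi_m, pi_p)
```

The two violated lower-bound constraints drive their multipliers to zero, while the violated upper bound on the first component pushes its multiplier up.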
6.4 Robust Factor Model for Index Tracking
6.4.1 Nominal Index Tracking Model
In this section we develop the nominal enhanced index tracking model. Let μ denote the vector
of expected asset returns and Σ the covariance matrix; x is the vector of portfolio weights and
x_BM is the vector of weights of a benchmark index. Then the difference in expected returns
(the excess return) between the tracking portfolio and the benchmark is μ^T (x − x_BM), and the
standard deviation of the excess returns (the tracking error) is √((x − x_BM)^T Σ (x − x_BM)).
Here we adopt the index tracking model from [40]. In this formulation, a portfolio x is sought
that maximizes expected return subject to limits on portfolio risk and tracking error. The model
is given as:
max  µᵀx                           (6.16)
s.t. ‖Σ^{1/2} x‖ ≤ σ               (6.17)
     ‖Σ^{1/2} (x − x_BM)‖ ≤ TE     (6.18)
     eᵀx = 1                       (6.19)
     x ≥ 0                         (6.20)
where e ∈ Rⁿ is the vector of all ones and ‖·‖ denotes the Euclidean norm of a vector. σ denotes the tracking portfolio risk limit and TE is the tracking error limit. For example, TE = 5% means that a tracking portfolio may not have a standard deviation of excess returns of more than 5%. A cardinality constraint can be added to model (6.16) - (6.20) to control the portfolio size exactly:
max  µᵀx                             (6.21)
s.t. ‖Σ^{1/2} x‖ ≤ σ                 (6.22)
     ‖Σ^{1/2} (x − x_BM)‖ ≤ TE       (6.23)
     eᵀx = 1                         (6.24)
     eᵀy = q                         (6.25)
     lbᵢyᵢ ≤ xᵢ ≤ ubᵢyᵢ, ∀i          (6.26)
     x ≥ 0, y ∈ {0, 1}ⁿ              (6.27)
where lb, ub are the lower and upper bounds on the tracking portfolio weights and q is the portfolio size. Model (6.21) - (6.27) can easily be seen to be a Mixed-Integer Second-Order Cone Program (MISOCP): the risk and tracking error constraints are second-order conic, all other constraints are linear, the objective function is linear, and y carries the binary integer restrictions.
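The constraint structure is easy to check mechanically. The sketch below, with illustrative names and toy data of our own choosing, verifies a candidate pair (x, y) against constraints (6.22) - (6.27); it is only a feasibility check, not the solution method:

```python
import math

def feasible(x, y, Sigma, x_bm, sigma, TE, q, lb, ub, tol=1e-9):
    """Check a candidate (x, y) against constraints (6.22)-(6.27)
    of the cardinality-constrained tracking model (names illustrative)."""
    n = len(x)
    def quad(z):                      # z' Sigma z
        return sum(z[i] * Sigma[i][j] * z[j] for i in range(n) for j in range(n))
    risk_ok = math.sqrt(quad(x)) <= sigma + tol                     # (6.22)
    d = [x[i] - x_bm[i] for i in range(n)]
    te_ok = math.sqrt(quad(d)) <= TE + tol                          # (6.23)
    budget_ok = abs(sum(x) - 1.0) <= tol                            # (6.24)
    card_ok = sum(y) == q                                           # (6.25)
    bounds_ok = all(lb[i] * y[i] - tol <= x[i] <= ub[i] * y[i] + tol
                    for i in range(n))                              # (6.26)-(6.27)
    return risk_ok and te_ok and budget_ok and card_ok and bounds_ok

# Toy data (ours): 3 assets, diagonal covariance, equal-weight benchmark.
Sigma = [[0.0004, 0.0, 0.0], [0.0, 0.0004, 0.0], [0.0, 0.0, 0.0004]]
x_bm = [1.0 / 3] * 3
ok = feasible([0.5, 0.5, 0.0], [1, 1, 0], Sigma, x_bm,
              sigma=0.05, TE=0.02, q=2, lb=[0.05] * 3, ub=[0.7] * 3)
```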
Next, we illustrate the nominal model by solving several instances. All instances were solved to optimality using the mixed-integer solver in Gurobi, which is based on branch-and-bound and closes the gap between the lower and upper bounds to zero [71]. In particular, the effects of the tracking error constraint (6.23) and the risk control constraint (6.22) are investigated by repeatedly solving the model with increasing values of the parameter TE under different σ values. We used daily returns from June 30, 2005 to December 31, 2007 to generate the parameters (µ, Σ) for the model. We first fixed σ at a large value, e.g. σ = 80 ∗ max(diag(Σ)), and increased TE for a given portfolio size; we then changed σ to a smaller value, e.g. σ = 0.004 ∗ max(diag(Σ)), and repeated the same computational process over the TE values. The parameters (µ, Σ) were estimated through linear regression; specifically, a three-factor model was applied for our estimation (see details in Section 6.5.1). We computed instances over different sizes q that represent low, medium and high density portfolios. For example, we chose the portfolio size from q ∈ [5, 65], as we found that tracking portfolios are very close to the index when q exceeds 75, but holding more assets generates higher transaction costs. We compare the portfolio return, variance and Sharpe ratio for different tracking portfolio sizes q in Figures (6.1) - (6.3).
Figure 6.1: Portfolio return vs TE with different q under different σ (SP100)
Figure (6.1) shows the portfolio return over TE and q under different σ. Moderate tracking portfolio sizes (q = 15, 35) have higher returns than larger or smaller sizes (q = 65 or 5). When q = 65 the effects of diversification become stronger, reducing return. From the figure we see that the portfolio return increases sublinearly, which means the marginal return decreases with respect to the TE value. However, the portfolio returns are generally better than the return of the benchmark used, i.e. the S&P100 (0.35‰). As the parameter σ decreased to 0.004 ∗ max(diag(Σ)), the risk control constraint (6.22) dominates the tracking error constraint (6.23), and therefore the portfolio return cannot be improved by changing the TE value beyond 0.8 × 10⁻⁴.
Figure 6.2: Portfolio variance vs TE with different q under different σ (SP100)
The portfolio variance increases approximately linearly with respect to TE for moderate portfolio sizes; see the left side of Figure (6.2). The tracking portfolio of size q = 65 has lower variance due to the diversification effect of holding more assets. The results suggest that if larger TE values are allowed, the portfolio return can be improved; however, the portfolio variance may increase faster than the return improves. The variance of the S&P100 is the lowest of all portfolios, most likely due to the diversification effect of holding more assets. In particular, the S&P100 variance is 0.06‰ and the associated standard deviation is 0.24‰. Again we note that the parameter σ can also significantly affect the portfolio variance; see the right side of Figure (6.2). The portfolio variance is invariant to TE beyond 0.6 × 10⁻⁴ if σ is set too small.
Figure 6.3: Portfolio Sharpe ratio vs TE with different q under different σ (SP100)
We combine the portfolio return in Figure (6.1) and the variance in Figure (6.2) to obtain the portfolio Sharpe ratio in Figure (6.3). From Figure (6.3), the portfolio Sharpe ratio increases with respect to increasing TE; however, the marginal portfolio Sharpe ratio decreases since the marginal variance dominates the marginal return. Thus, by controlling TE one can improve portfolio performance, but increasing TE, as shown above, can lead to increased portfolio volatility in Figure (6.2). One must therefore be careful about setting TE and σ too high if one cares about risk. The results also suggest that one way to help attain enhanced indexing is to not set q too high. We see that the Sharpe ratio of the benchmark index S&P 100 (-0.0047 on the left side and -0.0049 on the right side) is worse than the Sharpe ratios of the tracking portfolios. From both sub-figures, we see that the portfolio Sharpe ratios can be significantly affected by both model parameters TE and σ.
In the next section we develop the robust counterpart to model (6.21) - (6.27), which protects against errors in parameter estimation.
6.4.2 Robust Multi-Factor Model for Index Tracking
We follow Goldfarb and Iyengar [66] in employing a robust factor modelling approach. Suppose the return vector r is given by the model:

r = µ + Vᵀf + ε

where µ ∈ Rⁿ is the vector of mean returns, f ∼ N(0, F) ∈ Rᵐ is the vector of returns of the factors that drive the market, V ∈ R^{m×n} is the matrix of factor loadings of the n assets, and ε ∼ N(0, D) is the vector of residual returns, where D = diag(d), d = [dᵢ], i = 1, …, n. In practice we may need F ≻ 0. The assumptions for the factor model include:

• residual returns εᵢ and εⱼ are independent, i.e. cov(εᵢ, εⱼ) = 0 for i ≠ j;
• residual return εᵢ and factor return fⱼ are independent, i.e. cov(εᵢ, fⱼ) = 0.
Then E(r) = µ, σᵢⱼ = VᵢᵀFVⱼ for i ≠ j, σᵢᵢ = σᵢ² = VᵢᵀFVᵢ + dᵢ, σᵢ = √(VᵢᵀFVᵢ + dᵢ), or, written in matrix form, Σ = VᵀFV + D. Given a weight vector x, the risk of a portfolio, i.e. xᵀΣx, can be split into a combination of a systematic risk, xᵀVᵀFVx, and an individual risk, xᵀDx, within a portfolio [92]. Building uncertainty sets around Σ is then equivalent to building uncertainty sets for the terms VᵀFV and D separately. We assume that the market is stable, i.e. F is constant, so that the uncertainty in the parameters (µ, Σ) comes from the uncertainty in the parameters (µ, V, D). We follow [66] and design the uncertainty sets for the parameters (µ, V, D) separately as follows:
• The uncertainty sets S_d and S_m for the parameters D and µ are defined as intervals:

S_d = {D : D = diag(d), dᵢ ∈ [d̲ᵢ, d̄ᵢ], i = 1, …, n}    (6.28)
S_m = {µ : µ = µ₀ + ξ, ξᵢ ∈ [γ̲ᵢ, γ̄ᵢ], i = 1, …, n}    (6.29)

• The uncertainty set for the parameter V is an ellipsoid:

S_v = {V : V = V₀ + W, ‖Wᵢ‖_g ≤ ρᵢ, i = 1, …, n}    (6.30)

where Wᵢ is the iᵗʰ column of the strength matrix W around V₀ and ‖Wᵢ‖_g = √(WᵢᵀGWᵢ) is an elliptic norm; G ≻ 0 defines a coordinate system whose axes need not be perpendicular. We can always generate a matrix G ≻ 0 to maintain the strict convexity of the problem.
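The factor decomposition Σ = VᵀFV + D and the resulting split of portfolio risk into systematic and individual parts can be verified on a toy example; the data below are illustrative, not from the thesis:

```python
def matmul(A, B):
    """Plain-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def quad(M, x):
    """Quadratic form x' M x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

# Toy data (ours): m = 2 factors, n = 3 assets.
V = [[1.0, 0.8, 0.5],                  # factor loadings (m x n)
     [0.2, 0.4, 0.9]]
F = [[0.04, 0.01],                     # factor covariance (positive definite)
     [0.01, 0.02]]
d = [0.05, 0.03, 0.06]                 # idiosyncratic variances
D = [[d[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

VtFV = matmul(transpose(V), matmul(F, V))
Sigma = [[VtFV[i][j] + D[i][j] for j in range(3)] for i in range(3)]

x = [0.5, 0.3, 0.2]
systematic = quad(VtFV, x)             # x' V'FV x
individual = quad(D, x)                # x' D x
total = quad(Sigma, x)                 # x' Sigma x = systematic + individual
```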
Then the robust counterpart of objective (6.21) is

max_x min_{µ∈S_m} µᵀx = max_x min_{γ̲ ≤ ξ ≤ γ̄} (µ₀ + ξ)ᵀx = max_x (µ₀ + γ̲)ᵀx

since x ≥ 0, so the worst case takes each ξᵢ at its lower endpoint γ̲ᵢ.
For constraint (6.22), which measures the portfolio risk:

‖Σ^{1/2}x‖² ≤ σ² ⟺ xᵀΣx ≤ σ² ⟺ xᵀ(VᵀFV + D)x ≤ σ² ⟺ xᵀVᵀFVx + xᵀDx ≤ σ²

Then the robust counterpart of the above constraint is

max_{V∈S_v, D∈S_d} xᵀVᵀFVx + xᵀDx ≤ σ² ⟺ max_{V∈S_v} xᵀVᵀFVx + max_{D∈S_d} xᵀDx ≤ σ²

⟺ { max_{V∈S_v} xᵀVᵀFVx ≤ v,  max_{D∈S_d} xᵀDx ≤ δ,  v + δ ≤ σ² }

⟺ { max_{V∈S_v} xᵀVᵀFVx ≤ v,  ‖(2D^{1/2}x, 1 − δ)‖ ≤ 1 + δ,  v + δ ≤ σ² }

We use the sum of v and δ to represent the total risk since the terms xᵀVᵀFVx and xᵀDx are independent. For the robust term max_{V∈S_v} xᵀVᵀFVx ≤ v, Goldfarb and Iyengar [66] show that it can be converted into a collection of linear and second-order conic constraints through Lemma 1 below.
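The second-order cone rewriting used above follows the standard pattern u² ≤ ab ⟺ ‖(2u, a − b)‖ ≤ a + b. A quick numerical sanity check of the equivalence xᵀDx ≤ δ ⟺ ‖(2D^{1/2}x, 1 − δ)‖ ≤ 1 + δ, with illustrative data and helper names of our own:

```python
import math

def quad_bound_holds(x, d, delta):
    """x' D x <= delta with D = diag(d)."""
    return sum(di * xi * xi for di, xi in zip(d, x)) <= delta

def soc_form_holds(x, d, delta):
    """|| (2 D^{1/2} x, 1 - delta) || <= 1 + delta  (second-order cone form)."""
    Dhalf_x = [math.sqrt(di) * xi for di, xi in zip(d, x)]
    lhs = math.sqrt(sum(4.0 * v * v for v in Dhalf_x) + (1.0 - delta) ** 2)
    return lhs <= 1.0 + delta
```

Away from the boundary the two tests agree, which is the rotated-cone identity behind (6.32) and (6.34) as well.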
Lemma 1. Let r, v > 0, y₀, y ∈ Rᵐ, and let F, G ∈ R^{m×m} be positive definite matrices. Then the constraint

max_{y : ‖y‖_g ≤ r} ‖y₀ + y‖²_f ≤ v    (6.31)

is equivalent to either of the following:

(i) there exist τ, σ > 0 and t ∈ Rᵐ₊ that satisfy

v > τ + eᵀt
σ ≤ 1 / λ_max(H)
r² ≤ στ
wᵢ² ≤ (1 − σλᵢ) tᵢ, i = 1, …, m

where QΛQᵀ is the spectral decomposition of H = G^{−1/2}FG^{−1/2}, Λ = diag(λᵢ), and w = QᵀH^{1/2}G^{1/2}y₀;

(ii) there exist τ > 0 and s ∈ Rᵐ₊ that satisfy

r² ≤ τ(v − eᵀs)
uᵢ² ≤ (1 − τθᵢ) sᵢ, i = 1, …, m
τ ≤ 1 / λ_max(K)
where PΘPᵀ is the spectral decomposition of K = F^{1/2}G^{−1}F^{1/2}, Θ = diag(θᵢ), and u = PᵀF^{1/2}y₀.

Lemma 1 is proved using the S-procedure, which has broad application in engineering science [26]; for details of the proof see [66]. Therefore, using part (ii) of Lemma 1, the constraint max_{V∈S_v} xᵀVᵀFVx ≤ v can be transformed into the following convex constraint set:

max_{V∈S_v} xᵀVᵀFVx ≤ v ⟺ max_{V∈S_v} ‖Vx‖²_f ≤ v ⟺

u = PᵀF^{1/2}V₀x
‖(2ρᵀx, τ − v + eᵀs)‖ ≤ τ + v − eᵀs
‖(2uᵢ, v − τθᵢ − sᵢ)‖ ≤ v − τθᵢ + sᵢ, ∀i = 1, …, m
v − τλ_max(K) ≥ 0
τ ≥ 0    (6.32)

where PΘPᵀ is the spectral decomposition of K = F^{1/2}G^{−1}F^{1/2} and Θ = diag(θ). The function ‖x‖_f = √(xᵀFx) denotes a norm on Rᵐ. Note that the radius r = ρᵀ|x| = ρᵀx in the first norm constraint in (6.32) because short selling is prohibited, i.e. x ≥ 0.
For constraint (6.23), which measures the tracking error:

‖Σ^{1/2}(x − x_BM)‖ ≤ TE ⟺ (x − x_BM)ᵀΣ(x − x_BM) ≤ TE² ⟺ { zᵀΣz ≤ TE², z = x − x_BM }    (6.33)

Analogously, the robust counterpart of zᵀΣz ≤ TE² in (6.33) can be obtained by using Lemma 1. The associated convex constraints are constructed as follows:

max_{V∈S_v, D∈S_d} zᵀVᵀFVz + zᵀDz ≤ TE² ⟺ max_{V∈S_v} zᵀVᵀFVz + max_{D∈S_d} zᵀDz ≤ TE²
⟺ { max_{V∈S_v} zᵀVᵀFVz ≤ l,  max_{D∈S_d} zᵀDz ≤ ζ,  l + ζ ≤ TE² }

⟺ { max_{V∈S_v} ‖Vz‖²_f ≤ l,  ‖(2D^{1/2}z, 1 − ζ)‖ ≤ 1 + ζ,  l + ζ ≤ TE² }

⟺

w = PᵀF^{1/2}V₀z
‖(2ρᵀ|z|, τ − l + eᵀs)‖ ≤ τ + l − eᵀs
‖(2wᵢ, l − τθᵢ − sᵢ)‖ ≤ l − τθᵢ + sᵢ, ∀i = 1, …, m
l − τλ_max(K) ≥ 0
τ ≥ 0
‖(2D^{1/2}z, 1 − ζ)‖ ≤ 1 + ζ
l + ζ ≤ TE²    (6.34)
The absolute value in the radius r = ρᵀ|z| = Σᵢ₌₁ⁿ ρᵢ|zᵢ| in the first norm constraint in (6.34) must be handled explicitly, since the variable z can be negative. We linearize |zᵢ| as follows:

|zᵢ| = zᵢ⁺ + zᵢ⁻
zᵢ = zᵢ⁺ − zᵢ⁻ = xᵢ − x_BM,i
zᵢ⁺ ≥ 0, zᵢ⁻ ≥ 0
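This standard positive/negative split can be computed directly; a minimal sketch (the helper name is ours):

```python
def split_signed(z):
    """Linearization z_i = z_i+ - z_i- with |z_i| = z_i+ + z_i-
    (illustrative helper, not thesis code)."""
    z_plus = [max(zi, 0.0) for zi in z]
    z_minus = [max(-zi, 0.0) for zi in z]
    return z_plus, z_minus
```

At an optimum at most one of zᵢ⁺, zᵢ⁻ is positive, which is exactly what this construction returns.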
Finally, the robust counterpart using the factor model for problem (6.21) - (6.27) can be formulated as follows:

max  (µ₀ + γ̲)ᵀx                                      (6.35)
s.t. u = PᵀF^{1/2}V₀x                                 (6.36)
     w = PᵀF^{1/2}V₀(z⁺ − z⁻)                         (6.37)
     z⁺ − z⁻ = x − x_BM                               (6.38)
     ‖(2ρᵀx, τ − v + eᵀs)‖ ≤ τ + v − eᵀs              (6.39)
     ‖(2uᵢ, v − τθᵢ − sᵢ)‖ ≤ v − τθᵢ + sᵢ, ∀i = 1, …, m   (6.40)
     ‖(2ρᵀ(z⁺ + z⁻), τ − l + eᵀs)‖ ≤ τ + l − eᵀs      (6.41)
     ‖(2wᵢ, l − τθᵢ − sᵢ)‖ ≤ l − τθᵢ + sᵢ, ∀i = 1, …, m   (6.42)
     ‖(2D^{1/2}x, 1 − δ)‖ ≤ 1 + δ                     (6.43)
     v + δ ≤ σ²                                       (6.44)
     ‖(2D^{1/2}(z⁺ − z⁻), 1 − ζ)‖ ≤ 1 + ζ             (6.45)
     l + ζ ≤ TE²                                      (6.46)
     v − τλ_max(K) ≥ 0                                (6.47)
     l − τλ_max(K) ≥ 0                                (6.48)
     eᵀx = 1                                          (6.49)
     eᵀy = q                                          (6.50)
     lbᵢyᵢ ≤ xᵢ ≤ ubᵢyᵢ, ∀i = 1, …, n                 (6.51)
     x ≥ 0, τ ≥ 0, z⁺ ≥ 0, z⁻ ≥ 0                     (6.52)
     y ∈ {0, 1}ⁿ                                      (6.53)
The dimensions of the different types of variables are x, z⁺, z⁻ ∈ Rⁿ; v, δ, l, ζ, τ ∈ R; u, w, s ∈ Rᵐ; y ∈ Bⁿ. Therefore, model (6.35) - (6.53) keeps the same CCCP structure as the nominal tracking model but includes more variables and cone constraints. In practice we apply the Fama-French three-factor model, an extension of the CAPM, to calculate the numerical values of the parameters; for details see [66, 54].
We then test both the nominal model (6.21) - (6.27) and the robust counterpart (6.35) - (6.53) with the commercial solver Gurobi on an AMD dual-core laptop with 2 GB of RAM. One interesting observation is that Gurobi takes a much longer running time on model (6.35) - (6.53) than on model (6.21) - (6.27) in many instances. For example, for a case with N = 500 and q = 70, a 10% gap between the lower and upper bounds remains after 1000 seconds of running time for model (6.35) - (6.53), while Gurobi returns the optimal solution, i.e. a zero gap, within 10 seconds for model (6.21) - (6.27); such instances are common in our testing. The factored robust procedure for the index tracking problem simplifies parameter estimation but may increase the solving time, since more conic constraints are included in the model (2m + 2 instead of 2), which makes the problem considerably harder. This disadvantage motivated us to apply the Lagrangian relaxation method we designed, in the next section, to speed up the solution process while maintaining solution quality.
6.5 Computational Experiments
6.5.1 Testing the Three-Factor and Single-Factor models
We use as the basis of the robust factor model the Fama-French three-factor model [54], which can be seen as an extension of Sharpe's one-factor CAPM. The Fama-French three-factor model is based on the observation that small capitalization stocks and value stocks (i.e. stocks
model is based on the observation that small capitalization stocks and value stocks (i.e. stocks
with high book to price ratio) tend to outperform the market as a whole. In the model, three
risk factors reflect the sensitivities of each stock to the market excess return (market factor),
the excess of value stocks over growth stocks (book-to-market factor), and the excess of small
cap stocks over large cap stocks (size factor). The one and three-factor models are presented
as follows:
rit − rft = αi + βiM (rMt − rft) + εit (6.54)
rit − rft = αi + βiM (rMt − rft) + βisSMBt + βihHMLt + εit (6.55)
where r_it, r_ft, r_Mt denote the return of asset i, the risk-free return, and the market return at time t, respectively; r_it − r_ft and r_Mt − r_ft denote the excess return of asset i and the excess return of the market M over the risk-free rate at time t, respectively; SMB_t denotes the excess return of small-capitalization stocks over large-capitalization stocks at time t; HML_t denotes the excess return of value stocks over growth stocks at time t; ε_it denotes the residual term of asset i at time t. The regression coefficients are:

α_i = the constant excess return (intercept);
β_iM = the sensitivity of stock i to movements of the market;
β_is = the sensitivity of stock i to movements in small stocks;
β_ih = the sensitivity of stock i to movements in value stocks.
To fit the observations rᵢ − r_f as well as possible, one approach is to minimize ‖εᵢ‖ for stock i. For the linear regression model min ‖Ax − b‖, we have the analytic solution x = (AᵀA)⁻¹Aᵀb, where for the single-factor model A = [1, r_M − r_f] and b = [rᵢ − r_f], and for the three-factor model A = [1, r_M − r_f, SMB, HML] and b = [rᵢ − r_f]. After solving for the regression coefficients of (6.54) and (6.55), we calculate

R²ᵢ = 1 − SS_residual,i / SS_total,i = 1 − ‖εᵢ‖²₂ / ((T − 1) var(rᵢ − r_f)),

which represents the percentage of the variance in the excess return of stock i explained by the regression, to compare how well the estimated parameters (regression coefficients) fit the observations.
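The normal-equation fit and the R² computation can be sketched for the single-factor case; this is a toy illustration with our own function name, not the code used in the thesis:

```python
def ols_r2(factor, excess):
    """Fit excess = alpha + beta * factor via the normal equations
    (A'A) x = A'b with A = [1, factor]; return (alpha, beta, R^2)."""
    T = len(factor)
    s1, sf = float(T), sum(factor)
    sff = sum(f * f for f in factor)
    sb = sum(excess)
    sfb = sum(f * b for f, b in zip(factor, excess))
    det = s1 * sff - sf * sf                    # det(A'A) for the 2x2 system
    alpha = (sff * sb - sf * sfb) / det
    beta = (s1 * sfb - sf * sb) / det
    resid = [b - alpha - beta * f for f, b in zip(factor, excess)]
    mean_b = sb / T
    ss_total = sum((b - mean_b) ** 2 for b in excess)   # = (T-1) * var(excess)
    ss_resid = sum(e * e for e in resid)                # = ||eps||_2^2
    return alpha, beta, 1.0 - ss_resid / ss_total
```

On data that lies exactly on a line, the fit recovers the coefficients and R² = 1.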
We collected data on 5810 stocks trading on the US NYSE and NASDAQ exchanges to see which factor model is more suitable. The daily stock prices were downloaded from a Bloomberg workstation, and the risk factors were downloaded from the data library on Kenneth French's web page [2]. The S&P500 index was used as the market and the 1-month T-Bill rate as the risk-free asset in the Fama-French three-factor model. Stocks without adequate price information were deleted, and linear regressions were then implemented for the above models. Different time periods of historical data were used for testing, and the R² values are listed in Table 6.1:
Table 6.1: R² values for the regression models

         2007.01.02 - 2010.12.31   2007.01.02 - 2011.12.31   2007.01.02 - 2013.12.31
R²       # single    # 3-factor    # single    # 3-factor    # single    # 3-factor
≥ 90%    1           3             1           4             1           3
≥ 80%    9           12            10          11            10          10
≥ 70%    30          44            34          50            27          38
≥ 60%    175         296           191         341           144         259
≥ 50%    611         853           688         935           558         785
≥ 40%    1260        1560          1392        1647          1204        1477
≥ 30%    2006        2229          2108        2312          1955        2183
≥ 20%    2624        2723          2677        2781          2595        2712
≥ 10%    3078        3177          3135        3251          3084        3187
≥ 0%     4399        4399          4441        4441          4478        4478
From Table 6.1, we can see that the three-factor model explains more of the variability of excess returns than the single-factor model, as expected; the R² values obtained for the three-factor model are higher than those of the single-factor model for specific stocks i over the different time periods. In general, the R² value from a regression with the three-factor model is 5% better on average than the value from the single-factor model, and in the best case the three-factor model is 25% better. Therefore we applied the three-factor model (6.55) as the basis to form the uncertainty sets of expected return and covariance for the factor loadings in (6.35) - (6.53), as in [66]. The details of this construction are in Appendix (C. 1). We also note that any other notable multi-factor model that better captures risk can also be applied in our proposed robust factor index tracking model. For example, D'ecclesia and Zenios [48] showed that 98% of the variability of returns in the Italian bond market can be explained by identifying multiple risk factors. Burmeister et al. [28] presented a macroeconomic factor model which includes five risk terms to interpret historical stock returns. We did not test for stationarity of returns and assumed that the market was stable, i.e. that the factor covariance F is constant; non-stationarity of variance could affect the estimation of the betas (factor loadings). However, our approach was to let this be handled by the robust optimization over the different factor loading matrices captured by the uncertainty set S_v.
6.5.2 Index Tracking using the S&P100 Index
In this section, we illustrate the factor-based robust enhanced index tracking model by tracking the S&P100 index. Comparisons of the robust model versus the nominal model illustrate the benefits of robustness. First, in-sample data about the components of the S&P100 are collected to construct the nominal covariance matrix. We collected the historical price information of all components of the S&P100 and calculated the daily returns r_it = (P_{i,t} − P_{i,t−1}) / P_{i,t−1}, where P_{i,t}, P_{i,t−1} are the adjusted closing prices at times t and t − 1. The daily returns were then used to calculate the mean returns of the assets and the covariance matrix of the asset returns:

µᵢ = (1/T) Σ_{t=1}^{T} r_it,    cov_ij = (1/T) Σ_{t=1}^{T} (r_it − µᵢ)(r_jt − µⱼ)
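These estimates are straightforward to compute; a minimal sketch with illustrative helper names:

```python
def daily_returns(prices):
    """r_{i,t} = (P_{i,t} - P_{i,t-1}) / P_{i,t-1} for one price series."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def mean_and_cov(returns):
    """Mean vector and (1/T-normalized) covariance matrix from a list of
    per-asset return series of equal length T."""
    T = len(returns[0])
    mu = [sum(r) / T for r in returns]
    cov = [[sum((ri[t] - mi) * (rj[t] - mj) for t in range(T)) / T
            for rj, mj in zip(returns, mu)]
           for ri, mi in zip(returns, mu)]
    return mu, cov
```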
Daily prices between June 30, 2005 and December 31, 2007 (630 samples) were collected and used as in-sample data, and daily prices at each end of month between January 1, 2008 and December 31, 2008 were used to build the out-of-sample data for the nominal and robust models. Some stocks in the S&P100 index can be replaced by stocks outside the index if they no longer satisfy the selection criteria of the S&P100 in the designated time period, so we retrieved the stocks that were moved out during the time periods used above and obtained the associated price information. Such replacement was rare and the components of the S&P100 were stable; we checked the change history of the composition of the S&P100 and there was no replacement between June 30, 2005 and December 31, 2008, the period we collected data for. If there was no adequate data from the Bloomberg workstation for some stocks, we deleted those assets from the index; for example, 7 assets (5% of total market value) were deleted in the 2006 - 2007 period and 5 assets (2% of total market value) were deleted in the 2007 - 2008 period due to lack of data. This reduction did not significantly impact the total market value of the S&P100. Table (6.2) lists the tickers used in our research, grouped across the different sectors:
Table 6.2: Ticker symbols across sectors (SP100)

Sector (total number): Ticker symbols
1: Consumer Discretionary (12): AMZN, CMCSA, DIS, FOXA, GM, HD, LOW, MCD, NKE, SBUX, TGT, TWX
2: Consumer Staples (8): COST, CVS, FB, KO, MDLZ, PEP, WAG, WMT
3: Energy (10): APA, APC, COP, CVX, DVN, HAL, NOV, OXY, SLB, XOM
4: Financials (14): AIG, ALL, AXP, BAC, BK, BRK/B, COF, GS, JPM, MET, PM, SPG, USB, WFC
5: Health Care (13): ABBV, ABT, AMGN, BAX, BIIB, BMY, GILD, JNJ, LLY, MDT, MRK, PFE, UNH
6: Industrials (13): CAT, EMR, FDX, GD, GE, HON, LMT, MMM, NSC, RTN, UNP, UPS, UTX
7: Information Technology (15): AAPL, ACN, BA, CSCO, EBAY, EMC, GOOG, HPQ, IBM, INTC, MA, MSFT, ORCL, QCOM, TXN
8: Materials (6): CL, DD, DOW, FCX, MO, MON
9: Telecommunications Services (2): V, VZ
10: Utilities (7): C, EXC, F, MS, PG, SO, T
The Fama-French three-factor model is used to generate the parameters µ₀, V₀, G, ρᵢ, γᵢ, dᵢ and the associated uncertainty sets for µ and V₀ (see Appendix (C. 1) for details of the construction), and we set the joint confidence level ω = 0.95. Figure (6.4) shows the worst bound for the expected return µ under the given uncertainty set (6.29) and the worst bound for the covariance σᵢ under the given uncertainty sets (6.28) and (6.30). We can see that almost all robust expected returns are below the nominal expected returns from the historical data, and all robust covariances are above the nominal covariances computed from the historical data.
Figure 6.4: Robust bound for expected return and variance (SP100)
Robust vs. nominal portfolio performance
We then use the computed tracking portfolios in rolling out-of-sample tests and compare the performance of the portfolios. The 4 rolling periods are 2008, 2009, 2010, and 2011. The rolling process is as follows. We select two years' daily data, e.g. 2006 and 2007, as the in-sample data to construct the portfolio and then test the next year's performance, e.g. 2008, without rebalancing. After that, we replace the in-sample data with daily data from 2007 and 2008 and test portfolio performance in 2009, and so on. Both the nominal model (6.21) - (6.27) and the robust counterpart (6.35) - (6.53) are solved by Gurobi. For the initial test, we set lbᵢ = 1/n and ubᵢ = 0.7; σ equals 8 times the maximal standard deviation of the assets in the SP100, and TE equals 5 times the standard deviation of the SP100. The portfolio size was set at q = 25.
Figure 6.5: Wealth evolutions for rolling out-of-samples
Figure (6.5) shows the portfolio return evolution over the out-of-sample period; there is no rebalancing during the out-of-sample test. The returns from portfolios generated by the robust factor model are reasonably close to the S&P100 index (see Figure (6.5)) and are relatively stable without large drops. The portfolio returns generated by the nominal model may be sensitive to perturbations of the coefficients and exhibit wider divergence in returns. For example, when the market starts to decrease during time periods 2 to 4, the portfolio generated by the nominal model drops more rapidly than the index, but the robust portfolio exhibits good performance and actually dominates the performance of the S&P index during most of this period of market decline. During time periods 7 to 9, the portfolios from both models avoided the market plunge, and the performance of the robust factor model was generally better than that of the nominal model. These examples show that the robust factor model successfully protected against the uncertainty of market movements. During periods 8 to 11 a market recovery is seen and the returns from the robust portfolios actually lag the returns from the S&P 100 index and the nominal portfolio, but these latter two portfolios then drop more steeply in the decline from period 11 to 12. This indicates that robustness protects well against large drops but may not accelerate as fast in periods of steady market increases.
A similar robust mechanism protecting against downside risk can be seen in the other sub-figures of Figure (6.5). For example, when the market rapidly increased in 2009, 2010 and 2011, which represent different parameter structures for the models, the portfolios from the robust factor model still displayed relatively stable return performance compared with those from the nominal model. It is clear that the path of the robust model in periods 3 to 5 of the third sub-figure moves down more slowly than those of the nominal model and the target index.
Next we vary the portfolio size q from 10 to 75 in increments of 5 and solve both the nominal and robust models for each size. The mixed-integer solver in Gurobi for MISOCP is mainly based on the branch-and-bound algorithm, which tries to shrink the gap between the SOCP-relaxed lower bound and a feasible upper bound. For the instances tracking the S&P100, we set the running time limit for Gurobi to 100 seconds and the relative optimality gap to 10⁻⁸. In our computation, the hardest instance took 50 seconds to satisfy the gap tolerance, which indicates that all instances reached the optimal portfolio within 50 seconds, a problem size Gurobi can handle quickly. The performance metrics include: daily portfolio return, daily portfolio variance, and daily portfolio Sharpe ratio. We compare these performance metrics using in-sample and out-of-sample data. There is no rebalancing of portfolios during a testing period. For example, 630 in-sample daily returns from June 30, 2005 to December 31, 2007 were used to generate the data, and the tracking portfolios were then tested out-of-sample from December 31, 2007 to December 30, 2008, a period in which a large market decline was experienced.
The size of the uncertainty set is controlled by the joint confidence level ω in equations (C.5) and (C.6) in Appendix (C. 1). In our experience, for a very high joint confidence level, e.g. ω = 0.99, we have high confidence that the solution of the robust model protects against the parameter uncertainty, but the feasible region of the robust model may be restricted and more instances will be infeasible when the portfolio size is small, e.g. q = 15. On the other hand, for a low joint confidence level, e.g. ω = 0.55, more small-size instances have solutions but the confidence that the parameters lie in the designed ellipsoid is low. Therefore, we set a reasonable joint confidence level ω = 0.95 in our computation. The parameters (µ, Σ) in the nominal model (6.21) - (6.27) are approximated by the three-factor model (6.55), where µ = µ₀ and Σ = V₀ᵀFV₀ + D₀, and are then used in the robust model as well. We also used the linear regression to approximate the out-of-sample data and then calculated the associated out-of-sample performance. Figures (6.6) - (6.12) show these comparisons between the two models.
Figure 6.6: Model comparison - portfolio return
From Figure (6.6) it is clear that the portfolio returns from the nominal model decrease as the size increases. All instances were solved to optimality with the Gurobi mixed-integer solver. The portfolio returns decrease as larger portfolio sizes are allowed because of the diversification process: the more assets are allocated, the less risk the portfolio takes, and thus the smaller the portfolio return. Meanwhile, the portfolio returns from the robust model, both in-sample and out-of-sample, change little with respect to portfolio size, and they are generally better out-of-sample than the returns generated by the nominal model. Figure (6.6) shows that the robust model can protect against the downside risk in the estimation of the expected return vector µ₀ due to market uncertainty. We can also see that the portfolio return of the robust counterpart in the out-of-sample period (0.63‰ on average) is better than the index return in the same out-of-sample period (−0.10%).
Figure 6.7: Model comparison - portfolio variance
From Figure (6.7), we can easily see the diversification process of the portfolios generated by the nominal model as q gets larger: as the portfolio size grows, the portfolio variance decreases. The portfolio variance for portfolios generated by the robust model, in-sample and out-of-sample, is lower than that from the nominal model for the corresponding periods, which indicates that the cardinality constraint had an impact on the conic constraints representing portfolio risk, in that variance was reduced. The variance of the S&P100 index is the lowest in the in-sample period, and its value in the out-of-sample period is still lower than that of the robust models due to the diversification effect of holding more assets. The average portfolio variance of the robust model in the out-of-sample period is 0.54‰, while the SP100 variance equals 0.22‰ in the same out-of-sample period.
Figure 6.8: Model comparison - portfolio Sharpe ratio
The portfolio Sharpe ratio is defined as (E(r_port) − E(r_f)) / √(var(r_port)), where r_f is the return of 10-year U.S. Treasury bonds. From Figure (6.8), the Sharpe ratio generated by the nominal model decreases as the portfolio size increases, which means the portfolio return decreases more quickly across sizes than the portfolio variance. The in-sample Sharpe ratios of the nominal model are better than those of the robust factor model; this is reasonable since the robust counterpart considers the worst scenario for the parameters. On the other hand, the out-of-sample Sharpe ratios generated by the robust factor model are better than those generated by the nominal model; this is crucial since we want to reduce the negative effect of market uncertainty. Therefore, Figure (6.8) indicates that the portfolios generated by the robust model are more stable than those from the nominal model across different portfolio sizes q, which illustrates the benefit of the cardinality constraint in the robust factor model. The average portfolio Sharpe ratio of the portfolios generated by the robust model is 0.0176 in the out-of-sample period, while the Sharpe ratio of the S&P100 in the same out-of-sample period is −0.0218.
Figure 6.9: Model comparison - Tracking error
After solving both the nominal index tracking model and its factored robust counterpart, the tracking errors are calculated as (x − x_BM)ᵀΣ(x − x_BM), which represents the variance of the difference between the portfolio and the target index. As can be seen in Figure (6.9), the tracking errors of portfolios from the robust model are generally smaller than those of portfolios generated by the nominal model, with respect to size, both in-sample and out-of-sample. This trend is guaranteed since we generate the worst-scenario bound for the parameters, and the corresponding tracking error of the robust model is also a lower bound for the tracking error of the nominal model. We then test the tracking error to transaction costs efficient frontier generated by both the nominal and robust models. Suppose the initial portfolio wealth is b⁰, e.g. one dollar, and the trading ratio per dollar is α = 0.5%. From the initial portfolio that starts at January 1, 2008, we update the tracking error and the associated transaction cost due to the monthly rebalancing of the portfolio, and calculate the tracking error to transaction costs ratio (TE/TC ratio) as follows:

(x − x_BM)ᵀΣ(x − x_BM) / ( Σᵢ α|b¹ᵢ − b⁰ᵢ| / Σᵢ b⁰ᵢ )

where b¹ is the new portfolio wealth before charging the transaction costs, which can be calculated as b⁰(1 + µ)x. We update the in-sample data, keeping the same window length, when rolling along the time horizon. The tracking error to transaction costs ratios for the nominal and robust models are displayed in Figure (6.10):
Figure 6.10: Tracking Error to Transaction costs ratios (SP100)
The left sub-figure in Figure (6.10) shows how the nominal TE/TC ratio changes with respect to size and time period. From it we see that in some periods (periods 1, 4, and 6) the TE/TC ratios decrease noticeably as the portfolio size increases, but in other periods (periods 9 and 12) this trend is not obvious. This can be explained from two points of view. First, with a smaller size, the portfolio tracking error may be quite large and dominate the transaction costs. As more assets are allowed into the portfolio, the tracking error decreases but the transaction costs may grow, which leads to the unsmooth decreasing curves for periods 1, 4, and 6. Secondly, the portfolio allocation may change dramatically as the market dropped significantly in September 2008; transaction costs are therefore incurred more frequently, pulling the TE/TC curve down. For example, for any size the TE/TC ratios in period 9 are far lower than those in period 6. Our numerical results show that the tracking errors in these two periods remain in the same order of magnitude, but the transaction cost of period 9 is on average 15 times higher than that of period 6. Therefore, the nominal TE/TC ratios may be affected by both portfolio size and market uncertainty.
The right sub-figure, on the other hand, shows the robust TE/TC ratio with respect to size under the different rolling periods. In contrast with the nominal TE/TC ratio, the robust TE/TC ratios are nearly non-decreasing (see periods 1, 4, 9, and 12), which indicates that under the rolling strategy the transaction costs play a role as important as the tracking error. However, our numerical results show that the tracking errors without rolling remain in the same order of magnitude as those with rolling. Therefore, from a TE/TC ratio perspective, it is unnecessary to rebalance the robust portfolio frequently.
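The TE/TC computation above can be sketched as follows; the function and variable names are ours, not the thesis's.

```python
import numpy as np

def te_tc_ratio(x, x_bm, cov, b0, b1, alpha=0.005):
    """Tracking error to transaction costs ratio.

    x, x_bm : portfolio and benchmark weight vectors
    cov     : asset return covariance matrix (Sigma)
    b0, b1  : per-asset wealth before and after the monthly update
              (b1 taken before transaction costs are charged)
    alpha   : trading ratio per dollar traded (0.5% in the text)
    """
    d = np.asarray(x) - np.asarray(x_bm)
    tracking_error = d @ cov @ d
    transaction_cost = alpha * np.abs(b1 - b0).sum() / np.asarray(b0).sum()
    return tracking_error / transaction_cost
```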
Next we investigate how the TE/TC ratio changes with the trading ratio α under different sizes. The efficient frontiers are exhibited in Figure (6.11): the left and right columns show the trend of the efficient frontier under different sizes, i.e. q = 25 and q = 75, representing different strengths of partial replication, while the upper and lower rows show the trend of the efficient frontier for the nominal model and its robust counterpart, respectively.
Figure 6.11: TE/TC ratios with respect to the trading ratio α
It is not surprising that all sub-figures follow a similar decreasing pattern as α increases: the tracking errors are bounded in both models, but according to the TE/TC ratio equation we used, the rebalancing cost keeps rising as the trading ratio α goes up, whether or not a swap occurs. A more detailed view is as follows. From top to bottom, we see that the TE/TC ratio of the nominal model is generally higher than that of the robust model for any fixed trading ratio α; the main reason is that the rebalancing cost of the nominal portfolio is much higher than that of its robust counterpart (see columns 1 and 3 in Table 6.3). From left to right, the nominal TE/TC ratio at the smaller size (q = 25) is larger than at size 75, while the robust TE/TC ratio at the smaller size is lower than the corresponding ratio at the larger size. To see the reason clearly, we list the average values of the indicators over the 12 rolling periods under α = 0.1% in Table (6.3):
Table 6.3: The average TE/TC ratios under different size
         Nominal                     Robust
         25          75              25          75
TE       6.1451e-04  2.6017e-04      3.9229e-04  3.2884e-04
TC       5.1146e-04  6.1602e-05      9.6393e-06  1.6758e-05
TE/TC    2.8046      0.4308          0.0302      0.0785
From Table (6.3), the nominal model generates overall higher transaction costs than its robust counterpart. As more assets are added for diversification, the average nominal tracking error falls faster than the average trading cost, and therefore we see the sharp jump of the nominal TE/TC ratio with respect to size. The robust tracking error, on the other hand, decreases slowly while the transaction costs remain in a similar order of magnitude on average, which leads to similar but much smaller TE/TC ratios compared with the nominal model.
Another way to measure the tracking performance is by the tracking ratio. Similar to the definition of the tracking ratio in Cornuejols and Tutuncu [40], we calculate it through the following formula:

$$R_{0t} = \frac{M_I}{M_P} = \frac{\sum_{i=1}^{n} V_{it} \big/ \sum_{i=1}^{n} V_{i0}}{\sum_{j=1}^{q} x_j V_{jt} \big/ \sum_{j=1}^{q} x_j V_{j0}}$$

where $M_I = \sum_{i=1}^{n} V_{it} / \sum_{i=1}^{n} V_{i0}$ indicates the target index's movement after investment and $M_P = \sum_{j=1}^{q} x_j V_{jt} / \sum_{j=1}^{q} x_j V_{j0}$ denotes the movement of the portfolio's market value during the out-of-sample period. The ideal tracking ratio $R_{0t}$ is 1; a value over 1 means underperformance with respect to the target index, and a value less than 1 indicates excess return. Figure (6.12) displays the comparison of tracking ratios of portfolios generated from the nominal and robust models.
Figure 6.12: Model comparison - Tracking ratio
The straight line indicates a portfolio that perfectly tracks the market index, the S&P100. There was no rebalancing during the tracking period after the initial investment. From Figure (6.12), the tracking ratios of the robust model are closer to 1 than those of the nominal model across sizes in the out-of-sample test, which indicates that the factored robust tracking model has better tracking performance during the period from December 31, 2007 to December 30, 2008, a main period of the financial crisis.
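The tracking ratio computation can be sketched as follows (names are ours; the weights of unselected assets are taken as zero):

```python
import numpy as np

def tracking_ratio(V0, Vt, x):
    """Tracking ratio R_0t = M_I / M_P.

    V0, Vt : market values of all n index components at times 0 and t
    x      : portfolio weight vector over the same n assets
             (zero entries for the assets not selected)
    """
    m_index = Vt.sum() / V0.sum()              # index movement M_I
    m_port = (x * Vt).sum() / (x * V0).sum()   # portfolio movement M_P
    return m_index / m_port
```

A value of 1 means the portfolio's market value moved exactly with the index; values above 1 indicate underperformance.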
Table 6.4: Tracking ratio comparison
N = 93, where MI is the index movement, MP1 the nominal portfolio movement, and MP2 the robust portfolio movement.

q      MI      MP1 (nomi.)  MI/MP1   |MI/MP1-1|   MP2 (rob.)  MI/MP2   |MI/MP2-1|
25     0.6575  0.5819       1.1300   0.1300       0.6632      0.9914   0.0086
30     0.6575  0.6453       1.0190   0.0190       0.6464      1.0172   0.0172
35     0.6575  0.6486       1.0137   0.0137       0.6507      1.0105   0.0105
40     0.6575  0.6452       1.0191   0.0191       0.6630      0.9917   0.0083
45     0.6575  0.6444       1.0205   0.0205       0.6593      0.9973   0.0027
50     0.6575  0.6642       0.9899   0.0101       0.6551      1.0037   0.0037
55     0.6575  0.6634       0.9911   0.0089       0.6547      1.0043   0.0043
60     0.6575  0.6770       0.9713   0.0287       0.6534      1.0064   0.0064
65     0.6575  0.6843       0.9608   0.0392       0.6568      1.0011   0.0011
70     0.6575  0.6728       0.9773   0.0227       0.6552      1.0036   0.0036
75     0.6575  0.6653       0.9883   0.0117       0.6628      0.9920   0.0080
Aver.  0.6575  0.6539       1.0074   0.0294       0.6564      1.0018   0.0067
After obtaining the portfolios from the proposed models, we examine the market-value movements of the index and the portfolios over the out-of-sample period. Table (6.4) shows the details of these movements with respect to size. Clearly, the movement of the target index is constant in size, while the movements of the portfolios from the different models vary with size. For example, under q = 25, $M_I = \sum_{i=1}^{n} V_{it} / \sum_{i=1}^{n} V_{i0} = 0.6575$ indicates that the market value of the index at time t is 65.75% of its value at time 0; that is, the index value decreased by 34.25% by the end of the out-of-sample period. Meanwhile, $M_{P1} = \sum_{j=1}^{q} x_j^{nominal} V_{jt} / \sum_{j=1}^{q} x_j^{nominal} V_{j0} = 0.5819$ means the market value of the nominal portfolio dropped 41.81% over the same out-of-sample period, and the associated tracking ratio $R_{0t}^{nominal} = 0.6575/0.5819 = 1.1300$ shows that the value of the nominal portfolio shrank faster than that of the index. On the other hand, $M_{P2} = \sum_{j=1}^{q} x_j^{robust} V_{jt} / \sum_{j=1}^{q} x_j^{robust} V_{j0} = 0.6632$ means the market value of the robust portfolio dropped 33.68% by the end of the out-of-sample period, a descent 8.13% (41.81% − 33.68%) smaller than that of the nominal portfolio over the same period, and the associated tracking ratio $R_{0t}^{robust} = 0.6575/0.6632 = 0.9914$ shows that the market value of the robust portfolio declined more slowly than that of the index itself. The columns with $\left|M_I/M_P - 1\right|$ values indicate how close a constructed portfolio is to the index; the ideal value is 0. As shown in Table (6.4), the portfolios generated by the robust model are closer to the S&P100 than those from the nominal model.
6.5.3 Index Tracking using the S&P500 Index
In this section we test the proposed LR method with parameters estimated from real data. We applied the same data processing described in Section 6.5.2 to generate the parameters for the models. Table (A.1) in Appendix A lists the tickers used in our research, grouped by sector. We deleted any ticker without enough public data, and restored a ticker if a replacement occurred during the selected period. We then solved model (6.35) - (6.53) with Gurobi directly and compared the numerical results with those obtained from the LR method. We varied the portfolio size q from 20 to 300 in steps of 5 and solved the instances one by one. We first list the gap information for the instances with q ≤ 100, which represents the practical region, in Table (6.5); all numerical details are shown in Table (C.1) in Appendix (C.2).
Table 6.5: Bounds information (SP500)
q     Gurobi Obj [1000 s]   LB by LR      Feasi. UB     Gap to Gurobi   Gap by LR   Time by LR
20 0.00789386 0.00766973 0.00789601 0.03% 2.87% 1206.31
25 0.00780381 0.00760901 0.00782685 0.30% 2.78% 1790.17
30 0.00775582 0.00760910 0.00777614 0.26% 2.15% 1828.48
35 0.00771529 0.00760903 0.00771885 0.05% 1.42% 1906.24
40 0.00767989 0.00760811 0.00768101 0.01% 0.95% 2103.63
45 0.00766137 0.00765416 0.00766317 0.02% 0.12% 661.12
50 0.00764986 0.00764647 0.00765126 0.02% 0.06% 203.49
55 0.00764921 0.00761658 0.00765333 0.05% 0.48% 2383.81
60 0.00764508 0.00764096 0.00764556 0.01% 0.06% 2163.47
65 0.00764522 0.00764334 0.00764666 0.02% 0.04% 2241.48
70 0.00764938 0.00764359 0.00764980 0.01% 0.08% 3332.34
75 0.00765234 0.00764482 0.00765291 0.01% 0.11% 2009.29
80 0.00765870 0.00764463 0.00766008 0.02% 0.20% 1571.46
85 0.00766614 0.00762793 0.00766623 0.00% 0.50% 1539.80
90* 0.00767435 0.00763362 0.00767417 0.00% 0.53% 1625.92
95 0.00768440 0.00765516 0.00779374 1.42% 1.78% 1345.64
100 0.00769624 0.00764036 0.00771742 0.28% 1.00% 1244.59
Average / / / 0.15% 0.89% 1715.13
The running time for Gurobi was capped at 1000 seconds. From Table (6.5), we see that the solution from the LR method is close to the solution from Gurobi; the average gap is 0.15%. Meanwhile, the running time of the LR method is somewhat longer than Gurobi's (on average 1715 vs. 1000 seconds). Still, it should not be surprising that the LR method can converge quickly to a near-optimal solution: for q = 45 and 50, the LR method consumed no more than half the time taken by Gurobi. A likely reason is that the generated inequalities (6.9) and (6.10) improved the iteration procedure; we demonstrate this speed-up shortly. The average gap of the LR method is 0.89%, indicating the solutions are close to the global optimum. In some instances, e.g. q = 90, the objective value from the LR method is slightly better than the Gurobi objective.
Next we detail the comparison of the LR method with and without inequalities (6.9) and (6.10) in the first three sub-figures of Figure (6.13), and show a more precise view of the iteration process under different initial dual variables π+ for the instance q = 50 in the last sub-figure.
Figure 6.13: Iteration details (SP500)
As shown in Figure (6.13), the LR gaps usually shrink to the Gurobi solution after 40 - 60 iterations, and the LR method with the designed cuts (6.9) and (6.10) converges faster than LR without them. For example, LR with the cuts required only 20 iterations to reach a small gap, while LR without the cuts consumed 80 iterations to obtain a gap of similar scale. In general, LR with the designed cuts saved about 60% of the iterations compared with LR without them in our computation. Therefore we apply the designed cuts and set the iteration limit V = 100 for the LR method. The other parameters for the models and the LR method are set as follows: $x_{BM}$ is the normalization of the market capitalizations of the S&P500 components, σ = 8 · max(diag(Σ)), TE = 7 · STD_{S&P500}, lb = 1/n, and ub = 1. The gap stopping criterion is ε = 10^{-4}, and the initial dual variables are π− = 0 and π+ = 1; other initial values of π+ that speed up the LR process can also be applied. For example, we observed that setting some elements of π+ to 0 and the rest to 1 yields higher solution precision. We summarize the bounds and gap information of Table (C.1) in Appendix (C.2) in Figure (6.14).
Figure 6.14: Bounds and gap comparison by LR method (SP500)
The left side of Figure (6.14) shows the lower and upper bounds from the LR method with respect to size. In most instances the LR solutions are close to Gurobi's. The LR method can generally shrink the gap between the lower and upper bounds below 5% in the range q ∈ [20, 200] ∪ [255, 300]. Although the gap trend increases over the range q ∈ [205, 250], the LR solution remains close to the Gurobi solution, which indicates that high-quality solutions can still be obtained. The instances with better objective values are marked in Table (C.1) in Appendix (C.2), i.e. q = 90, 125, 215, 240, 275.
From the right side of Figure (6.14) we see that the average gap to Gurobi is 0.21%. Meanwhile, the average gap of the LR method is 4.16% and its average solving time is 1500 seconds. 39 out of 57 instances have a relatively small gap of less than 5%, and 6 out of 57 instances have a large gap of over 10% (the worst gap equals 13.08%). These hard instances lie mainly in the impractical range q ∈ [220, 250].
6.5.4 Index Tracking using the Russell 1000 Index
The Russell 1000 Index is another important market-capitalization based index, representing nearly 90% of the total market capitalization of the US equity market. It has been used to build several index ETFs, e.g. the iShares Russell 1000 ETF and the Vanguard Russell 1000 ETF. Because the Russell 1000 Index includes more companies than the S&P 500, it diversifies broadly across the whole market but may also be computationally expensive under partial replication approaches such as the tracking models developed in Section 6.4. Therefore we next apply the LR method described in Section 6.3 to track the Russell 1000 Index.
A parameter generation process similar to that described in Section 6.5.2 was applied for the Russell 1000 Index. Table (6.6) lists the comparison between the solutions from the LR method and Gurobi. The running time for both methods was set to 3600 seconds. As can be seen, the gaps of the LR method are better than the Gurobi gaps for small sizes q, e.g. q = 35, 50, while the LR gaps are close to the Gurobi gaps for large sizes q, e.g. q ≥ 95. This is reasonable since as q increases, the feasible region of the robust model is relaxed and both gaps improve. Moreover, for the instances q = 35, 50, 95, the gaps and feasible objectives of the LR method are superior to those from Gurobi, which indicates that the LR method can converge faster than Gurobi's branch-and-bound based mixed-integer solver within the allotted time.
Table 6.6: Bounds information (Russell 1000)
q     Gurobi LB    Gurobi Obj   Gurobi Gap   LB by LR     Feasi. UB    LR Gap
35* 0.00894408 0.00969068 7.7043% 0.00897855 0.00945787 5.0680%
50* 0.00894609 0.00950895 5.9193% 0.00890467 0.00924792 3.7117%
65 0.00894726 0.00913377 2.0420% 0.00890117 0.00913395 2.5485%
80 0.00894703 0.00911471 1.8397% 0.00894317 0.00911878 1.9258%
95* 0.00895056 0.00915830 2.2683% 0.00889604 0.00907312 1.9517%
110 0.00894968 0.00904164 1.0171% 0.00895971 0.00904435 0.9358%
125 0.00895085 0.00908289 1.4537% 0.00897209 0.00907164 1.0974%
140 0.00894871 0.00902281 0.8212% 0.00900732 0.00902415 0.1864%
Average / / 2.8832% / / 2.1782%
6.5.5 Index Tracking using the Russell 3000 Index
In this section, we apply our LR method to the Russell 3000 Index, which represents approximately 98% of the investable US equity market. Categorized similarly to the S&P 500 in Table (A.1), the assets of the Russell 3000 are selected from 10 sectors but with different ticker symbols and sector weights. After deleting the assets without adequate data for the factor-based robust index tracking model (6.35) - (6.53), 2359 assets remain, accounting for 95% of the index value. We now set σ = max(diag(Σ)), TE = 4 · STD_{R3000}, and keep the other parameters the same as for the S&P500. We first show the Gurobi iteration details for solving the model under different q in Figure (6.15).
Figure 6.15: Gurobi iteration details for different size q
We set the running time for Gurobi to 6 hours; 2 instances (q = 30 and 50) used up the running time, and the other 4 instances (q = 70, 110, 150 and 190) terminated due to running out of memory. It is clear that the gap could not be significantly improved after 5000 seconds in our computation. For example, for the instance q = 30, the bound gap remained unchanged at 18% after 3600 seconds. In another instance (q = 50), the bound gap after 1 hour and after 6 hours of running was 13.1% and 12.1% respectively, i.e. a 0.2% improvement per hour. A further, unexpected issue is that most of the instances ran into memory-capacity problems, and some instances still left large gaps before the solver crashed, e.g. a 23% gap for q = 70 in the figure. Our decomposition-based LR method does not have this issue. Based on the observations in Figure (6.15), we set the running time for the solver to 7200 seconds. After solving the model with both approaches, we calculate the relative gap as the difference between the upper and lower bounds divided by the upper bound, and the gap to the Gurobi solution as the difference between the LR and Gurobi feasible objectives divided by the Gurobi feasible objective value. The computational results are listed in Table (6.7):
Table 6.7: Bounds information (Russell 3000), TE=4STD
       Gurobi (7200 s)                     LR method                      Gap to     Time by
q      LB (1e-03)  UB (1e-03)  Gap         LB (1e-03)  UB (1e-03)  Gap    Gurobi     LR
20 2.0038 2.7278 26.54% 2.5625 2.8059 8.67% 2.86% 1558.4
30 2.0019 2.4402 17.96% 2.3581 2.5479 7.45% 4.41% 1754.5
40 2.1246 2.314 8.18% 2.1469 2.4234 11.41% 4.73% 1726.6
50 2.0015 2.2761 12.06% 2.1420 2.3418 8.53% 2.89% 2054.7
60 2.0513 2.197 6.63% 2.1613 2.2852 5.42% 4.02% 2175.4
70 2.0190 2.6231 23.03% 2.0907 2.2312 6.30% -14.94% 2131.3
80 2.015 2.1398 5.83% 2.0668 2.1783 5.12% 1.80% 2180.3
90 2.0359 2.1248 4.19% 2.0444 2.1412 4.52% 0.77% 2368.8
100 2.0409 2.1321 4.28% 2.0014 2.1199 5.59% -0.57% 2566.1
110 2.0396 2.1560 5.40% 2.0012 2.1062 4.99% -2.31% 2646.3
120 2.0064 2.0969 4.32% 2.0224 2.0970 3.56% 0.00% 2776.3
130 2.0135 2.0943 3.86% 2.0223 2.0901 3.24% -0.20% 2954.7
140 2.0081 2.083 3.60% 2.0015 2.0757 3.58% -0.35% 3026.9
150 2.0393 2.0841 2.15% 2.0222 2.0665 2.14% -0.85% 2832.4
160 2.0099 2.0805 3.39% 2.0223 2.0791 2.73% -0.07% 3165.8
170 2.0141 2.0654 2.48% 2.0013 2.0535 2.54% -0.58% 3264.2
180 2.0133 2.0683 2.66% 2.0220 2.0478 1.26% -0.99% 3289.7
190 2.0149 2.1192 4.92% 2.0218 2.0431 1.04% -3.59% 3117.3
200 2.0180 2.0618 2.12% 2.0015 2.0410 1.93% -1.01% 3150.0
Aver. 2.0248 2.2044 7.56% 2.0901 2.1987 4.74% -0.21% 2565.3
The average running time of the LR method is around 2500 seconds, which represents a 65 percent saving in running time. The average gaps are 7.56% for the Gurobi solver and 4.74% for our LR method, and the feasible solutions are close to each other (-0.21% on average). Specifically, in the range q ≤ 100 our LR method generally obtains smaller gaps and better feasible solutions than Gurobi. Although there is a larger bound gap for the instance q = 40, our LR method generated better lower and upper bounds, and the LR feasible solution is 4.73% better than Gurobi's. For the instance q = 70, on the other hand, we obtained a smaller gap but a worse feasible solution, probably because the solver's presolve process generated a high-quality initial solution. In the range 110 ≤ q ≤ 200, both methods return similar gaps and feasible solutions; the likely reason is that when a larger portfolio size is allowed, both methods approach an optimal or near-optimal solution within the allotted time or iterations, and convergence becomes slower and slower. If we now tighten the tracking error to TE = 3 · STD_{R3000} with the other parameters unchanged, both methods are infeasible for q < 140, and the computational results for the range 140 ≤ q ≤ 200 are shown in Table (6.8):
Table 6.8: Bounds information (Russell 3000), TE=3STD
       Gurobi (7200 s)                     LR method                      Gap to     Time by
q      LB (1e-03)  UB (1e-03)  Gap         LB (1e-03)  UB (1e-03)  Gap    Gurobi     LR
140 3.1183 3.3565 7.10% 3.2091 3.3956 5.49% 1.17% 1919.4
150 3.1135 3.3351 6.64% 3.1745 3.3650 5.66% 0.90% 1901.1
160 3.1106 3.3416 6.91% 3.2078 3.3513 4.28% 0.29% 1873.8
170 3.1131 3.3005 5.68% 3.1738 3.3318 4.74% 0.95% 1898.4
180 3.1145 3.3121 5.97% 3.1735 3.3117 4.17% -0.01% 1883.1
190 3.1117 3.2895 5.41% 3.1730 3.2936 3.66% 0.12% 1890.6
200 3.1219 3.2731 4.62% 3.1072 3.2623 4.75% -0.33% 1985.3
Aver. 3.1148 3.3155 6.05% 3.1741 3.3302 4.68% 0.44% 1907.4
As shown in Table (6.8), the LR method consistently outperforms Gurobi in terms of running time and bound gaps. Our LR method saved 47% of the running time on average while obtaining gaps similar to Gurobi's (4.68% vs. 6.05%). More importantly, 11 out of the 26 instances in Tables (6.7) and (6.8) returned better objective values (at least 1% better) under the LR approach, which indicates that the developed LR method is efficient for solving CCCP and can be seen as complementary to branch-and-cut based algorithms. In a nutshell, our LR method is much faster than Gurobi at generating a better solution with an acceptable bound gap for practical smaller sizes, and it can also return a near-optimal solution within a reasonable time for larger portfolio sizes, e.g. q = 200 for tracking the Russell 3000.
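The two gap measures reported in Tables (6.7) and (6.8) can be sketched as follows; the function names are ours:

```python
def relative_gap(lb, ub):
    """Relative bound gap: (UB - LB) / UB."""
    return (ub - lb) / ub

def gap_to_gurobi(lr_obj, gurobi_obj):
    """Signed gap of the LR feasible objective to Gurobi's; negative
    means LR found a better feasible solution for a minimization."""
    return (lr_obj - gurobi_obj) / gurobi_obj
```

For the q = 20 row of Table (6.7), `relative_gap(2.0038, 2.7278)` recovers the reported 26.54% Gurobi gap and `gap_to_gurobi(2.8059, 2.7278)` the reported 2.86% gap to Gurobi.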
6.6 Conclusions and Discussion
We designed a Lagrangian decomposition approach for the proposed CCCP in this chapter. We also generated two types of valid cuts that speed up the LR algorithm. The index tracking problem can be seen as one application of the CCCP framework. A factor-based robust enhanced index tracking model was developed, and a robust version of the Fama-French three-factor risk model was used as the basis for constructing robust counterparts of the nominal tracking model. We highlight our contributions as follows. First, computational results using the S&P100 index as a benchmark show that the robust counterpart has better out-of-sample tracking performance and Sharpe ratios than the portfolios generated by the nominal models. Second, computational results from tracking the S&P 500, Russell 1000, and Russell 3000 demonstrate the effectiveness of our approach on the class of CCCP problems we considered. That is, (1) the feasible solution from the LR method is at least close to the solution from Gurobi; (2) the average gap of the LR method is lower than Gurobi's (see the Russell 1000 and Russell 3000 experiments), and better solutions are obtained in some instances (see the Russell 3000 experiments). Extending the proposed LR method to different types of problems, e.g. the robust p-median problem, will be the subject of future research.
Chapter 7
Conclusion and Future Research
7.1 Conclusion
In this thesis, index tracking and cardinality constrained financial planning problems under uncertainty were studied through different modelling approaches. The models involve different investment goals and restrictions, but each incorporates the same type of cardinality constraint. As described in Chapter 1, portfolio selection models with cardinality constraints as a part of a decision support system are reasonable in practice but NP-hard. To understand the developed models and provide insight into them, LR-based algorithms with tailored heuristics were applied in the associated chapters to deal with the computational challenges and generate optimal portfolios. The main contribution of this document is therefore a study of cardinality constrained portfolio selection models and a detailed analysis of three applications in which mathematical programming and financial modelling are closely combined to produce effective solution methodologies and management strategies. All of this work can be used to support the one-fund theorem in practice. We summarize the main outcomes of the design and implementation in this thesis as follows:
• We studied different portfolio selection models that contain a comprehensive set of practical management constraints. Among these features, limiting the portfolio size proved the most difficult and drew the largest attention in the design. For example, in Chapter 4 we incorporated cardinality, buy-in threshold, turnover, and sector limit constraints into one index tracking model; in Chapter 5 we combined cardinality, cash-flow rebalancing, and transaction cost constraints in a stochastic programming framework; and in Chapter 6 we considered cardinality, portfolio risk control, and tracking error constraints in a robust index tracking model. Our detailed investigation of practical constraints offers a much clearer insight into the behaviour of portfolio management.
• We investigated two different approaches to capture the numerous financial uncertainties involved in security returns, risk, and other investment goals. In Chapter 5 we used stochastic mixed-integer programming to model future uncertainties related to asset returns and index values, while in Chapter 6 we applied a robust optimization modelling structure to protect against uncertainty in model parameters, including asset returns and variances. Our numerical results based on real data show that both techniques deal fairly well with the parameter uncertainty arising from market volatility.
• We efficiently solved the portfolio selection models constructed in Chapters 4 to 6 using a unified dual decomposition framework with embedded problem-specific heuristics. In Chapter 4 we applied a Variable Neighborhood Search heuristic to obtain high-quality solutions for the index tracking problem by utilizing the bound information from the Semi-Lagrangian relaxation. In Chapter 5 we used the Progressive Hedging algorithm, which allows the designed Tabu Search and LR sub-solvers to be embedded, to generate solutions for cardinality constrained financial planning problems. In Chapter 6 we applied the LR algorithm to decompose the factor-based robust index tracking problem and generated high-quality solutions. Overall, our competitive results against various benchmarks show the effectiveness of the LR methods, which can be used as an alternative for handling large-scale applications.
Further investigation will provide more insight into these approaches, and the results may improve or broaden the scope of this document. To fully exploit the advantageous characteristics of the proposed models, several other features of the developed models are highlighted in the next section.
7.2 Future Research
In this section we discuss several future directions that may extend or continue the developments in this document and in the field of financial engineering and optimization.
7.2.1 Modelling discussion
One further modelling development is to deal with the uncertain parameters of the model in Chapter 4. Robust optimization is an applicable approach that may be integrated or used independently to model the problems considered. The robust counterpart of objective function (4.1) can be formulated as follows:

$$\max \; \alpha \quad (7.1)$$
$$\text{s.t.} \quad \max_x \min_\rho \sum_{i=1}^{n}\sum_{j=1}^{n} \rho_{ij}x_{ij} > \alpha \quad (7.2)$$

The tractability of the robust counterpart (7.1) - (7.2) depends on the structure of the uncertainty set. For example, the robust version remains a linear integer program if we assume the uncertain $\rho_{ij}$ lies in a box-type perturbation set, i.e. $\rho_{ij} \in \left\{\bar{\rho}_{ij} + \varsigma_{ij}\,\hat{\rho}_{ij} \;\middle|\; \|\varsigma_{ij}\|_\infty \le 1\right\}$.
However, such a formulation may be too conservative to retain enough managerial flexibility. An elliptical uncertainty set, $\|\varsigma\|_2 \le 1$, is more reasonable, but it is hard to obtain the statistical properties of $\rho_{ij}$ directly. To overcome this drawback, one strategy is to compute the statistical properties of $\rho_{ij}$ in a transformed space via the Fisher z-transformation [57]. Let

$$z_{ij} = \frac{1}{2}\ln\left(\frac{1+\rho_{ij}}{1-\rho_{ij}}\right) \quad (7.3)$$

Suppose that $r^T = (r_1, r_2, \cdots, r_n) \sim N(\mu, \Sigma)$ and that the observations $(r_{1t}, r_{2t}, \cdots, r_{nt})$ are independent for $t = 1, \cdots, T$. Then the random variable $z \sim N\left(\frac{1}{2}\ln\left(\frac{1+\rho}{1-\rho}\right), \frac{1}{T-3}\right)$, where $\rho$ is the true correlation coefficient and $T$ is the sample size. Building robustness for $z$ is relatively easier than for $\rho$, and we can recover $\rho$ by setting

$$\rho_{ij} = \frac{e^{2z_{ij}} - 1}{e^{2z_{ij}} + 1} \quad (7.4)$$
Then substituting (7.4) into (4.1):

$$\max \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{e^{2z_{ij}}-1}{e^{2z_{ij}}+1}\, x_{ij} \iff \max \sum_{i=1}^{n}\sum_{j=1}^{n}\left(1-\frac{2}{e^{2z_{ij}}+1}\right) x_{ij}$$

$$\iff \max \sum_{i=1}^{n}\sum_{j=1}^{n} x_{ij} + \max \sum_{i=1}^{n}\sum_{j=1}^{n}\left(-\frac{2}{e^{2z_{ij}}+1}\right) x_{ij} \iff n + \max \sum_{i=1}^{n}\sum_{j=1}^{n}\left(-\frac{2}{e^{2z_{ij}}+1}\right) x_{ij}$$

$$\iff \max \sum_{i=1}^{n}\sum_{j=1}^{n}\left(-\frac{2}{e^{2z_{ij}}+1}\right) x_{ij} \iff \min \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{2}{e^{2z_{ij}}+1}\, x_{ij}$$

Note that $\sum_{i=1}^{n}\sum_{j=1}^{n} x_{ij} = n$ by summing constraint (4.3) over the n values of i. Now model (4.1) - (4.5) is equivalent to:
$$\min \sum_{i=1}^{n}\sum_{j=1}^{n} Z_{ij} x_{ij} \quad (7.5)$$
$$\text{s.t.} \quad \sum_{j=1}^{n} y_j = q \quad (7.6)$$
$$\sum_{j=1}^{n} x_{ij} = 1, \quad \forall i = 1, \cdots, n \quad (7.7)$$
$$x_{ij} \le y_j, \quad \forall i = 1, \cdots, n,\; j = 1, \cdots, n \quad (7.8)$$
$$x_{ij}, y_j \in \{0, 1\} \quad (7.9)$$

where $Z_{ij} = \frac{2}{e^{2z_{ij}}+1}$. Numerical results on the S&P100 show that the solution of model (7.5) - (7.9) is exactly the same as the solution of the basic index tracking model (4.1) - (4.5). The relationship between the robustness of the parameters in the two models is given in the following theorem:
Theorem 1. Building robustness for the parameter $Z_{ij}$ of model (7.5) - (7.9) in the Fisher z-transformation space is equivalent to building robustness for the parameter $\rho_{ij}$ of model (4.1) - (4.5) in the original space.
Proof. $z_{ij} = \frac{1}{2}\ln\left(\frac{1+\rho_{ij}}{1-\rho_{ij}}\right) \iff 2z_{ij} = \ln\left(\frac{1+\rho_{ij}}{1-\rho_{ij}}\right)$

$\iff e^{2z_{ij}} + 1 = \frac{1+\rho_{ij}}{1-\rho_{ij}} + 1 = \frac{2}{1-\rho_{ij}}$

$\iff Z_{ij} = \frac{2}{e^{2z_{ij}}+1} = 2 \cdot \frac{1-\rho_{ij}}{2} = 1 - \rho_{ij}$

Since $Z_{ij}$ and $\rho_{ij}$ are in one-to-one correspondence, the robust solution of model (7.5) - (7.9) equals the robust solution of model (4.1) - (4.5).
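The transformation (7.3), its inverse (7.4), and the identity $Z_{ij} = 1 - \rho_{ij}$ from the proof can be checked numerically; this small sketch is ours, not part of the thesis:

```python
import math

def fisher_z(rho):
    """Fisher z-transformation (7.3)."""
    return 0.5 * math.log((1 + rho) / (1 - rho))

def inverse_fisher_z(z):
    """Inverse transformation (7.4): recover rho from z."""
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

def Z_coeff(z):
    """Objective coefficient Z_ij = 2 / (e^{2 z_ij} + 1)."""
    return 2 / (math.exp(2 * z) + 1)

rho = 0.37
z = fisher_z(rho)
assert abs(inverse_fisher_z(z) - rho) < 1e-12   # round trip
assert abs(Z_coeff(z) - (1 - rho)) < 1e-12      # Theorem 1 identity
```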
Now the remaining task is to study the robust counterpart of model (7.5) - (7.9). Suppose that the random vector $r^T = (r_1, r_2, \cdots, r_n) \sim N(\mu, \Sigma)$ follows a multivariate normal distribution; then
$$z \sim N\left(\frac{1}{2}\ln\left(\frac{1+\Delta}{1-\Delta}\right), \frac{1}{T-3}\right)$$
$$2z \sim N\left(\ln\left(\frac{1+\Delta}{1-\Delta}\right), \frac{4}{T-3}\right)$$
$$e^{2z} \sim \log N\left(\ln\left(\frac{1+\Delta}{1-\Delta}\right), \frac{4}{T-3}\right)$$
$$e^{2z}+1 \sim \log N\left(\ln\left(\frac{1+\Delta}{1-\Delta}\right)+1, \frac{4}{T-3}\right) = \log N\left(\ln\left(\frac{e(1+\Delta)}{1-\Delta}\right), \frac{4}{T-3}\right)$$
$$\frac{1}{e^{2z}+1} \sim \log N\left(-\ln\left(\frac{e(1+\Delta)}{1-\Delta}\right), \frac{4}{T-3}\right) = \log N\left(\ln\left(\frac{1-\Delta}{e(1+\Delta)}\right), \frac{4}{T-3}\right)$$
$$\frac{2}{e^{2z}+1} \sim \log N\left(\ln\left(\frac{1-\Delta}{e(1+\Delta)}\right)+\ln 2, \frac{4}{T-3}\right) = \log N\left(\ln\left(\frac{2(1-\Delta)}{e(1+\Delta)}\right), \frac{4}{T-3}\right)$$

Therefore $Z_{ij} = \frac{2}{e^{2z_{ij}}+1}$ has a log-normal distribution with mean $\ln\left(\frac{2(1-\Delta_{ij})}{e(1+\Delta_{ij})}\right)$ and variance $\frac{4}{T-3}$. However, objective (7.5), $\sum_{i=1}^{n}\sum_{j=1}^{n} Z_{ij}x_{ij}$, a linear combination of log-normal random variables, is in general not log-normally distributed. Thus a probability constraint is unlikely to apply to objective function (7.5). Here we build the robustness for objective (7.5) by following the derivation steps in [26]:
Given the nominal value \bar{Z}_{ij} computed from the current \rho_{ij}, the realized value may perturb around \bar{Z}_{ij} with some probability. We can then describe any perturbed coefficient as lying in the following ellipsoid \varepsilon_{ij} with center \bar{Z}_{ij}, where P \in R^{n \times n} and \varsigma \in R^{n \times n}:

\bar{Z}_{ij} + \tilde{Z}_{ij} \in \varepsilon_{ij} = \left\{ \bar{Z}_{ij} + P_{ij} \varsigma_{ij} \;\middle|\; \|\varsigma\|_2 \le 1 \right\}

where P_{ij} is the standard deviation and \varsigma_{ij} is the component of the perturbation direction for Z_{ij} in \varepsilon_{ij}. Any weight matrix X = [x_{ij}] in the ellipsoid \varepsilon_{ij} can be mapped via the relationship:

\varsigma_{ij} = \frac{\{P^T X\}_{ij}}{\|P^T X\|_2}
where P^T X denotes the matrix product of P^T and X. For the objective (7.5):

\min_x \max_Z \sum_{i=1}^{n} \sum_{j=1}^{n} Z_{ij} x_{ij}

= \min_x \sup_{\bar{Z}_{ij} + \tilde{Z}_{ij} \in \varepsilon_{ij}} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \bar{Z}_{ij} + \tilde{Z}_{ij} \right) x_{ij}

= \min_x \sup_{\|\varsigma\|_2 \le 1} \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{Z}_{ij} x_{ij} + \varsigma^T P^T X

= \min_x \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{Z}_{ij} x_{ij} + \frac{X^T P P^T X}{\|P^T X\|_2}

= \min_x \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{Z}_{ij} x_{ij} + \frac{\|P^T X\|_2^2}{\|P^T X\|_2}

= \min_x \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{Z}_{ij} x_{ij} + \|P^T X\|_2
Then the robust counterpart of model (7.5) - (7.9) can be formulated as follows:

\min \phi    (7.10)

s.t. \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{Z}_{ij} x_{ij} + \|P^T X\|_2 \le \phi    (7.11)

\sum_{j=1}^{n} y_j = q    (7.12)

\sum_{j=1}^{n} x_{ij} = 1, \forall i = 1, \cdots, n    (7.13)

x_{ij} \le y_j, \forall i = 1, \cdots, n, \; j = 1, \cdots, n    (7.14)

x_{ij}, y_j \in \{0, 1\}, \; \phi \in R    (7.15)

where \bar{Z} and P are the mean matrix and the standard deviation matrix of the random variable Z. Since Z \sim \log N\left( \ln\left(\frac{2(1-\Delta)}{e(1+\Delta)}\right), \frac{4}{T-3} \right), we can set \bar{Z} = \ln\left(\frac{2(1-\Delta)}{e(1+\Delta)}\right) and P = \frac{2}{\sqrt{T-3}} I (the standard deviation corresponding to the variance \frac{4}{T-3}) in the robust index tracking model (7.10) - (7.15). Solving the proposed robust index tracking model is non-trivial, but the LR methods can still be applied to obtain bound information.
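As an illustration of how constraint (7.11) might be evaluated for a candidate assignment, the sketch below assumes P = \sigma I and reads \|P^T X\|_2 as the Frobenius norm of the matrix P^T X; the function and parameter names are hypothetical:

```python
import numpy as np

def robust_objective(Z_bar, P, X):
    # Nominal cost  sum_ij Zbar_ij x_ij  plus the protection term ||P^T X||_2
    # (matrices treated as stacked vectors, i.e. the Frobenius norm).
    nominal = float(np.sum(Z_bar * X))
    protection = float(np.linalg.norm(P.T @ X))
    return nominal + protection

n, sigma = 3, 0.5
Z_bar = np.full((n, n), 0.2)         # nominal coefficients
P = sigma * np.eye(n)                # diagonal perturbation scale
X = np.eye(n)                        # each asset i assigned to center j = i
phi = robust_objective(Z_bar, P, X)  # 0.6 + 0.5 * sqrt(3)
```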
Although the selection models presented in Chapter 5 and Chapter 6 were developed to capture the important characteristics of portfolio selection under an uncertain environment, it would be interesting to incorporate some of the more detailed aspects investigated in Chapter 4 into these models. For instance, sector limit constraints could be incorporated into the Financial Planning problem to simplify the network structure. We tested the TE/TC ratio to show the advantage of the factor-based robust model in out-of-sample testing in Section 6.5.2, but we could also incorporate the transaction cost constraint into the developed index tracking models and formulate the problem as a single optimization program. Another aspect for further study is applying different scenario generation techniques, such as Monte Carlo simulation, to capture the parameter uncertainty of the stochastic financial planning model. As mentioned in the abstract, the models we developed can also be extended to other management applications; for example, the factor-based robust model could be used to study the facility location problem, where a decision maker needs to determine the locations of potential facilities so that uncertain demand can be satisfied.
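As a concrete illustration of the scenario generation idea, Monte Carlo sampling of return scenarios from an estimated distribution might look like the following sketch (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
mu = np.array([0.05, 0.03])              # estimated mean returns
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])         # estimated return covariance
# Each row is one sampled return scenario for the stochastic program
scenarios = rng.multivariate_normal(mu, Sigma, size=1000)
probs = np.full(1000, 1.0 / 1000)        # equal scenario probabilities
```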
7.2.2 Algorithm discussion
We have developed three LR-based decomposition algorithms, each embedding different problem-specific heuristics, to solve a set of NP-hard problems. Although we have compared our methods with the most well-known MIP solvers, which are based on branch-and-bound algorithms with sophisticated cuts, it is possible to further improve the results and computational time by combining different computing technologies. The first direction is to use Message Passing Interface (MPI) code to parallelize the sub-problems so that the computational time can be significantly reduced. This step is particularly useful for the Progressive Hedging algorithm, in which the financial models give rise to numerous scenarios for a parameter. The second direction involves decomposition strategies from different angles: we applied dual decomposition throughout this document, but the primal decomposition in [139] might offer additional insights into the models. Finally, we could combine the LR framework with different cutting planes to speed up the convergence process in Chapter 6.
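The fan-out/aggregate pattern behind the first direction can be sketched as follows, using a Python thread pool as a stand-in for MPI; the subproblem here is a placeholder where a real implementation would invoke the MIP/QP solver:

```python
from concurrent.futures import ThreadPoolExecutor

def solve_subproblem(scenario):
    # Placeholder for one scenario subproblem of the decomposition;
    # here each "scenario" is (id, data) and the "solution" a toy value.
    sid, data = scenario
    return sid, sum(data)

scenarios = [(i, [i, i + 1, i + 2]) for i in range(4)]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(solve_subproblem, scenarios))
# Aggregate the per-scenario values, e.g. into a Lagrangian-style bound
bound = sum(results.values())
```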
The cardinality constraint studied in this thesis is one important approach to limiting the portfolio size. An alternative way to obtain a sparse portfolio is norm regularization. For example, Burmeister et al. [27] applied a trading budget constraint, representable by the \ell_1-norm, to approximate the cardinality constraint, i.e., replacing \|x\|_0 \le K by \|x\|_1 \le \varepsilon, where \|x\|_0 = \sum_i |x_i|^0 counts the nonzero positions of x. The authors tested different alternative trading costs and found that portfolios in the low-to-medium size range generally have a smaller replication error. To achieve a desired portfolio size, we can adjust the penalty parameters in an algorithm by solving a sequence of continuous approximations to obtain a suitable budget \varepsilon. For example, with \ell_1-norm regularization the index tracking model developed in Chapter 6 reduces, in each iteration, to SOCPs that can be handled efficiently by interior point methods, and finally generates a sparse portfolio whose size is close or equal to the required size. Therefore, this method can also handle large-scale computation and can serve as a potential comparison benchmark for our LR method in this thesis.
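One simple way to realize the "sequence of continuous approximations" idea is to increase an \ell_1 penalty until the desired sparsity is reached. The soft-thresholding sketch below is only illustrative (it is not the SOCP-based procedure of Chapter 6, and the helper names are hypothetical):

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of lam * ||x||_1: shrinks each entry toward zero
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_weights(v, target_size, lam=0.0, step=0.05, max_iter=200):
    # Increase the l1 penalty until at most `target_size` entries survive:
    # a toy stand-in for tuning the budget eps in ||x||_1 <= eps.
    x = v.copy()
    for _ in range(max_iter):
        x = soft_threshold(v, lam)
        if np.count_nonzero(x) <= target_size:
            break
        lam += step
    return x

v = np.array([0.9, -0.05, 0.4, 0.02, -0.6])
x = sparse_weights(v, target_size=3)   # keeps only the 3 largest entries
```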
Bibliography
[1] us.spindices.com.
[2] http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
[3] http://www.standardandpoors.com/es_LA/web/guest/home, June 2011.
[4] C.J. Adcock and N. Meade. A simple algorithm to incorporate transactions costs in
quadratic optimisation. European Journal of Operational Research, 79(1):85 – 94, 1994.
[5] C Alexander, A Dimitriu, and A Malik. Indexing and statistical arbitrage - tracking error
or cointegration? Journal of Portfolio Management, 31(2):50–63, 2005.
[6] F. Alizadeh and D. Goldfarb. Second-order cone programming. Mathematical Program-
ming, 95(1):3–51, 2003.
[7] E.D. Andersen, C. Roos, and T. Terlaky. On implementing a primal-dual interior-point
method for conic quadratic optimization. Mathematical Programming, 95(2):249–277,
2003.
[8] Alper Atamturk and Vishnu Narayanan. Conic mixed-integer rounding cuts. Mathemat-
ical Programming, 122(1):1–20, 2010.
[9] Mokhtar S. Bazaraa, Hanif D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory
And Algorithms. Wiley-Interscience, May 2006.
[10] J.E. Beasley, N. Meade, and T.-J. Chang. An evolutionary heuristic for the index tracking
problem. European Journal of Operational Research, 148(3):621 – 643, 2003.
[11] C. Beltran, C. Tadonki, and J.Ph. Vial. Solving the p-median problem with a semi-
lagrangian relaxation. Computational Optimization and Applications, 35(2):239–260,
2006.
[12] Aharon Ben-Tal, Tamar Margalit, and Arkadi Nemirovski. Robust modeling of multi-
stage portfolio problems. In Hans Frenk, Kees Roos, Tams Terlaky, and Shuzhong Zhang,
editors, High Performance Optimization, volume 33 of Applied Optimization, pages 303–
328. Springer US, 2000.
[13] Aharon Ben-Tal and Arkadi Nemirovski. Robust solutions of linear programming prob-
lems contaminated with uncertain data. Mathematical Programming, 88(3):411–424, 2000.
[14] Aharon Ben-Tal and Arkadi Nemirovski. On polyhedral approximations of the second-
order cone. Mathematics of Operations Research, 26(2):pp. 193–205, 2001.
[15] Hande Y. Benson and Umit Saglam. Mixed-Integer Second-Order Cone Programming: A
Survey, chapter 3, pages 13–36.
[16] HandeY. Benson and Umit Saglam. Smoothing and regularization for mixed-integer
second-order cone programming with applications in portfolio optimization. In Luis F. Zu-
luaga and Tamas Terlaky, editors, Modeling and Optimization: Theory and Applications,
volume 62 of Springer Proceedings in Mathematics & Statistics, pages 87–111. Springer
New York, 2013.
[17] P. Beraldi, A. Violi, and F. De Simone. A decision support system for strategic asset
allocation. Decis. Support Syst., 51(3):549–561, June 2011.
[18] D.P. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. Athena
scientific series in optimization and neural computation. Athena Scientific, 1996.
[19] Dimitris Bertsimas, David B. Brown, and Constantine Caramanis. Theory and applica-
tions of robust optimization. SIAM Review, 53(3):464–501, 2011.
[20] Dimitris Bertsimas, Christopher Darnell, and Robert Soucy. Portfolio construction
through mixed-integer programming at grantham, mayo, van otterloo and company. In-
terfaces, 29(1):49–66, 1999.
[21] Dimitris Bertsimas and Dessislava Pachamanova. Robust multiperiod portfolio manage-
ment in the presence of transaction costs. Comput. Oper. Res., 35(1):3–17, January 2008.
[22] Dimitris Bertsimas and Romy Shioda. Algorithm for cardinality-constrained quadratic
optimization. Computational Optimization and Applications, 43(1):1–22, May 2009.
[23] Daniel Bienstock. Computational study of a family of mixed-integer quadratic program-
ming problems. In Egon Balas and Jens Clausen, editors, Integer Programming and
Combinatorial Optimization, volume 920 of Lecture Notes in Computer Science, pages
80–94. Springer Berlin Heidelberg, 1995.
[24] John R Birge and Francois Louveaux. Introduction to stochastic programming. Springer
Science & Business Media, 2011.
[25] F Black and R. Litterman. Asset allocation: Combining investor views with market
equilibrium. The Journal of Fixed Income, 1(2):7–18, 9 1991.
[26] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University
Press, New York, NY, USA, 2004.
[27] C. Burmeister, H. Mausser, and O. Romanko. Using trading costs to construct better
replicating portfolios. Enterprise Risk Management Symposium Monograph, Society of
Actuaries, Schaumburg, IL, 2010.
[28] Edwin Burmeister, Richard Roll, and Stephen A. Ross. Using macroeconomic factors to
control portfolio risk. Technical report, 2003.
[29] N.A. Canakgoz and J.E. Beasley. Mixed-integer programming approaches for index track-
ing and enhanced indexation. European Journal of Operational Research, 196(1):384 –
399, 2009.
[30] M.T. Cezik and G. Iyengar. Cuts for mixed 0-1 conic programming. Mathematical Pro-
gramming, 104(1):179–202, 2005.
[31] TJ Chang, N Meade, JE Beasley, and YM Sharaiha. Heuristics for cardinality constrained
portfolio optimisation. Computers & Operations Research, 27:1271–1302, 2000.
[32] Luis Chavez-Bedoya and John Birge. Index tracking and enhanced indexation using
a parametric approach. Journal of Economics, Finance and Administrative Science,
19(36):19–44, 2014.
[33] C. Chen, X. Li, C. Tolman, S. Wang, and Y. Ye. Sparse portfolio selection via quasi-norm
regularization. Technical report, Department of Management Science and Engineering,
Stanford University, USA, December 2013.
[34] Chen Chen and Roy H. Kwon. Robust portfolio selection for index tracking. Computers
& Operations Research, 39(4):829 – 837, 2012.
[35] Fernando Chiyoshi and Roberto D. Galvao. A statistical analysis of simulated annealing
applied to the p-median problem. Annals of Operations Research, 96(1-4):61–74, 2000.
[36] Vijay K. Chopra and William T. Ziemba. The effect of errors in means, variances, and
covariances on optimal portfolio choice. The Journal of Portfolio Management, 19:6–11,
1993.
[37] Thomas F Coleman, Yuying Li, and Jay Henniger. Minimizing tracking error while
restricting the number of assets. Journal of Risk, 8(4):33 – 55, 2006.
[38] Alberto Colorni, Marco Dorigo, Vittorio Maniezzo, and Marco Trubian. Ant system for
job-shop scheduling. Belgian Journal of Operations Research, Statistics and Computer
Science, 34(1):39–53, 1994.
[39] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson. Intro-
duction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001.
[40] G. Cornuejols and R. Tutuncu. Optimization Methods in Finance. Mathematics, Finance
and Risk. Cambridge University Press, 2006. pp. 217 - 221.
[41] Gerard Cornuejols, Marshall L. Fisher, and George L. Nemhauser. Location of bank ac-
counts to optimize float: An analytic study of exact and approximate algorithms. Man-
agement Science, 23(8):pp. 789–810, 1977.
[42] IBM ILOG CPLEX. User’s Manual for CPLEX, 2014.
[43] Teodor Gabriel Crainic, Xiaorui Fu, Michel Gendreau, Walter Rei, and Stein W. Wal-
lace. Progressive hedging-based metaheuristics for stochastic network design. Networks,
58(2):114–124, 2011.
[44] TeodorG. Crainic, Michel Gendreau, Patrick Soriano, and Michel Toulouse. A tabu search
procedure for multicommodity location/allocation with balancing requirements. Annals
of Operations Research, 41(4):359–383, 1993.
[45] Y. Crama and M. Schyns. Simulated annealing for complex portfolio selection problems.
European Journal of Operational Research, 150(3):546 – 571, 2003. Financial Modelling.
[46] X.T. Cui, X.J. Zheng, S.S. Zhu, and X.L. Sun. Convex relaxations and miqcqp reformula-
tions for a class of cardinality-constrained portfolio selection problems. Journal of Global
Optimization, 56(4):1409 – 1423, 2013.
[47] Alexandre d'Aspremont, Laurent El Ghaoui, Michael I. Jordan, and Gert R. G. Lanckriet. A direct formulation for sparse pca using semidefinite programming. SIAM Review,
49(3):434–448, 2007.
[48] Rita L. D’ecclesia and Stavros A. Zenios. Risk factor analysis and portfolio immunization
in the italian bond market. The Journal of Fixed Income, 4(2):51–58, 1994.
[49] Sarah Drewes and Sebastian Pokutta. Symmetry-exploiting cuts for a class of mixed-0/1
second-order cone programs. Discrete Optimization, 13:23 – 35, 2014.
[50] Sarah Drewes and Stefan Ulbrich. Subgradient based outer approximation for mixed
integer second order cone programming. In Jon Lee and Sven Leyffer, editors, Mixed
Integer Nonlinear Programming, volume 154 of The IMA Volumes in Mathematics and
its Applications, pages 41–59. Springer New York, 2012.
[51] E. Erdogan, D. Goldfarb, and G. Iyengar. Robust portfolio management. Tech. Report
CORC TR-2004-11, IEOR, Columbia University, New York, 2004.
[52] E. Erdogan and G. Iyengar. An active set method for single-cone second-order cone
programs. SIAM Journal on Optimization, 17(2):459–484, 2006.
[53] Eugene F. Fama and James D. MacBeth. Long-term growth in a short-term market. The Journal of Finance, 29(3):857–885, 1974.
[54] Eugene Fama and Kenneth French. Common risk factors in the returns on stocks and
bonds. Journal of Financial Economics, 33(1):3–56, 1993.
[55] Eugene F. Fama, Lawrence Fisher, Michael C. Jensen, and Richard Roll. The adjustment
of stock prices to new information. International Economic Review, 10(1):1–21, 1969.
[56] Marshall L. Fisher. The lagrangian relaxation method for solving integer programming
problems. Management Science, 50(12):pp. 1861–1871, 2004.
[57] R. A. Fisher. Frequency distribution of the values of the correlation coefficient in samples
from an indefinitely large population. Biometrika, 10(4):507–521, 1915.
[58] Dinakar Gade, Gabriel Hackebeil, Sarah M. Ryan, Jean-Paul Watson, Roger J-B Wets,
and David L. Woodruff. Obtaining lower bounds from the progressive hedging algorithm
for stochastic mixed-integer programs. Mathematical Programming manuscript, 2013.
[59] Alexei A. Gaivoronski, Sergiy Krylov, and Nico van der Wijst. Optimal portfolio selection
and dynamic benchmark tracking. European Journal of Operational Research, 163(1):115
– 131, 2005.
[60] Laura Galli and AdamN. Letchford. A compact variant of the qcr method for quadratically
constrained quadratic 0-1 programs. Optimization Letters, 8(4):1213–1224, 2014.
[61] A.M. Geoffrion. Lagrangian relaxation for integer programming, chapter 9, pages 243–
281. Springer Berlin Heidelberg, 2010. in: M. Junger, T.M. Liebling, D. Naddef, G.L.
Nemhauser, W.R. Pulleyblank, G. Reinelt, G. Rinaldi, L.A. Wolsey (Eds.) 50 Years of
Integer Programming 1958-2008, Springer Berlin Heidelberg, 2010, pp. 243-281.
[62] Arthur M. Geoffrion and Richard McBride. Lagrangean relaxation applied to capacitated
facility location problems. AIIE Trans, 10(1):40–47, 1978.
[63] Fred Glover. Future paths for integer programming and links to artificial intelligence.
Computers & Operations Research, 13(5):533 – 549, 1986. Applications of Integer Pro-
gramming.
[64] Nalan Gulpinar, Kabir Katata, and Dessislava A Pachamanova. Robust portfolio allocation
under discrete asset choice constraints. Journal of Asset Management, 12:67 – 83, 2011.
[65] Noam Goldberg and Sven Leyffer. An active-set method for second-order conic-
constrained quadratic programming. SIAM Journal on Optimization, 25(3):1455–1477,
2015.
[66] D. Goldfarb and G. Iyengar. Robust portfolio selection problems. Mathematics of Oper-
ations Research, 28(1):1–38, 2003.
[67] Donald Goldfarb. The simplex method for conic programming. Technical report, CORC,
Industrial Engineering and Operations Research, Columbia University, 2002.
[68] Michael Grant and Stephen Boyd. CVX: Matlab software for disciplined convex program-
ming, version 2.0 beta. http://cvxr.com/cvx, 2013.
[69] Martin J. Gruber. Another puzzle: The growth in actively managed mutual funds. The
Journal of Finance, 51(3):783–810, 1996.
[70] Nalan Gulpinar, Berc Rustem, and Reuben Settergren. Simulation and optimization
approaches to scenario tree generation. Journal of Economic Dynamics and Control,
28(7):1291–1315, 2004.
[71] Inc. Gurobi Optimization. Gurobi optimizer reference manual, 2015.
[72] Nils H. Hakansson. Multi-period mean-variance analysis: Toward a general theory of
portfolio choice. The Journal of Finance, 26(4):857–884, 1971.
[73] Lars Peter Hansen and Kenneth J. Singleton. Generalized instrumental variables estima-
tion of nonlinear rational expectations models. Econometrica, 50(5):1269–1286, 1982.
[74] P. Hansen and N. Mladenovic. Variable neighborhood search for the p-median. Location
Science, 5(4):207 – 226, 1997.
[75] Pierre Hansen and Nenad Mladenovic. Variable neighborhood search: Principles and
applications. European Journal of Operational Research, 130(3):449 – 467, 2001.
[76] Holger Heitsch and Werner Romisch. Scenario reduction algorithms in stochastic pro-
gramming. Computational Optimization and Applications, 24(2-3):187–206, 2003.
[77] Thorkell Helgason and Stein W. Wallace. Approximate scenario solutions in the progres-
sive hedging algorithm - a numerical study with an application to fisheries management.
Annals of Operations Research, 31:425–444, December 1991.
[78] Christoph Helmberg, Franz Rendl, Robert J. Vanderbei, and Henry Wolkowicz. An
interior-point method for semidefinite programming. SIAM Journal on Optimization,
6(2):342–361, 1996.
[79] John H. Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis
with Applications to Biology, Control and Artificial Intelligence. MIT Press, Cambridge,
MA, USA, 1992.
[80] Kaj Holmberg and Di Yuan. A lagrangian heuristic based branch-and-bound approach
for the capacitated network design problem. Oper. Res., 48(3):461–481, May 2000.
[81] C.M. Hosage and M.F. Goodchild. Discrete space location-allocation solutions from ge-
netic algorithms. Annals of Operations Research, 6(2):35–46, 1986.
[82] Kjetil Hoyland and Stein W. Wallace. Generating scenario trees for multistage decision
problems. Management Science, 47(2):295–307, 2001.
[83] Roel Jansen and Ronald van Dijk. Optimal benchmark tracking with small portfolios.
Journal of Portfolio Management, 28(2):33 – 39, 2002.
[84] N. J. Jobst, M. D. Horniman, C. A. Lucas, and G. Mitra. Computational aspects of
alternative portfolio selection models in the presence of discrete asset choice constraints.
Quantitative Finance, 1(5):489–501, 2001.
[85] Philippe Jorion. Enhanced index funds and tracking error optimization. March 2002.
[86] Philippe Jorion. Portfolio optimization with tracking error constraints. Financial Analysts
Journal, 59(5):70–82, 2003.
[87] Denis Karlow and Peter Rossbach. A method for robust index tracking. In Bo Hu, Karl
Morasch, Stefan Pickl, and Markus Siegle, editors, Operations Research Proceedings 2010,
Operations Research Proceedings, pages 9–14. Springer Berlin Heidelberg, 2011.
[88] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing.
Science, 220(4598):671–680, 1983.
[89] Pieter Klaassen. Comment on "generating scenario trees for multistage decision problems". Management Science, 48(11):1512–1516, 2002.
[90] John G. Klincewicz and Hanan Luss. Lagrangian relaxation heuristic for capacitated fa-
cility location with single-source constraints. Journal of the Operational Research Society,
37(5):495–500, 1986.
[91] Masakazu Kojima, Susumu Shindoh, and Shinji Hara. Interior-point methods for the
monotone semidefinite linear complementarity problem in symmetric matrices. SIAM
Journal on Optimization, 7(1):86–125, 1997.
[92] Fiona Kolbert and Laurence Wormald. Robust portfolio optimization using second-order
cone programming, 2010.
[93] Hiroshi Konno and Katsunari Kobayashi. An integrated stock-bond portfolio optimiza-
tion model. Journal of Economic Dynamics and Control, 21(8-9):1427 – 1444, 1997.
Computational financial modelling.
[94] Hiroshi Konno and Hiroaki Yamazaki. Mean-absolute deviation portfolio optimization
model and its applications to tokyo stock market. Manage. Sci., 37(5):519–531, May
1991.
[95] Roy Kouwenberg. Scenario generation and stochastic programming models for asset lia-
bility management. European Journal of Operational Research, 134(2):279–292, 2001.
[96] Alfred A. Kuehn and Michael J. Hamburger. A heuristic program for locating warehouses.
Management Science, 9(4):643–666, 1963.
[97] Yu-Ju Kuo and Hans D. Mittelmann. Interior point methods for second-order cone pro-
gramming and or applications. Comput. Optim. Appl., 28(3):255–285, September 2004.
[98] Roy H. Kwon and Dexiang Wu. Factor-based robust index tracking. Journal of Optimization and Engineering, April 2016. Available online.
[99] A. H. Land and A. G Doig. An automatic method of solving discrete programming
problems. Econometrica, 28(3):497–520, 1960.
[100] Miguel A. Lejeune and Gulay Samatlı-Pac. Construction of risk-averse enhanced index
funds. INFORMS J. on Computing, 25(4):701–719, October 2013.
[101] John Lintner. The valuation of risk assets and the selection of risky investments in stock
portfolios and capital budgets. The Review of Economics and Statistics, 47(1):13–37,
1965.
[102] Miguel Sousa Lobo, Lieven Vandenberghe, Stephen Boyd, and Hervé Lebret. Applications
of second-order cone programming. Linear Algebra and its Applications, 284(1-3):193 –
228, 1998. International Linear Algebra Society (ILAS) Symposium on Fast Algorithms
for Control, Signals and Image Processing.
[103] Arne Lokketangen and DavidL. Woodruff. Progressive hedging and tabu search applied to
mixed integer (0,1) multistage stochastic programming. Journal of Heuristics, 2(2):111–
128, 1996.
[104] D.G. Luenberger. Investment Science. Oxford University Press, Incorporated, 1998.
[105] Philip M. Lurie and Matthew S. Goldberg. An approximate method for sampling cor-
related random variables from partially-specified distributions. Management Science,
44(2):203–218, 1998.
[106] Burton G. Malkiel. Returns from investing in equity mutual funds 1971 to 1991. The
Journal of Finance, 50(2):pp. 549–572, 1995.
[107] Harry Markowitz. Portfolio selection. The Journal of Finance, 7(1):77–91, 1952.
[108] R. O. Michaud. The Markowitz Optimization Enigma: Is Optimized Optimal. Financial
Analysts Journal, 1989.
[109] Ryuhei Miyashiro and Yuichi Takano. Mixed integer second-order cone programming
formulations for variable selection. Technical report, Tokyo Institute of Technology, 2013.
[110] Renato D. C. Monteiro. Primal–dual path-following algorithms for semidefinite program-
ming. SIAM Journal on Optimization, 7(3):663–678, 1997.
[111] Renato D. C. Monteiro. Polynomial convergence of primal-dual algorithms for semidefinite
programming based on the monteiro and zhang family of directions. SIAM Journal on
Optimization, 8(3):797–812, 1998.
[112] Renato D.C. Monteiro and Takashi Tsuchiya. Polynomial convergence of primal-dual al-
gorithms for the second-order cone program based on the mz-family of directions. Math-
ematical Programming, 88(1):61–83, 2000.
[113] ApS MOSEK. The MOSEK optimization toolbox for MATLAB manual. Version 7.1
(Revision 28)., 2015.
[114] Jan Mossin. Equilibrium in a capital asset market. Econometrica, 34(4):768–783, 1966.
[115] John M. Mulvey and Hercules Vladimirou. Stochastic network programming for financial
planning problems. Management Science, 38(11):pp. 1642–1664, 1992.
[116] M.G. Narciso and L.A.N. Lorena. Lagrangean/surrogate relaxation for generalized as-
signment problems. European Journal of Operational Research, 114(1):167–177, 1999.
[117] Yu. E. Nesterov and M. J. Todd. Self-scaled barriers and interior-point methods for
convex programming. Mathematics of Operations Research, 22(1):pp. 1–42, 1997.
[118] Yu. E. Nesterov and M. J. Todd. Primal-dual interior-point methods for self-scaled cones.
SIAM Journal on Optimization, 8(2):324–364, 1998.
[119] Kyong Joo Oh, Tae Yoon Kim, and Sungky Min. Using genetic algorithm to support
portfolio optimization for index fund management. Expert Systems with Applications,
28(2):371 – 379, 2005.
[120] S.O. Orero and M.R. Irving. A genetic algorithm for generator scheduling in power
systems. International Journal of Electrical Power & Energy Systems, 18(1):19 – 26,
1996.
[121] PanosM. Pardalos and StephenA. Vavasis. Quadratic programming with one negative
eigenvalue is np-hard. Journal of Global Optimization, 1(1):15–22, 1991.
[122] G.Ch. Pflug. Scenario tree generation for multiperiod financial optimization by optimal
discretization. Mathematical Programming, 89(2):251–271, 2001.
[123] S. Poljak, F. Rendl, and H. Wolkowicz. A recipe for semidefinite relaxation for (0,1)-
quadratic programming. Journal of Global Optimization, 7(1):51–73, 1995.
[124] A.B. Poore and A.J. Robertson III. A new lagrangian relaxation based algorithm for a
class of multidimensional assignment problems. Computational Optimization and Appli-
cations, 8(2):129–150, 1997.
[125] R. T. Rockafellar and Roger J.-B. Wets. Scenarios and policy aggregation in optimization
under uncertainty. Mathematics of Operations Research, 16(1):119–147, 1991.
[126] R. Tyrrell Rockafellar and Stanislav Uryasev. Optimization of conditional value-at-risk.
Journal of Risk, 2:21–41, 2000.
[127] Erik Rolland, David A. Schilling, and John R. Current. An efficient tabu search procedure
for the p-median problem. European Journal of Operational Research, 96(2):329 – 342,
1997.
[128] V. Roshanaei, B. Naderi, F. Jolai, and M. Khalili. A variable neighborhood search for job
shop scheduling with set-up times to minimize makespan. Future Generation Computer
Systems, 25(6):654 – 661, 2009.
[129] Ruben Ruiz-Torrubiano and Alberto Suarez. A hybrid optimization approach to index
tracking. Annals OR, 166(1):57–71, 2009.
[130] Andrzej Ruszczynski. Decomposition methods. In A. Ruszczynski and A. Shapiro, editors,
Stochastic Programming, volume 10 of Handbooks in Operations Research and Manage-
ment Science, pages 141 – 211. Elsevier, 2003.
[131] Seyed Jafar Sadjadi, Mohsen Gharakhani, and Ehram Safari. Robust optimization frame-
work for cardinality constrained portfolio problem. Appl. Soft Comput., 12(1):91–99,
January 2012.
[132] Sanford J. Grossman and Robert J. Shiller. The determinants of the variability of stock market prices. The American Economic Review, 71(2):222–227, 1981.
[133] S. H. Schmieta and F. Alizadeh. Associative and jordan algebras, and polynomial time
interior-point algorithms for symmetric cones. Mathematics of Operations Research,
26(3):pp. 543–564, 2001.
[134] W.F. Sharpe. The sharpe ratio. Journal of Portfolio Management, 21:49 – 58, 1994.
[135] William F. Sharpe. Capital asset prices: A theory of market equilibrium under conditions
of risk. The Journal of Finance, 19(3):pp. 425–442, 1964.
[136] D.X. Shaw, S. Liu, and L. Kopman. Lagrangian relaxation procedure for cardinality-
constrained portfolio optimization. Optimization Methods and Software, 23(3):411–420,
2008.
[137] Hanif D. Sherali and Warren P. Adams. A hierarchy of relaxations between the continuous
and convex hull representations for zero-one programming problems. SIAM Journal on
Discrete Mathematics, 3(3):411–430, 1990.
[138] Ralph E. Steuer, Yue Qi, and Markus Hirschberger. Comparative issues in large-scale
mean-variance efficient frontier computation. Decision Support Systems, 51(2):250 – 255,
2011.
[139] Stephen Stoyan and Roy Kwon. A two-stage stochastic mixed-integer programming ap-
proach to the index tracking problem. Optimization and Engineering, 11:247–275, 2010.
[140] Jos F. Sturm. Using sedumi 1.02, a matlab toolbox for optimization over symmetric
cones. Optimization Methods and Software, 11(1-4):625–653, 1999.
[141] C. Tadonki and J.-Ph. Vial. Portfolio selection with cardinality constraints. Technical
report, Switzerland, 2003.
[142] S. Takriti, J.R. Birge, and E. Long. A stochastic model for the unit commitment problem.
Power Systems, IEEE Transactions on, 11(3):1497–1508, Aug 1996.
[143] Takashi Tsuchiya. A convergence analysis of the scaling-invariant primal-dual path-
following algorithms for second-order cone programming. Optim. Methods Softw, 11:141–
182, 1998.
[144] R.H. Tutuncu and M. Koenig. Robust asset allocation. Annals of Operations Research,
132(1-4):157–187, 2004.
[145] FernandoBadilla Veliz, Jean-Paul Watson, Andres Weintraub, RogerJ.-B. Wets, and
DavidL. Woodruff. Stochastic optimization models in forest planning: a progressive hedg-
ing solution approach. Annals of Operations Research, pages 1–16, 2011.
[146] Juan Pablo Vielma, Shabbir Ahmed, and George L. Nemhauser. A lifted linear pro-
gramming branch-and-bound algorithm for mixed-integer conic quadratic programs. IN-
FORMS J. on Computing, 20(3):438–450, July 2008.
[147] Jean-Paul Watson and DavidL. Woodruff. Progressive hedging innovations for a class
of stochastic mixed-integer resource allocation problems. Computational Management
Science, 8(4):355–370, 2011.
[148] Philip Wolfe. The simplex method for quadratic programming. Econometrica, 27(3):pp.
382–398, 1959.
[149] P. Xidonas, D. Askounis, J. Psarras, and Mavrotas G. Portfolio engineering using the
ipssis multiobjective optimisation decision support system. International journal of deci-
sion sciences, risk and management, 1(1/2):36–53, 2009.
[150] Wotao Yin. Gurobi mex: A matlab interface for gurobi, 2009 - 2011.
[151] Stavros A. Zenios. Practical Financial Optimization: Decision Making for Financial
Engineers. Wiley, 2007. pp. 177 - 189.
[152] W.T. Ziemba. The stochastic programming approach to asset, liability, and wealth man-
agement. Research Foundation of AIMR, Scorpion Publications, 2003.
List of Publications
• Part of Chapter 6 of this thesis is published in the Journal of Optimization and Engineering: Roy H. Kwon and Dexiang Wu. Factor-based robust index tracking. Journal of Optimization and Engineering, April 2016. Available online.

• Chapter 4 of this thesis has been submitted to the European Journal of Operational Research.
Appendix A
Appendix of Chapter 4
A.1 Numerical example for Heuristic I

To quickly generate an initial feasible solution, a numerical example based on the S&P 500 is used to illustrate Heuristic I as follows:

Set q = 10, \alpha = 0.001, \gamma = 0.5.

Sector size vector m = (82, 40, 37, 81, 50, 61, 71, 30, 13, 35)^T.
(0) After sorting the marker value and choosing the first q assets, we obtain:
qk =(
0 2 1 3 2 1 1 0 0 0)T
and associated∑|K|i=1 pk −
γα = −48.2539 ≤ 0, and L (qk) = 151.1024.
(1) I1 ={
4 2 5}
, qI1 ={
3 2 2}
I2 ={
7 6 3}
, qI2 ={
1 1 1}
, qI2 equal each other, sort I2 by mI2 ={
71 61 37}
.
I3 ={
1 10 8 9}
, mI2 ={
82 35 30 13}
P = ∅, N = 100, by 1O - 4O:
1O Pick 2 assets from sector 4 (A = 1) in I1, add to sector 7 (B = 1) in I2, then new pt1:
qfesik =(
0 2 1 1 2 1 3 0 0 0)T , L
(qfesik
)= 146.3845.
2O Pick 2 assets from sector 4 (A = 1) in I1, add to sector 1 (B = 1) in I3, then new pt2:
qfesik =(
2 2 1 1 2 1 1 0 0 0)T , L
(qfesik
)= 171.7848.
3O Pick 2 assets from sector 4 (A = 1) in I1, add 1 asset to sector 7 (B = 1) in I2
and 1 asset to sector 1 (C = 1) in I3, then new pt3:
qfesik =(
1 2 1 1 2 1 2 0 0 0)T , L
(qfesik
)= 170.8433.
4O Pick 1 asset from sector 4 (A = 1) in I1 and 1 asset from sector 7 (B = 1) in I2,
add them to sector 1 (C = 1) in I3, then new pt4:
qfesik =(
2 2 1 2 2 1 0 0 0 0)T , L
(qfesik
)= 150.2912.
(2) Solve (L) without constraint∑|K|k=1 pk ≤
γα under qfesik vectors in P ;
159
Appendix A. Appendix of Chapter 4 160
(3) Test transaction cost constraint (TC); Obj = ∅L(qfesik
)by 2O and 3O are better than L (qk), and both solutions
satisfy the TC. STOP.
Then we can generate the initial qfesik =(
2 2 1 1 2 1 1 0 0 0)T , i.e. we add 1 assets
to sector 1 from sector 4 and add 1 asset to sector 1 from sector 7. Then the associated∑|K|i=1 pk −
γα =
−47.3689. The objective value under qfesik is 171.7484, which is better than the value from Step (0)
and the constraint∑|K|i=1 pk ≤
γα still be satisfied. The initial feasible objective is lower than that value
(200.0432) by LR method at q=10 on the first sub-figure of Figure (4.4).
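As a minimal sketch (not the thesis implementation), the candidate-generating moves of Heuristic I reduce to shifting a fixed number of assets from over-allocated (donor) sectors to under-allocated (receiver) sectors; the Lagrangian evaluation L(·) and the TC test are omitted here:

```python
def reallocation_candidates(q, donors, receivers, delta=2):
    """Generate neighbor cardinality vectors by moving `delta` assets
    from each donor sector to each receiver sector, preserving the
    total cardinality q (the sum of the vector)."""
    pts = []
    for d in donors:
        for r in receivers:
            if q[d] >= delta:          # donor must have enough assets
                new_q = list(q)
                new_q[d] -= delta
                new_q[r] += delta
                pts.append(tuple(new_q))
    return pts

q0 = (0, 2, 1, 3, 2, 1, 1, 0, 0, 0)    # Step (0) allocation, q = 10
donors = [3]                           # sector 4 (0-indexed)
receivers = [6, 0]                     # sectors 7 and 1 (0-indexed)
for pt in reallocation_candidates(q0, donors, receivers):
    print(pt, sum(pt))
```

Each printed candidate keeps the total at q = 10; in the full heuristic, each candidate would then be scored by L(·) and screened against the TC constraint.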
A.2 Numerical example for Heuristic II

Sector size vector m = (82 40 37 81 50 61 71 30 13 35)^T. Set q = 90, α = 0.001, γ = 0.5. From Heuristic I we get an initial feasible q_k^feasi = (82 0 0 8 0 0 0 0 0 0)^T spanning sectors 1 and 4, with feasible objective 141.8202. A Lagrangian vector is then obtained by solving (L) under q_k^LR = (4 5 3 6 4 9 3 6 3 2)^T, but the associated Σ_{k=1}^{|K|} q_k^LR = 45 < q = 90, so we apply Heuristic II to adjust q_k^LR into a feasible solution.

(1) Since Σ_{k=1}^{|K|} q_k^LR < q, we adjust q_k^LR as follows:
    Pick the sector k in q_k^LR with the minimal value, i.e. k = 10, q_10^LR = 2.
    If q_k^LR ≤ m(k), i.e. q_10^LR = 2 < m(10) = 35, set q_k^feasi = q_k^LR, i.e. q_10^feasi = 2;
    else (q_k^LR > m(k)), set q_k^feasi = m(k).
    Repeat these steps; after checking all sectors, Σ_{k=1}^{|K|} q_k^feasi = 45 < q = 90, so we put the difference q − Σ_{k=1}^{|K|} q_k^feasi = 45 into the sector with the maximal number of assets, i.e. sector 1. Step (1) yields:
        m         = (82 40 37 81 50 61 71 30 13 35)
        q_k^LR    = ( 4  5  3  6  4  9  3  6  3  2)
        q_k^feasi = (49  5  3  6  4  9  3  6  3  2)
    Go to Step (2).

(2) Solve (L) without Σ_{k=1}^{|K|} p_k ≤ γ/α; since Σ_{k=1}^{|K|} p_k − γ/α = 44 > 0, go to Step (3).

(3) Set |Δ| = 2. For k = 1:
    I1 = {w_{j1}^0 | j ∈ {q_1^feasi}}; sort I1, i.e. the market weights of the 49 selected assets, in ascending order.
    I2 = {w_{j1}^0 | j ∈ {m(1)} \ {q_1^feasi}}; sort I2, i.e. the market weights of the 33 unselected assets, in descending order.
    Swap the first asset of I1 with the first asset of I2 to obtain a new neighbor point; swap the first two assets of I1 and I2 to obtain another.
    For k = 2, ..., 10, apply the same swap steps.
    In total, Step (3) generates 20 new q_k^feasi points. Testing all of them for TC, none satisfies Σ_{k=1}^{|K|} p_k ≤ γ/α, so go to Step (4).

(4) Pick sector 1 (k1) and sector 10 (k2):
    I1 = {w_{j1}^0 | j ∈ {m(1)}}; sort I1 (market weights of 82 assets) in ascending order.
    I2 = {w_{j,10}^0 | j ∈ {m(10)}}; sort I2 (market weights of 35 assets) in descending order.
    Pick the first 2 assets of I1 and I2, set {y_{1:2,1} = y_{1:2,10} | j ∈ Δ, Δ ∈ I1} and {y_{1:2,10} = y_{1:2,1} | j ∈ Δ, Δ ∈ I2}, obtaining a new q_k vector.
    Pick sector 1 (k1) and sector 3 (k2):
    I1 = {w_{j1}^0 | j ∈ {m(1)}}; sort I1 (82 assets) in ascending order.
    I2 = {w_{j3}^0 | j ∈ {m(3)}}; sort I2 (37 assets) in descending order.
    Pick the first 2 assets of I1 and I2, set {y_{1:2,1} = y_{1:2,3} | j ∈ Δ, Δ ∈ I1} and {y_{1:2,3} = y_{1:2,1} | j ∈ Δ, Δ ∈ I2}, obtaining a new q_k vector.
    Repeating these steps over all sector pairs yields 120 new q_k vectors. Some of them no longer satisfy q − Σ_{k=1}^{|K|} q_k^feasi = 0, so those are sent back to Step (1) for adjustment.

Testing all points, we obtain a better solution q_k^feasi = (45 6 2 7 11 4 7 4 2 2)^T that satisfies TC: Σ_{k=1}^{|K|} p_k − γ/α = −48.4363 ≤ 0. The resulting feasible objective is 287.2393, which is higher than the initial objective 141.8202 from Heuristic I.
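Step (1) above reduces to a simple cap-and-redistribute rule. A minimal sketch (not the thesis code; in general the receiving sector could also overflow its size m(k), a case the example does not exercise):

```python
def adjust_to_cardinality(q_lr, m, q):
    """Step (1) of Heuristic II: cap each sector count at its sector
    size m(k), then put the remaining shortfall q - sum(q_feasi) into
    the sector with the most available assets."""
    q_feasi = [min(qk, mk) for qk, mk in zip(q_lr, m)]
    shortfall = q - sum(q_feasi)
    if shortfall > 0:
        k_max = max(range(len(m)), key=lambda k: m[k])  # largest sector
        q_feasi[k_max] += shortfall
    return q_feasi

m    = [82, 40, 37, 81, 50, 61, 71, 30, 13, 35]
q_lr = [4, 5, 3, 6, 4, 9, 3, 6, 3, 2]
print(adjust_to_cardinality(q_lr, m, 90))
# -> [49, 5, 3, 6, 4, 9, 3, 6, 3, 2]
```

This reproduces the vector obtained in Step (1) of the example: the shortfall of 45 assets is absorbed by sector 1.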
A.3 Tickers in S&P500
Table A.1: Ticker symbol across Sectors (SP500)
Sector (total number) | Ticker Symbols
1: Consumer
Discretionary
(82)
ANF, AMZN, APOL, AN, AZO, BEAM, BBBY, BBY, BIG, HRB, BWA, CVC,
KMX, CCL, CBS, COH, CMCSA, DHI, DRI, DV, DTV, DISCA, DLTR,
EXPE, FDO, F, GME, GCI, GPS, GPC, GT, HOG, HAR, HAS, HD, IGT,
IPG, JCI, KSS, LEG, LEN, LTD, LOW, M, MAR, MAT, MCD, MHP, NWL,
NWSA, NKE, JWN, CMG, ORLY, OMC, JCP, RL, PHM, ROST, SNI, SHLD,
SHW, SNA, SWK, SPLS, SBUX, HOT, TGT, TIF, TWX, TWC, TJX, TRIP,
URBN, VFC, VIAB, DIS, WPO, WHR, WYN, WYNN, YUM
2: Consumer
Staples
(41)
MO, ADM, AVP, BFB, CPB, CLX, KO, CCE, CL, CAG, STZ, COST, CVS,
DF, DPS, EL, GIS, HNZ, HRL, K, KMB, KFT, KR, LO, MKC, MJN,
TAP, PEP, PM, PG, RAI, SWY, SLE, SJM, SVU, SYY, HSY, TSN, WMT,
WAG, WFM
3: Energy
(41)
APC, APA, BHI, COG, CAM, CHK, CVX, COP, CNX, DNR, DVN, DO, EP,
EOG, XOM, FTI, HAL, HP, HES, MRO, MPC, ANR, MUR, NBR, NOV,
NFX, NE, NBL, OXY, BTU, PXD, RRC, RDC, SLB, SWN, SE, SUN, TSO,
VLO, WMB, WPX
4: Financials
(81)
ACE, AFL, ALL, AXP, AIG, AMP, AON, AIV, AIZ, AVB, BAC, BK,
BBT, BRK.B, BLK, BXP, COF, CBG, SCHW, CB, CINF, C, CME, CMA,
DFS, ETFC, EFX, EQR, FII, FITB, FHN, BEN, GNW, GS, HIG, HCP,
HCN, HST, HCBK, HBAN, ICE, IVZ, JPM, KEY, KIM, LM, LUK, LNC,
L, MTB, MMC, MET, MCO, MS, NDAQ, NTRS, NYX, PBCT, PCL, PNC,
PFG, PGR, PLD, PRU, PSA, RF, SPG, SLM, STT, STI, TROW, TRV,
TMK, USB, UNM, VTR, VNO, WFC, WY, XL, ZION
5: Health Care
(51)
ABT, AET, AGN, ABC, AMGN, BCR, BAX, BDX, BIIB, BSX, BMY, CAH,
CFN, CELG, CERN, CI, CVH, COV, DVA, XRAY, EW, ESRX, FRX, GILD,
HSP, HUM, ISRG, JNJ, LH, LIFE, LLY, MCK, MHS, MDT, MRK, MYL,
PDCO, PKI, PRGO, PFE, DGX, STJ, SYK, THC, TMO, UNH, VAR, WAT,
WPI, WLP, ZMH
6: Industrials
(62)
MMM, APH, AVY, BA, CHRW, CAT, CTAS, GLW, CSX, CMI, DHR, DE,
RRD, DOV, DNB, ETN, EMR, EXPD, FAST, FDX, FSLR, FLS, FLR, GD,
GE, GR, GWW, HON, ITW, IRM, XYL, JEC, CBE, JOY, LLL, LMT, MAS,
NSC, NOC, PCAR, IR, PLL, PH, PBI, PCP, PCLN, PWR, RTN, RSG,
RHI, ROK, COL, ROP, R, LUV, SRCL, TXT, TYC, UNP, UPS, UTX, WM
7: Information
Technology (70)
ACN, ADBE, AMD, A, AKAM, ALTR, ADI, AAPL, AMAT, ADSK, ADP,
BMC, BRCM, CA, CSCO, CTXS, CTSH, CSC, DELL, EBAY, EA, EMC,
FFIV, FIS, FISV, FLIR, GOOG, HRS, HPQ, INTC, IBM, INTU, JBL,
JDSU, JNPR, KLAC, LXK, LLTC, LSI, MA, MCHP, MU, MSFT, MOLX,
MMI, MSI, NTAP, NFLX, NVLS, NVDA, ORCL, PAYX, QCOM, RHT, SAI,
CRM, SNDK, SYMC, TEL, TDC, TER, TXN, TSS, VRSN, V, WDC, WU,
XRX, XLNX, YHOO
8: Materials
(29)
APD, ARG, AA, ATI, BLL, BMS, CF, CLF, DOW, DD, EMN, ECL, FMC,
FCX, IFF, IP, MWV, MON, MOS, NEM, NUE, OI, PPG, PX, SEE, SIAL,
TIE, X, VMC
9: Telecommu-
nications
Services (8)
AMT, T, CTL, FTR, PCS, S, VZ, WIN
10: Utilities
(35)
AES, GAS, AEE, AEP, CNP, CMS, ED, CEG, D, DTE, DUK, EIX, ETR,
EQT, EXC, FE, TEG, NEE, NI, NU, NRG, OKE, POM, PCG, PNW, PPL,
PGN, PEG, QEP, SCG, SRE, SO, TE, WEC, XEL
A.4 Gap by LR and SLR

Table A.2: Gap between LB and UB, 2006-2007

q | LR: Best feasi. LB, UB, Gap(%), Time(h) | SLR: Best feasi. LB, UB, Gap(%), Time(h)
10 200.1411 200.1411 0.00 2.0793 199.9768 199.9879 0.01 1.5677
20 225.8560 228.8478 1.31 1.9232 225.9124 225.9124 0.00 1.3510
30 238.7528 238.8478 0.04 1.7302 238.6879 238.6879 0.00 1.3703
40 248.5355 248.5355 0.00 2.5041 248.5752 248.5752 0.00 1.5204
50 257.3933 257.4136 0.01 2.8302 257.4995 257.4995 0.00 1.5888
60 265.9282 265.9282 0.00 2.1591 265.9758 265.9758 0.00 1.5879
70 273.8690 273.8981 0.01 2.5669 273.8855 273.8855 0.00 1.6591
80 281.4972 288.3209 2.37 2.1303 281.4191 281.4191 0.00 1.7313
90 288.7579 288.7579 0.00 2.5654 288.7281 288.7281 0.00 1.6012
100 295.7827 295.7961 0.00 2.8282 295.8098 295.8614 0.02 1.5912
110 302.2690 302.4966 0.08 2.4198 302.7136 302.7140 0.00 1.5558
120 309.1766 309.1808 0.00 3.0207 309.4017 309.4496 0.02 1.8687
130 315.6966 315.7014 0.00 2.3521 315.9795 315.9795 0.00 1.4031
140 322.0628 322.0628 0.00 2.5294 322.0498 322.3851 0.10 1.5410
150 328.2567 328.2567 0.00 2.7146 328.4784 328.5663 0.03 1.6440
160 334.3379 334.3382 0.00 2.7587 334.0301 334.6575 0.19 1.7120
170 340.2321 340.2342 0.00 2.8714 340.0902 340.5351 0.13 1.8069
180 345.9373 345.9386 0.00 2.8517 346.1944 346.1944 0.00 1.6234
190 351.5182 351.5352 0.00 2.5189 343.7173 353.2373 2.70 1.8248
200 356.8954 356.8981 0.00 2.7137 356.8413 356.8577 0.00 1.7710
210 362.0499 362.0499 0.00 2.7986 362.0377 362.0391 0.00 1.7572
220 366.9886 366.9935 0.00 2.6691 364.3428 372.9470 2.31 1.5818
230 371.6848 371.7128 0.01 3.0692 369.8994 382.0756 3.19 1.5815
240 376.2733 376.2822 0.00 2.5227 373.4730 381.5997 2.13 1.5890
250 380.5282 380.6464 0.03 3.1267 379.0441 387.8023 2.26 1.5943
260 384.8675 384.8736 0.00 3.2349 376.0532 394.1475 4.59 1.9700
270 388.8385 388.8402 0.00 3.0260 388.9873 389.1964 0.05 1.8977
280 392.6102 392.6193 0.00 3.0105 386.8865 396.2528 2.36 2.0435
290 396.1669 396.1669 0.00 3.2281 396.3071 396.6762 0.09 2.1802
300 399.4385 399.4390 0.00 2.8431 399.4186 399.7567 0.08 2.1541
310 402.7803 402.7996 0.00 2.9404 402.8073 402.8073 0.00 2.0810
320 405.4231 405.4231 0.00 3.1733 405.4203 405.4376 0.00 2.0304
330 407.5927 407.5927 0.00 2.6027 407.5992 407.6025 0.00 1.6447
340 409.2347 409.2778 0.01 2.8883 409.2638 409.3073 0.01 1.7358
350 410.5592 410.6025 0.01 3.5055 410.5736 410.5829 0.00 2.2411
Aver. 355.6949 356.0136 0.12 2.8699 354.7903 357.1315 0.61 1.8304
A.5 Sector Allocation

Figure A.1: Portfolio allocation in sectors
Appendix B
Appendix of Chapter 5
B.1 The pseudocode for the LR sub-solver
Lagrangian Relaxation for the sub-problem

Step 0: (Initialization)
    v ← 0, LBD ← −∞, UBD ← ∞
    ω_i^{s−,v} ← 0, ω_i^{s+,v} ← 1, ∀i ∈ N, ∀s ∈ S
    θ_ij^{s−,v} ← 0, θ_ij^{s+,v} ← 1, ∀(i, j) ∈ A_1^s, ∀s ∈ S

Step 1: (Solve the primal problem)
    Solve sub_LR(x^v, g^v, y^v) under fixed (ω^v, θ^v);
    update LBD ← max(LBD, sub_LR(x^v, g^v, y^v)).
    If (x^v, g^v, y^v) is feasible for constraints (5.31)-(5.36),
        update UBD ← min(UBD, sub_LR(x^v, g^v, y^v)). STOP.
    Else,
        find a feasible solution (x_adj^v, y_adj^v) in model (5.40)-(5.42) under g_adj^v
        from model (5.37)-(5.39), and calculate UBD_adj^v for model (5.30)-(5.36);
        update UBD ← min(UBD, UBD_adj^v). GO TO Step 2.

Step 2: (Solve the dual problem)
    Maximize sub_LR(ω^v, θ^v) under the given (x^v, g^v, y^v) by the following criteria:
        if the vth coefficient of (ω^v, θ^v) is positive, increase the Lagrangian by increasing that component;
        if it is negative, increase the Lagrangian by decreasing that component.

Step 3: (Lagrangian multiplier update)
    Search for a step size t^v such that sub_LR(ω^{v+1}, θ^{v+1}) > sub_LR(ω^v, θ^v):
        ω_i^{v+1} ← max(0, ω_i^v + t_i^v d_i^v)
        θ_ij^{v+1} ← max(0, θ_ij^v + t_ij^v d_ij^v)

Step 4: (Stopping criterion)
    Calculate Gap^v = (UBD − LBD) / |UBD|. If Gap^v > ε, set v = v + 1 and GO TO Step 1.
How do we determine the step sizes t_i^v and t_ij^v? To illustrate, simplify the Lagrangian to max_ω min_x LR(x^v, ω^v) = c^T x^v + (ω^v)^T (B x^v − b). Note that d^v = B x^v − b is the gradient of the Lagrangian function at x^v. Suppose ω^{v+1} = ω^v + t^v d^v; then

LR(x^v, ω^{v+1}) = c^T x^v + (ω^{v+1})^T (B x^v − b)
                 = c^T x^v + (ω^v)^T (B x^v − b) + t^v (d^v)^T (B x^v − b)
                 = c^T x^v + (ω^v)^T (B x^v − b) + t^v (B x^v − b)^T (B x^v − b)
                 = LR(x^v, ω^v) + t^v ‖B x^v − b‖²

⟹ t^v = [LR(x^v, ω^{v+1}) − LR(x^v, ω^v)] / ‖B x^v − b‖² = (BestUB − CurrentLB) / ‖B x^v − b‖²

In practice we set t^v = α (BestUB − CurrentLB) / ‖B x^v − b‖² with α > 1; if LR(x^v, ω^{v+1}) ≤ LR(x^v, ω^v), we set α = 0.5α and search for the lower bound again until LR(x^v, ω^{v+1}) > LR(x^v, ω^v).
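The multiplier update just derived can be sketched in a few lines (illustrative only; the names `best_ub`, `current_lb` and the α-halving control are assumed to come from the outer LR loop):

```python
import numpy as np

def subgradient_step(omega, x, B, b, best_ub, current_lb, alpha):
    """One multiplier update with step size
    t^v = alpha * (BestUB - CurrentLB) / ||B x - b||^2,
    projected onto the nonnegative orthant."""
    d = B @ x - b                                  # subgradient at x^v
    t = alpha * (best_ub - current_lb) / (d @ d)   # step size t^v
    return np.maximum(0.0, omega + t * d)          # omega^{v+1} >= 0
```

If the Lagrangian fails to improve after the update, the outer loop halves `alpha` and repeats, as described above.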
Table B.1: LR method and Gurobi comparison, instance 2 (N=100, K=10, S=15)

Scenario s | Gurobi: Best LB, Feasi. UB, Gap, Time(s) | LR method: Best LB, Feasi. UB, Gap, Time(s) | Gap to Gurobi: (col 7 − col 3)/col 3
s=1 -94037.8 -26920.2 249.32% 2643 -112054.5 -26895.2 316.63% 423 0.09%
s=2 -94174.9 -25623.0 267.54% 1870 -111017.8 -25434.4 336.49% 416 0.74%
s=3 -93122.4 -24293.2 283.33% 2013 -109820.8 -24049.2 356.65% 556 1.00%
s=4 -91377.2 -22823.6 300.36% 1902 -109298.1 -22861.8 378.08% 505 -0.17%
s=5 -91084.8 -21764.9 318.49% 1855 -108722.3 -21575.6 403.91% 393 0.87%
s=6 -91952.3 -20714.0 343.91% 2552 -111220.5 -20551.9 441.17% 445 0.78%
s=7 -92664.5 -19243.6 381.53% 2133 -110551.9 -19276.3 473.51% 444 -0.17%
s=8 -91226.7 -18556.6 391.61% 1904 -110128.4 -18246.4 503.56% 420 1.67%
s=9 -90282.0 -18225.8 395.35% 2038 -110402.5 -18034.8 512.16% 493 1.05%
s=10 -87883.1 -18019.5 387.71% 1836 -111385.6 -17735.8 528.03% 445 1.57%
s=11 -91548.9 -18256.8 401.45% 1874 -115057.8 -17983.2 539.81% 534 1.50%
s=12 -91120.0 -17982.5 406.71% 1816 -115006.6 -17802.8 546.00% 552 1.00%
s=13 -88655.2 -17900.7 395.26% 1860 -114895.2 -17755.1 547.11% 549 0.81%
s=14 -89874.3 -17787.0 405.28% 1765 -114889.0 -17529.9 555.39% 872 1.45%
s=15 -89865.8 -17628.6 409.77% 1730 -114847.8 -17400.7 560.02% 570 1.29%
Average - - 355.84% 1986 - - 466.57% 508 0.90%
Table B.2: LR method and Gurobi comparison, instance 3 (N=100, K=10, S=3)

Scenario s | Gurobi: Best LB, Feasi. UB, Gap, Time(s) | LR method: Best LB, Feasi. UB, Gap, Time(s) | Gap to Gurobi: (col 7 − col 3)/col 3
s=1 -94443.7 -26705.9 253.64% 2207.07 -112376.8 -26610.6 322.30% 461.72 0.36%
s=2 -91835.2 -18599.6 393.75% 1906.28 -110115 -18378.5 499.15% 451.84 1.19%
s=3 -87422 -17643.6 395.49% 1655 -114860.7 -17459 557.89% 522.84 1.05%
Average - - 347.63% 1922.78 - - 459.78% 478.8 0.86%
Table B.3: LR method and Gurobi comparison, instance 4 (N=300, K=30, S=10)

Scenario s | Gurobi: Best LB, Feasi. UB, Gap, Time(s) | LR method: Best LB, Feasi. UB, Gap, Time(s) | Gap to Gurobi: (col 7 − col 3)/col 3
s=1 GUROBI ERROR: Out of memory -917249.7 -134995.9 579.46% 18117 -
From Tables B.1 and B.2, the solutions from the LR method are close to those from Gurobi, while the solving time of the LR method (around 500 seconds on average) is far less than Gurobi's (around 1900 seconds on average). In some cases the LR solution is even better than Gurobi's, e.g. s = 4 and s = 7 in Table B.1. As the number of nodes increases, e.g. from N = 100 to N = 300, Gurobi can no longer solve the sub-problem due to memory limits, while the LR method still returns a feasible solution (Table B.3).
B.2 The pseudocode for the Tabu search sub-solver

Tabu Search Heuristic for the sub-problem

Step 0: (Initialization)
    Generate the initial g by sorting F_i/R_i in ascending order and selecting the first K assets from {F_i/R_i | ∀i ∈ N}.
    Partition the index set N as N = K ∪ S, where K is the selected index set and S is the unselected index set.
    Solve the corresponding Z(x) and Z(y) to get the initial (x*, g, y*) and objective value Z*.
    Set the neighbor point size M, iteration number V, and Tabu list length L.
    Set v = 0. GO TO Step 1.

Step 1: (Move to the neighbourhood)
    Set m = 1; while m < M, apply the following cases:
    Case 1: Search the arc coefficients {C_ji | ∀j ∈ S, i ∈ K} and {C_ej | ∀e ∈ K, j ∈ S}.
        If any C_ji > C_ej, swap nodes j and e. Record all neighbor points (x′, g, y′) and Z′.
    Case 2: Search the returns {R_j | ∀j ∈ S} and {R_i | ∀i ∈ K}.
        If any R_j > R_i, swap nodes j and i. Record all neighbor points (x′, g, y′) and Z′.
    Case 3: Search the coefficients {F_j | ∀j ∈ S} and {F_i | ∀i ∈ K}.
        If any F_j < F_i, swap nodes j and i. Record all neighbor points (x′, g, y′) and Z′.
    m = m + 1. GO TO Step 2.

Step 2: (Select the best movement)
    Check whether the corresponding neighbor point is in the Tabu list.
    If not, update (x*, g, y*) ← (x′, g, y′) and Z* ← Z′; otherwise select the second-best movement and evaluate it again.

Step 3: (Tabu list update)
    Update the Tabu list if necessary. v = v + 1. If v < V, GO TO Step 1.
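A stripped-down sketch of the swap mechanism (hypothetical simplifications: only Case 2 is kept, and the objective is the plain sum of selected returns rather than the model objective Z):

```python
from itertools import product

def tabu_search(R, K, iters=20, tabu_len=5):
    """Single-criterion Tabu swap search: move assets with higher
    returns from the unselected set S into the selected set K,
    while a short tabu list forbids reversing recent swaps."""
    selected = set(range(K))            # initial selected index set K
    unselected = set(range(K, len(R)))  # unselected index set S
    tabu = []
    best = sum(R[i] for i in selected)
    for _ in range(iters):
        # Case 2: swaps where an unselected return beats a selected one
        moves = [(j, i) for j, i in product(unselected, selected)
                 if R[j] > R[i] and (j, i) not in tabu]
        if not moves:
            break
        j, i = max(moves, key=lambda m: R[m[0]] - R[m[1]])  # best move
        selected.remove(i); selected.add(j)
        unselected.remove(j); unselected.add(i)
        tabu = (tabu + [(i, j)])[-tabu_len:]   # forbid the reverse swap
        best = max(best, sum(R[k] for k in selected))
    return best, selected
```

For `R = [1, 5, 2, 9, 3]` and `K = 2`, the search swaps asset 3 in for asset 0 and settles on the two highest-return assets.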
B.3 Speeding up the solving process for the sub-problem

(1) For the LR method, we can reduce the iteration number as long as the solution quality does not change. For example, we decreased the iteration limit from 200 to 60 and then to 30, and found that the solutions stayed almost the same. The results are shown in Table B.4.
Table B.4: LR under different iteration numbers (N=100, K=10)

S | iter# = 200: Feasible UB, Sol time (s) | iter# = 60: Feasible UB, Sol time (s) | iter# = 30: Feasible UB, Sol time (s)
1 -748046.42 412.88 -748046.42 159.65 -748046.42 104.94
2 -747921.49 412.64 -747921.49 158.56 -747921.49 104.72
3 -747768.60 410.09 -747768.60 156.76 -747768.60 102.56
4 -747475.45 409.16 -747475.45 153.09 -747475.45 100.38
5 -747726.65 401.57 -747726.65 155.03 -747726.65 97.34
6 -747583.25 399.44 -747583.25 153.99 -747583.25 99.29
7 -747439.47 404.42 -747439.47 158.88 -747439.47 104.05
8 -747281.18 402.92 -747281.18 157.25 -747281.18 104.58
9 -747690.88 396.00 -747690.88 149.74 -747690.88 96.04
10 -747551.29 395.06 -747551.29 149.20 -747551.29 96.01
Aver. -747648.47 404.42 -747648.47 155.21 -747648.47 100.99
(2) For the Tabu search method, we adjust three parameters: the Tabu list length (L), the iteration number, and the number of neighbor points (M). Our tests showed that longer list lengths (L > 5) were inefficient in running time, while shorter ones (L < 5) degraded solution quality (see the first 6 columns in Table B.5). The number of neighbor points (M) also affects the search time; for example, the 3 cases in the Tabu procedure yield 550 possible neighbor points at each iteration. However, if we only choose the 3 largest {R_j | ∀j ∈ S} for each selected i ∈ K in Case 2, and the 3 smallest {F_j | ∀j ∈ S} for each selected i ∈ K in Case 3, the number of search points shrinks from 550 to 150 (columns 7-10 in Table B.5). Moreover, if we drop Case 1 and generate neighbor points by Cases 2 and 3 only, M can be reduced further from 150 to 50 (see columns 11 and 12). We can therefore control the neighbor point set size M without losing solution quality. The test results are listed in Table B.5.
Table B.5: Tabu search under different (L, iteration number, M); N=100, K=10. Each column pair gives Feasi. UB and Time(s).

S | (L=7, Iter#=10, M=550) | (L=5, Iter#=10, M=550) | (L=3, Iter#=5, M=550) | (L=5, Iter#=7, M=150) | (L=3, Iter#=5, M=150) | (L=5, Iter#=10, M=50)
1 -748066.43 3206.41 -748066.43 2429.13 -718970.49 1154.01 -748066.43 728.92 -718972.22 543.14 -748066.43 220.66
2 -747934.5 3029.39 -747934.5 2855.75 -718791.28 992.46 -747934.5 740.13 -718791.28 552.11 -747929.9 221.31
3 -747789.62 3170.54 -747792.28 2943.18 -716048.04 1060.07 -747789.77 671.32 -733194.66 491.05 -747784.95 176.68
4 -747485.58 4094.06 -747485.58 3717.78 -747485.58 1741.46 -747485.58 700.74 -747485.51 506.78 -747483.92 195.14
5 -747734.48 4236.21 -747734.48 2641.18 -747734.48 1914.01 -747734.48 689.6 -747734.48 490.4 -747734.48 180.91
6 -747601.24 3112.67 -747601.24 1944.95 -747601.24 1402.23 -747601.24 725.31 -747601.24 522.98 -747599.89 186.9
7 -747451.01 2400.52 -747451.01 1486.24 -747451.01 1062.69 -747451.01 644.28 -747451.01 465.04 -747451.01 166.51
8 -747297.61 2374.4 -747297.61 1449.19 -747297.61 1024.9 -747297.61 633.45 -747297.61 461.97 -747297.61 154.79
9 -747699.15 2286.1 -747699.15 1403.59 -747699.15 897.14 -747699.15 682.81 -747699.15 492.69 -747699.15 182.18
10 -747561.78 2143.73 -747561.78 1261.51 -747561.78 865.02 -747561.78 607.58 -747561.78 438.08 -747561.78 178.45
Aver. -747662.14 3005.4 -747662.41 2213.25 -738664.07 1211.4 -747662.15 682.41 -740378.89 496.43 -747660.91 186.35
From Table B.4, the average running time of the LR method can be reduced from 400 seconds to 100 seconds. From Table B.5, the best Tabu list length is L = 5, with an average running time of 180 seconds, close to the LR method with iteration limit 60. However, the objective value of the Tabu search is generally better than that of the LR method.

Next we test more randomly generated cases for the LR and Tabu methods and list the results in Tables B.6 - B.10. The second-to-last column indicates how close the two methods are: a negative value means the Tabu solution is superior to the LR solution, and a positive value means it is worse. The last column indicates how much faster the Tabu method is than the LR method.

In Table B.6, the Tabu method on average saves 10% of the running time and obtains the same solutions as the LR method; in Table B.7, it saves 47.97% of the running time and obtains 4.94% better solutions; in Table B.8, it saves 21.18% of the running time and obtains 4.47% better solutions; in Table B.9, it takes 1.98% more time and obtains 10.41% better solutions; in Table B.10, it takes 247.45% more time (about 2.5 times longer) and obtains 11.38% better solutions.
Table B.6: LR and Tabu comparison (N=100, K=10, S=15)

S | LR method (iter# = 60): Feasi. UB, Sol time (s) | Tabu method (L=5, Iter#=10, M=50): Feasi. UB, Sol time (s) | (UB_Tabu − UB_LR)/|UB_LR| | (T_Tabu − T_LR)/|T_LR|
1 -1079328.8 151.6 -1079333.9 156.4 0.00% 3.18%
2 -1079157.1 153.7 -1079157.1 157.8 0.00% 2.71%
3 -1078954.2 151.9 -1078962.0 165.6 0.00% 9.03%
4 -1078765.2 152.1 -1078769.3 157.4 0.00% 3.44%
5 -1078599.1 150.8 -1078600.2 126.8 0.00% -15.88%
6 -1078921.8 146.3 -1078922.5 132.8 0.00% -9.20%
7 -1078746.5 148.3 -1078752.1 122.7 0.00% -17.25%
8 -1078556.8 148.9 -1078567.7 130.3 0.00% -12.46%
9 -1078373.4 152.7 -1078377.9 126.3 0.00% -17.29%
10 -1078164.6 153.0 -1078168.0 125.1 0.00% -18.22%
11 -1078534.8 146.3 -1078540.2 125.5 0.00% -14.23%
12 -1078342.2 145.7 -1078342.2 128.4 0.00% -11.87%
13 -1078176.5 144.7 -1078186.7 120.5 0.00% -16.68%
14 -1077966.7 144.7 -1077972.3 126.5 0.00% -12.57%
15 -1077819.7 148.1 -1077823.0 113.6 0.00% -23.32%
Aver. -1078560.5 149.2 -1078565.0 134.4 0.00% -9.96%
Table B.7: LR and Tabu comparison (N=100, K=15, S=15)

S | LR method (iter# = 200): Feasi. UB, Sol time (s) | Tabu method (L=5, Iter#=10, M=50): Feasi. UB, Sol time (s) | (UB_Tabu − UB_LR)/|UB_LR| | (T_Tabu − T_LR)/|T_LR|
1 -1154229.4 432.7 -1207185.2 270.9 -4.59% -37.39%
2 -1153886.7 431.0 -1206809.1 239.5 -4.59% -44.42%
3 -1153559.3 429.2 -1206433.1 180.4 -4.58% -57.96%
4 -1153204.2 428.2 -1206022.1 266.1 -4.58% -37.86%
5 -1152848.1 425.6 -1205648.3 226.9 -4.58% -46.68%
6 -1153777.1 431.8 -1206536.5 229.1 -4.57% -46.93%
7 -1153433.0 431.0 -1195391.3 212.2 -3.64% -50.78%
8 -1153078.3 425.9 -1205767.3 227.8 -4.57% -46.51%
9 -1152747.7 425.1 -1218523.2 223.0 -5.71% -47.54%
10 -1152445.9 428.4 -1233935.7 219.6 -7.07% -48.73%
11 -1153371.1 417.9 -1179176.2 177.4 -2.24% -57.54%
12 -1152993.8 416.3 -1234457.1 216.8 -7.07% -47.92%
13 -1152655.1 417.6 -1194264.8 177.0 -3.61% -57.62%
14 -1152294.3 433.0 -1233744.8 201.5 -7.07% -53.47%
15 -1151902.9 432.8 -1217373.7 264.8 -5.68% -38.83%
Aver. -1153095.1 427.1 -1210084.6 222.2 -4.94% -47.97%
Table B.8: LR and Tabu comparison (N=100, K=20, S=15)

S | LR method (iter# = 200): Feasi. UB, Sol time (s) | Tabu method (L=5, Iter#=10, M=50): Feasi. UB, Sol time (s) | (UB_Tabu − UB_LR)/|UB_LR| | (T_Tabu − T_LR)/|T_LR|
1 -1166811.7 457.8 -1166811.7 311.5 0.00% -31.96%
2 -1166460.9 455.6 -1175693.8 353.3 -0.79% -22.45%
3 -1166161.5 456.5 -1191159.0 322.0 -2.14% -29.48%
4 -1165851.5 453.1 -1167658.8 369.2 -0.16% -18.52%
5 -1165522.9 454.8 -1165522.9 342.5 0.00% -24.69%
6 -1167489.0 462.4 -1167489.0 335.0 0.00% -27.56%
7 -1167162.0 459.1 -1176377.1 364.7 -0.79% -20.56%
8 -1166876.7 460.0 -1166876.7 359.8 0.00% -21.79%
9 -1166547.8 458.3 -1199903.3 351.5 -2.86% -23.30%
10 -1166235.2 456.1 -1257622.4 350.4 -7.84% -23.17%
11 -1168203.4 455.1 -1262505.5 383.2 -8.07% -15.81%
12 -1167897.8 458.6 -1245929.9 354.9 -6.68% -22.61%
13 -1167565.8 468.8 -1314003.1 432.4 -12.54% -7.76%
14 -1167197.0 468.6 -1313772.9 427.5 -12.56% -8.78%
15 -1166734.5 468.1 -1313459.0 375.1 -12.58% -19.87%
Aver. -1166847.9 459.5 -1218985.7 362.2 -4.47% -21.18%
Table B.9: LR and Tabu comparison (N=100, K=25, S=15)

S | LR method (iter# = 200): Feasi. UB, Sol time (s) | Tabu method (L=5, Iter#=10, M=50): Feasi. UB, Sol time (s) | (UB_Tabu − UB_LR)/|UB_LR| | (T_Tabu − T_LR)/|T_LR|
1 -1044940.5 530.3 -1114276.1 436.6 -6.64% -17.66%
2 -987616.6 532.2 -1113728.8 440.1 -12.77% -17.31%
3 -1083273.8 545.4 -1113107.7 464.6 -2.75% -14.81%
4 -1026215.3 552.5 -1112551.1 452.2 -8.41% -18.15%
5 -985830.0 562.4 -1111942.8 526.2 -12.79% -6.43%
6 -989603.4 527.7 -1114929.3 543.9 -12.66% 3.07%
7 -988337.4 526.3 -1114344.3 536.6 -12.75% 1.95%
8 -1007759.6 533.9 -1099134.2 607.6 -9.07% 13.80%
9 -987148.1 543.3 -1145029.8 619.3 -15.99% 13.99%
10 -986505.8 556.3 -1125691.2 616.9 -14.11% 10.89%
11 -1011520.9 531.0 -1115614.3 516.0 -10.29% -2.81%
12 -1011021.7 533.5 -1211471.3 536.0 -19.83% 0.49%
13 -1010573.4 532.6 -1196751.4 587.3 -18.42% 10.28%
14 -1164298.1 532.3 -1160209.1 695.6 0.35% 30.67%
15 -1160491.6 532.6 -1159798.3 652.8 0.06% 22.57%
Aver. -1029675.7 538.1 -1133905.3 548.8 -10.41% 1.98%
Table B.10: LR and Tabu comparison (N=100, K=30, S=15)

S | LR method (iter# = 200): Feasi. UB, Sol time (s) | Tabu method (L=5, Iter#=10, M=250): Feasi. UB, Sol time (s) | (UB_Tabu − UB_LR)/|UB_LR| | (T_Tabu − T_LR)/|T_LR|
1 -1106244.0 673.6 -1219870.7 3091.5 -10.27% 358.94%
2 -1124431.6 673.5 -1224305.5 3149.2 -8.88% 367.59%
3 -1172567.9 674.4 -1225166.8 3252.2 -4.49% 382.25%
4 -1090292.0 668.5 -1230126.3 3722.2 -12.83% 456.84%
5 -1122764.0 663.6 -1221349.4 3687.4 -8.78% 455.63%
6 -1058060.4 633.1 -1217562.6 3028.0 -15.07% 378.28%
7 -1101609.1 648.2 -1221996.8 3065.4 -10.93% 372.93%
8 -1057450.3 665.2 -1253857.4 3056.0 -18.57% 359.43%
9 -1131543.2 670.2 -1281896.8 2934.8 -13.29% 337.90%
10 -1185779.7 668.6 -1258063.3 3524.9 -6.10% 427.18%
11 -1136963.0 636.0 -1277950.2 2500.9 -12.40% 293.22%
12 -1186789.6 1177.3 -1307492.5 2455.8 -10.17% 108.59%
13 -1163930.7 1197.6 -1299401.6 2511.4 -11.64% 109.70%
14 -1167531.0 1206.9 -1316496.7 2485.7 -12.76% 105.96%
15 -1149820.1 2050.9 -1316631.6 2382.2 -14.51% 16.15%
Aver. -1130385.1 860.5 -1258144.6 2989.8 -11.38% 247.45%
Numerical results show that the cardinality parameter K affects the running time of the Tabu method. The search time keeps increasing until K exceeds N/2: since we swap assets between the sets K and S, in the worst case the number of swap combinations equals K·S = K(N − K) = −(K − N/2)² + N²/4. In the Tabu search method, the number of neighbor points grows linearly with K, which is why the running time is longer than LR's in Tables B.9 and B.10. Nevertheless, the Tabu search finds a better solution than LR in most cases.
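A quick numeric check of the swap-count expression above (an illustrative verification, not part of the thesis code):

```python
N = 100
counts = [K * (N - K) for K in range(N + 1)]

# The worst-case swap count peaks at K = N/2 with value N^2/4 = 2500.
assert max(counts) == N**2 // 4
assert counts.index(max(counts)) == N // 2

# Identity check: K(N-K) = -(K - N/2)^2 + N^2/4 for all K.
assert all(K * (N - K) == -(K - N / 2) ** 2 + N**2 / 4 for K in range(N + 1))
print("swap-count formula verified for N =", N)
```

This confirms that the number of candidate swaps, and hence the Tabu search effort, grows until K reaches N/2 and shrinks symmetrically afterwards.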
Appendix C
Appendix of Chapter 6
C.1 Parameter generation for the robust tracking model

We applied the procedure described in [66] to the three-factor model for constructing factor-based robust index tracking models, following Goldfarb and Iyengar [66] closely. Suppose the return vector r is given by the linear regression model

    r = μ + V^T f + ε    (C.1)

where μ ∈ R^n is the vector of mean returns, f ∼ N(0, F) ∈ R^m is the vector of returns of the factors that drive the market, V ∈ R^{m×n} is the matrix of factor loadings of the n assets, and ε ∼ N(0, D) is the vector of residual returns.

Let S = [r^1, r^2, ..., r^p] ∈ R^{n×p} be the matrix of asset returns and B = [f^1, f^2, ..., f^p] ∈ R^{m×p} the matrix of factor returns; then (C.1) can be represented by the linear model

    y_i = A x_i + ε_i, ∀i = 1, ..., n

where y_i = [r_i^1, r_i^2, ..., r_i^p]^T, A = [1, B^T], x_i = [μ_i, V_{1i}, V_{2i}, ..., V_{mi}]^T and ε_i = [e_i^1, e_i^2, ..., e_i^p]^T.

As shown in Section 6.5.1, for the single-factor model we set B = [f^1, f^2] = [r_M, r_f]^T; for the three-factor model, B = [f^1, f^2, f^3, f^4] = [r_M, r_f, SMB, HML]^T. The least-squares estimate x̄_i of the true parameter x_i is given by

    x̄_i = (A^T A)^{-1} A^T y_i, ∀i = 1, ..., n    (C.2)

Substituting y_i = A x_i + ε_i into (C.2), we get x̄_i − x_i = (A^T A)^{-1} A^T ε_i ∼ N(0, Σ), where Σ = σ_i² (A^T A)^{-1}. Since σ_i² is unknown in practice, we replace it by (m + 1) s_i², where s_i² is the unbiased estimate of σ_i² given by

    s_i² = ‖y_i − A x̄_i‖² / (p − m − 1)    (C.3)

The resulting variable

    Y = (1 / ((m + 1) s_i²)) (x̄_i − x_i)^T (A^T A) (x̄_i − x_i)    (C.4)

follows an F-distribution with (m + 1) degrees of freedom in the numerator and (p − m − 1) degrees of freedom in the denominator [66].

By setting a joint confidence region ω for the set (μ, V), Goldfarb and Iyengar [66] derive the following parameters, which can be used in our robust model:

    μ_{0,i} = μ̄_i,  γ_i = sqrt( (A^T A)_{11}^{-1} c_1(ω) s_i² ),  i = 1, ..., n    (C.5)

    V_0 = V̄,  G = (Q (A^T A)^{-1} Q^T)^{-1},  ρ_i = sqrt( m c_m(ω) s_i² ),  i = 1, ..., n    (C.6)

where c_J(ω) is the ω-critical value. Further details of the proof can be found in [66]. A worst-case bound for the covariance matrix is then obtained from the three-factor model, i.e. cov = V_0^T F V_0 + D, where D = diag(s_i²). The uncertainty set for μ in (C.5) will be used for the robust portfolio returns.
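A compact sketch of the least-squares estimation in (C.2)-(C.3) (illustrative only; the function and variable names are our own, and the actual data pipeline of the thesis is not shown):

```python
import numpy as np

def estimate_factor_params(S, B):
    """Least-squares estimates per (C.2)-(C.3): S is the n x p matrix of
    asset returns, B the m x p matrix of factor returns. Returns the
    estimated mean returns, factor loadings, and unbiased residual
    variances s_i^2."""
    n, p = S.shape
    m = B.shape[0]
    A = np.column_stack([np.ones(p), B.T])        # p x (m+1), A = [1, B^T]
    X, *_ = np.linalg.lstsq(A, S.T, rcond=None)   # (m+1) x n; columns are x_i
    resid = S.T - A @ X                           # residuals y_i - A x_i
    s2 = (resid**2).sum(axis=0) / (p - m - 1)     # s_i^2, eq. (C.3)
    mu_hat, V_hat = X[0], X[1:]                   # split [mu_i; V_.i]
    return mu_hat, V_hat, s2
```

The worst-case covariance of the text is then `V_hat.T @ F @ V_hat + np.diag(s2)` for a given factor covariance matrix `F`.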
C.2 LR gap information (S&P500)

Table C.1: Bounds information (SP500)

q | Gurobi Obj [1000 s] | LB by LR | Feasi. UB | Gap by LR | Gap to Gurobi | Time by LR
20 0.00789386 0.00766973 0.00789601 0.03% 2.87% 1206.31
25 0.00780381 0.00760901 0.00782685 0.30% 2.78% 1790.17
30 0.00775582 0.00760910 0.00777614 0.26% 2.15% 1828.48
35 0.00771529 0.00760903 0.00771885 0.05% 1.42% 1906.24
40 0.00767989 0.00760811 0.00768101 0.01% 0.95% 2103.63
45 0.00766137 0.00765416 0.00766317 0.02% 0.12% 661.12
50 0.00764986 0.00764647 0.00765126 0.02% 0.06% 203.49
55 0.00764921 0.00761658 0.00765333 0.05% 0.48% 2383.81
60 0.00764508 0.00764096 0.00764556 0.01% 0.06% 2163.47
65 0.00764522 0.00764334 0.00764666 0.02% 0.04% 2241.48
70 0.00764938 0.00764359 0.00764980 0.01% 0.08% 3332.34
75 0.00765234 0.00764482 0.00765291 0.01% 0.11% 2009.29
80 0.00765870 0.00764463 0.00766008 0.02% 0.20% 1571.46
85 0.00766614 0.00762793 0.00766623 0.00% 0.50% 1539.80
90∗ 0.00767435 0.00763362 0.00767417 0.00% 0.53% 1625.92
95 0.00768440 0.00765516 0.00779374 1.42% 1.78% 1345.64
100 0.00769624 0.00764036 0.00771742 0.28% 1.00% 1244.59
105 0.00771118 0.00764954 0.00782274 1.45% 2.21% 1310.59
110 0.00772767 0.00763821 0.00773064 0.04% 1.20% 968.22
115 0.00774483 0.00764866 0.00775279 0.10% 1.34% 1244.13
120 0.00776491 0.00765592 0.00776838 0.04% 1.45% 1163.76
125∗ 0.00778909 0.00764436 0.00778888 0.00% 1.86% 834.41
130 0.00780908 0.00764376 0.00784821 0.50% 2.61% 961.57
135 0.00783417 0.00764615 0.00783452 0.00% 2.40% 817.06
140 0.00786138 0.00764598 0.00797931 1.50% 4.18% 1086.52
145 0.00789183 0.00764809 0.00789229 0.01% 3.09% 929.20
150 0.00792418 0.00765731 0.00792472 0.01% 3.37% 1174.88
155 0.00795668 0.00764932 0.00795717 0.01% 3.87% 1676.62
160 0.00799044 0.00765544 0.00799137 0.01% 4.20% 1656.96
165 0.00802677 0.00764873 0.00802715 0.00% 4.71% 1369.31
170 0.00806122 0.00765280 0.00807049 0.12% 5.18% 1673.08
175 0.00810111 0.00764825 0.00812490 0.29% 5.87% 1797.62
180 0.00813905 0.00764942 0.00813939 0.00% 6.02% 1773.28
185 0.00818230 0.00764982 0.00818237 0.00% 6.51% 1692.60
190 0.00822661 0.00765542 0.00823788 0.14% 7.07% 1569.80
195 0.00827418 0.00765011 0.00827433 0.00% 7.54% 1550.82
200 0.00831963 0.00764642 0.00831977 0.00% 8.09% 1197.74
205 0.00837033 0.00828720 0.00837096 0.01% 1.00% 1268.42
210 0.00842335 0.00765141 0.00844361 0.24% 9.38% 1178.02
215∗ 0.00847721 0.00765117 0.00847717 0.00% 9.74% 1239.34
220 0.00853442 0.00765120 0.00853405 0.00% 10.35% 1333.65
225 0.00859324 0.00765139 0.00859357 0.00% 10.96% 1323.17
230 0.00865333 0.00767544 0.00865338 0.00% 11.30% 1300.40
235 0.00871537 0.00780126 0.00871760 0.03% 10.51% 1365.84
240∗ 0.00878201 0.00814516 0.00878189 0.00% 7.25% 1113.00
245 0.00884693 0.00768989 0.00884707 0.00% 13.08% 1321.80
250 0.00892059 0.00780564 0.00892467 0.05% 12.54% 1396.12
255 0.00898990 0.00893445 0.00934490 3.95% 4.39% 1414.14
260 0.00906451 0.00816790 0.00906472 0.00% 9.89% 1313.62
265 0.00914025 0.00876139 0.00914028 0.00% 4.15% 622.40
270 0.00921670 0.00893782 0.00921799 0.01% 3.04% 672.30
275∗ 0.00929877 0.00911401 0.00929876 0.00% 1.99% 655.86
280 0.00937983 0.00920692 0.00940317 0.25% 2.09% 996.06
285 0.00946307 0.00911212 0.00951389 0.54% 4.22% 1460.69
290 0.00954848 0.00896858 0.00955468 0.06% 6.13% 1923.43
295 0.00963897 0.00921027 0.00964521 0.06% 4.51% 1909.74
300 0.00972930 0.00945807 0.00972966 0.00% 2.79% 1865.29
Average / / / 0.21% 4.16% 1425.94